The Trouble with Tribbles...

Selecting relay smarthosts and using SMTP AUTH on illumos

2017-11-06T19:47:35.230+00:00

A problem I looked at recently involved configuring a system to send (relay) email via a customer's own SMTP servers. There are two parts to this:

  • Select the relay host depending on some condition
  • Authenticate with the remote relay using SMTP AUTH

Search for SMTP AUTH with sendmail on illumos or Solaris, and you invariably end up with advice on how to build Cyrus SASL and sendmail from scratch. For example, Andrew has some good instructions.

However, if you look at the sendmail we ship on illumos you'll find that it's already been built with SASLv2 support:

# /usr/lib/sendmail -bt -d0.1 < /dev/null
Version 8.14.4+Sun
Compiled with: DNSMAP LDAPMAP LOG MAP_REGEX MATCHGECOS MILTER MIME7TO8
        MIME8TO7 NAMED_BIND NDBM NETINET NETINET6 NETUNIX NEWDB NIS
        PIPELINING SASLv2 SCANF STARTTLS TCPWRAPPERS USERDB
        USE_LDAP_INIT XDEBUG

And, if you telnet to port 25 and look at the EHLO response, it includes:

250-AUTH GSSAPI DIGEST-MD5 CRAM-MD5

However, that's not actually the part we want here (but I'll come back to that later). I don't want to authenticate against my own server, I need my system to authenticate against a remote server. Back to the problem at hand.

The first part - selecting the right smarthost - can be achieved using smarttable. All you need is the smarttable.m4 file, and then build a configuration using it by enabling the smarttable feature.

The second part, SMTP AUTH, should also be very simple. Again, it's all documented, and just involves enabling the authinfo feature. But wait - on illumos, there is no authinfo.m4 file. In fact, the feature is supported just fine; it's only the m4 file that's missing. So what you need to do is download the sendmail source, unpack it, and there in the cf/feature directory you'll find the authinfo.m4 file.

OK, so copy both files - smarttable.m4 and authinfo.m4 - into the /etc/mail/cf/feature directory on a server. Copy and edit the sendmail.mc file (I'm going to copy it to /tmp and edit it there) to add the two feature lines, like this fragment of the file here:

...
define(`confFALLBACK_SMARTHOST', `mailhost$?m.$m$.')dnl
FEATURE(`authinfo')dnl
FEATURE(`smarttable')dnl
MAILER(`local')dnl
...

Basically, just add the features above the MAILER line. Then compile that:

cd /etc/mail/cf/cf
m4 ../m4/cf.m4 /tmp/sendmail.mc > /tmp/sendmail.cf

That's your new sendmail.cf ready. It uses two databases in /etc/mail; to create these (initially empty):

cd /etc/mail
touch smarttable
touch authinfo
makemap hash smarttable < smarttable
makemap hash authinfo < authinfo

then copy your new sendmail.cf into /etc/mail and restart sendmail:

cp /tmp/sendmail.cf /etc/mail
svcadm restart sendmail

So far so good, but what should those files look like?

First the smarttable file, which is just a map of sender to relay host. For example, it might just have:

my.name@gmail.com smtp.gmail.com

which means that if I want my home system to send out mail with my address on it, it should route it through gmail's servers rather than trying to deliver it direct (and likely getting marked as spam).

Then the authinfo file, which looks like:

Authinfo:smtp.gmail.com "U:root" "I:my.name@gmail.com" "P:mypassword" "M:LOGIN PLAIN"
Authinfo:smtp.gmail.com:587 "U:root" "I:my.name@gmail.com" "P:mypassword" "M:LOGIN PLAIN"

(There are just two lines there, each starting with Authinfo:, even if the blog shows them wrapped.)

Basically, for gmail, you need to supply your email address as the identifier and your password as, well, the password. (Note: if you've got two-factor authentication set up, you'll need to set up an app key.) Of course, the authinfo files ought to be readable only by root, otherwise anyone on your system can read your password in the clear.

There are a couple of non-standard tweaks you'll need for gmail to work. First, you need to go to your gmail account settings and allow less secure apps. Second, you will need the "M:LOGIN PLAIN" entry in the authinfo file, else you'll get an "available mechanisms do not fulfill requirements" error back.

Redo the two makemap commands above and you're good to go.
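
To recap the moving parts: after any edit to smarttable or authinfo, rebuild the maps and bounce sendmail. A consolidated sketch (the chmod is my own addition, following the note above about keeping the credentials readable only by root):

cd /etc/mail
makemap hash smarttable < smarttable
makemap hash authinfo < authinfo
chmod 600 authinfo authinfo.db
svcadm restart sendmail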



Building illumos-gate on AWS

2017-11-01T20:59:52.853+00:00

Having talked about running Tribblix on AWS, one of the things that would be quite neat would be to be able to build illumos-gate.

This is interesting because it's a relatively involved process, and might require proper resources - it's not really possible to build illumos inside VirtualBox, for instance, and many laptops don't run illumos terribly well. So it's hard for the average user to put together a decent - most likely dedicated - rig capable of building or developing illumos, which is clearly a barrier to contribution.

Here's how anyone can build illumos, using Tribblix.

Build yourself an EC2 instance as documented here, with two changes:

  • The instance type should be m4.large or bigger - m4.xlarge or c4.xlarge would be better. The bigger the instance, the quicker the build, but m4.large is pretty much the minimum size.
  • Attach an EBS volume to the instance, at least 8G in size. If you want to do multiple builds, or do lint or debug builds, then it has to be larger. I attach the volume as /dev/sdf, which is assumed below. (You could keep the volume around to persist the data, of course.)

Once booted, log in as root. You then need to set up the zfs pool (the disk showing up as c2t5d0 below matches the /dev/sdf attachment point) and create a couple of file systems that can be used to host the build zone and store the build.

zpool create storage c2t5d0
zfs set compression=lz4 storage
zfs destroy rpool/export/home
zfs create -o mountpoint=/export/home storage/home
zfs create -o mountpoint=/export/zones storage/zones

You should then do an update to ensure packages are up to date, and install the develop overlay to get you some useful tools.

zap refresh
zap update-overlay -a
zap install-overlay develop

Then create a user, which you're going to use to do the build. For me, that is:

groupadd -g 10000 it
useradd -g it -u 11730 -c "Peter Tribble" -s /bin/tcsh \
  -d /export/home/ptribble ptribble
mkdir -p /export/home/ptribble
chown -hR ptribble:it /export/home/ptribble
passwd ptribble

Then create a build zone. It has an IP address; just pick any unused private address (I simply use the address above that of the global zone, which you can get with ifconfig or from the AWS console - note that it's the private address, not the public IP that you ssh to).

zap create-zone -z illumos-build -t whole \
  -i 172.xxx.xxx.xxx -o develop \
  -O java -O illumos-build -U ptribble

What does this do? It creates a new zone, called illumos-build. It's a whole root zone, with its own exclusive set of file systems. The IP address is 172.xxx.xxx.xxx. The develop overlay is installed (in this case, copied from the global zone); the java and illumos-build overlays are added to this new zone (note the upper-case -O here). Finally, the user account ptribble is shared with the zone.

Give that a few seconds to boot and log in to it, then a couple of tweaks that are necessary for illumos to build without errors.

zlogin illumos-build
rm /usr/bin/cpp
cd /usr/bin ; ln -s ../gnu/bin/xgettext gxgettext

Now log out and log back in to the instance as your new user. We're going to create somewhere to store the files, and check out the source code.

mkdir Illumos
cd Illumos
git clone git://github.com/illumos/illumos-gate.git
wget -c \
  https://download.joyent.com/pub/build/illumos/on-closed-bins.i386.tar.bz2 \
  https://download.joyent.com/pub/build/illumos/on-closed-bins-nd.i386.tar.bz2

Now we set up the build.

cd illumos-gate
bzcat ../on-closed-bins.i386.tar.bz2 | tar xf -
bzcat ../on-closed-bins-nd.i386.tar.bz2 | tar xf -
cp usr/src/tools/scripts/nightly.sh .
chmod +x nightly.sh

There are two more files we need. Go to the tribblix-build repo and look in the illumos directory there. Grab one of the illumos.sh files from there and put it into your illumos-gate directory with the name illumos.sh. If you need to change how the build is done, this is the file to edit (but start from one of those files so you get one appropriate for Tribblix as the host). Also, grab Makefile.auditrecord and use it to replace usr/src/cmd/auditrecord/Makefile.

Now log in to th[...]
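
For reference, once illumos.sh and nightly.sh are in place as above, a build is kicked off by passing the env file to nightly (a sketch; exactly where the logs land depends on settings in illumos.sh):

cd ~/Illumos/illumos-gate
time ./nightly.sh illumos.sh
# progress and results end up under log/; the mail_msg summary
# is the place to look for failures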



Public Tribblix AMI now available

2017-10-31T19:12:20.071+00:00

There's now a public Tribblix AMI available to run on AWS. This was built according to the notes I gave earlier, and is part of making Tribblix the illumos for everyone.

This is to be considered slightly experimental, and there are a couple of constraints.

First, the AMI is only available in the London region for now (I'm in the UK, so that's where I'm running things). I could make it available elsewhere, but there are costs associated with doing so and, as everything related to Tribblix comes out of my own pocket, I'm not going to incur costs unless there's a demonstrable need. If you want to run in a different region, then you can always copy the AMI.

Second, the size of the image is quite small. Again, there's a constraint on cost. But the idea here is that you wouldn't store any non-trivial data in the image itself - you would create an appropriately sized EBS volume, attach that, and create a zfs pool for your data. The Tribblix repo server does just that - the package repo lives on the second pool.

So, how to use this? I'm going to assume some level of AWS familiarity: that you have an account and know basically how to use AWS, and that your account is set up with things like an ssh key pair.

Go to the AWS console, and navigate to the EC2 dashboard. Unless you've copied the AMI to the region of your choice, make sure you're working in London - the region dropdown is in the top right. Then hit the launch instance button.

Now you get to choose an Amazon Machine Image (AMI). Click on "Community AMIs" and enter "Tribblix" or "illumos" into the "Search community AMIs" search box. At the time of writing, you'll only get one result, but more may appear in future. Select that one. Then you can Choose an Instance Type. A great thing about Tribblix is that it's pretty lightweight, so the t2.micro - available on the free tier - is a good choice.

Click on "Review and Launch". On the next screen you can edit the storage to add an additional volume, but the one thing you must do is edit the security group. If you leave it as it is, you'll have no way to access the instance. So edit it, and the simplest thing to do at this point is to create a new security group that allows ssh only, with the source being your own IP address, which you can get by selecting "My IP" from the source dropdown. (I've got a saved security group that does just that, to let me straight in from home.)

Click on "Review and Launch" to go back to the main screen, and then "Launch". This is when you get to choose which key pair you can use to log in to your instance.

It will take a little while to start (although it's usually ready before the status checks say so), and you should then be able to ssh in to it (as root, with the key pair you set up):

ssh -i peter1-london.pem \
  root@ec2-35-176-237-204.eu-west-2.compute.amazonaws.com

And you're good to go. What you do then is up to you; I'll cover some scenarios in upcoming posts. Be aware that the base AMI has a pretty minimalist set of packages installed, so you probably want to add some more packages or overlays to do anything useful.
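
If you follow the suggestion above of keeping data off the root volume, attaching an EBS volume and turning it into a pool looks something like this once you're logged in (the device name is illustrative - it depends on the attachment point you chose):

# list the disks the instance can see
diskinfo
# create a pool on the attached EBS volume
zpool create data c2t5d0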



The commoditization of IT?

2017-09-20T19:28:51.799+01:00

IT, so the story goes, is now a boring commodity. But is this true?

Let's first define what a commodity is. There are a range of definitions we could use, but I'm going to think of a commodity as something that is functionally undifferentiated and available from multiple sources. The key aspect here is that of interchangeability (aka fungibility).

As an example, most computer components fall into the commodity category. Memory DIMMs, disk drives, network interfaces - you can (in principle) use any vendor's disk drives or memory and your computer will still work. You can use a mouse, keyboard, or monitor from any vendor and things will work just fine. Vendors have to differentiate in other ways - performance, cost, reliability, service.

What about smartphones? I would say that the phone piece is a commodity. Whether for a mobile or a land line, you can switch your telephone for another make or model, and you can switch from one telephony provider to another.

But the smart part of smartphones isn't properly interchangeable. You can't simply swap an Apple handset for an Android and carry on as you were; you have to switch everything to a different domain. And the suppliers here are keen to enforce differentiation and prevent interchangeability. We live in a world of proprietary walled gardens.

In most non-trivial cases, databases aren't commodities. Big database companies rely on the fact that you couldn't migrate to another database vendor even if you wanted to.

Operating systems are clearly differentiated. You can't swap Solaris for Windows, or either for Linux or BSD. You can't even treat different distributions as commodities if you restrict yourself to the Linux domain.

The operating system landscape is changing a little, though, in that Docker and containerization offer the prospect of interchangeability - you could, in theory, run a Docker image anywhere and on anything.

Cloud computing definitely isn't a commodity. (Thinking of it as a utility might be slightly more accurate.) Heck, there are sufficient differences over what's available that migrating between different AWS regions isn't smooth, let alone migrating between cloud providers.

Vendor lock-in is the big thing, and it's diametrically opposed to being a commodity - what vendor wants to make it easy for its customers to leave? (Despite that being one of the key attractions of any vendor in practice.)

One of the requirements for interchangeability is standardization, and there's a tension here between standardizing things (thereby making things the same) and innovation, which necessarily implies change. I could (and probably will at some point) go on at length about innovation, but I see precious little innovation in practice, more constant reinvention of the square wheel. Meanwhile the standards we have are either efforts like POSIX, which is largely codifying accidental implementations from the 1970s, or ad-hoc emergence of initial implementations that were cobbled together with little or no thought for actual suitability.

Rather than commoditization being a standard base, with a rising tide lifting all boats, any commoditization chips away the good stuff to leave the lowest common denominator, while everyone deliberately introduces incompatibilities in the name of differentiation.

So it seems to me that, far from being commoditized, IT has been monopolized and mediocritized.




Tribblix - illumos for everyone?

2017-09-10T21:55:07.215+01:00

When I was doing a bit of a branding exercise for Tribblix, part of which generated the rather amateurish logo I now have - something I needed to make some business cards and stickers to take to FOSDEM this year - one of the things I wondered about was a good tagline.

In the end, I went with "the retro illumos distribution". Of course, it was pointed out that illumos was retro enough on its own, so the idea of a retro variant was a bit unnecessary.

The other tagline I came up with was "illumos for everyone". I rejected it in the end because it was a bit preposterous - I'm not really building something for everyone.

Yet the underlying idea here was simple - that I would actively seek to build a distribution that was inclusive, not exclusive. That's why:

  • I have a SPARC version as well as x86
  • On x86, I support 32-bit as well as 64-bit systems
  • Tribblix is suitable for both desktop and server use
  • Tribblix is a flexible system, not an appliance or hypervisor
  • I work on ensuring Tribblix runs successfully on systems with fewer resources than other distributions require
  • A variety of installation methods are supported - media, network, iPXE
  • I've worked on installation in the cloud, both KVM-based and AWS, in addition to bare metal or other hypervisors
  • I've tried to make key features such as zones easier to configure and use

Generally, the idea is to reduce the barriers and limitations for installing and using Tribblix.

This summer, I came across a much better way of putting it. Rather than "illumos for everyone", a variation of the UK Labour Party's slogan expresses the idea much more elegantly. Tribblix would be "illumos for the many, not the few". It's a shame that the slogan is already taken, as it expresses the philosophical aim rather neatly.



Creating a decent Tribblix AMI

2017-08-03T18:45:56.469+01:00

Previously, I've described how I created my first Tribblix AMI, then how to do it properly in hvm mode so you can run on modern instances in all regions. That creates something that will work, but is it actually in a state that's useful?

The first thing is to add an EC2 credential service. That's the thing that will query for metadata and install the keys on the system so you can log in after the instance is created. I tried the ec2-credential service from OmniOS, but for some reason it didn't work right on Tribblix. I've tweaked mine a little, forcing it to run after the network comes up, adding retries in case there's a problem, and also disabling it in non-global zones. Of course, there's more instance metadata that I could query and use, but I haven't yet had a need for anything other than the initial key.

The other thing I've been wondering about is the configuration of Tribblix itself - specifically what the storage should look like and what the default software installation should look like.

My image is built on an 8G "disk" or EBS volume. That might seem a little small, but remember that Tribblix is pretty lean and mean. For a typical server configuration you'll probably be looking at about 1G or so, and that's without any special work. The most annoying thing here is that by default you lose 2G to each of dump and swap, so that's effectively half the disk gone. There's opportunity to modify those, especially as I'm typically using t2.micro instances on the free tier that only have 1G of memory. You might not even want dump at all. As for swap, you do want some (so that anonymous reservations don't eat into actual RAM) but you could cut that down a bit.

As I'm writing this I do wonder whether I could pull some of the instance metadata and shrink the dump and swap volumes appropriately. The assumption I'm making here, though, is that if you're storing any reasonable amount of data you're going to attach a separate EBS volume, and you can then size that appropriately to the need at hand. (And you can then move that data around independently of your running instance.) So I think that keeping the root volume fairly small is reasonable. It also keeps my AWS bill down, an important consideration as any charges here come out of my own pocket.

Then, what should the baseline software install look like? Tribblix uses overlays, and there's an assumption that you always start from the base overlay. I'm currently using a dedicated overlay that pulls in cli-tools - essentially you get basic shells, compression tools, basic utilities, but not much else. Many of the normal server utilities don't apply to running in the cloud, as they're aimed at monitoring or managing hardware.

The base set of packages is that installed on the ISO. That includes most storage and network drivers, which are irrelevant - on EC2 you know exactly which drivers you need, so almost all the drivers that are installed are unnecessary. What I need here is a better way of handling installation variants, so it knows the drivers aren't supposed to be there - at the moment I could remove them, but updates and upgrades would simply put them back. In the same vein, I could ship only a 64-bit kernel, as we know there are no 32-bit instance types available.

At the moment I have an LX variant, which is a bit of a hack in terms of the way I've packaged it together, but as the number of interesting variants grows I'm going to have to come up with a better way of handling it, especially as you might want multiple variants together - for instance a 64-bit LX-enabled cloud-optimised image.
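
For what it's worth, manually reclaiming some of that dump and swap space on a standard install looks something like this (sizes illustrative; rpool/dump and rpool/swap are the usual zvol names):

# swap: remove the device, shrink the zvol, add it back
swap -d /dev/zvol/dsk/rpool/swap
zfs set volsize=1g rpool/swap
swap -a /dev/zvol/dsk/rpool/swap
# dump: shrink the zvol (dumpadm will object if it's made too small)
zfs set volsize=1g rpool/dump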



Building a Tribblix AMI - hvm mode

2017-07-31T13:54:46.429+01:00

After having created a Tribblix AMI to prove that Tribblix basically works on EC2, I then moved on to the next issue - how to create an AMI that will run in hvm mode?

As a reminder, pv mode AMIs are deprecated, aren't supported by all instance types, and don't work in all regions. So you really need something that runs in hvm mode.

The first thought might be to convert the existing pv image to a hvm image. I've tried that and, while you can do the conversion, the image doesn't actually work. The problem here is that ZFS has the physical paths of the devices it's installed on embedded in the pool metadata. Changing from pv to hvm mode changes the emulated hardware, in particular the disk paths, so the ZFS pool isn't where it thought it was and the system panics. Any mismatch between the disk layout where the pool was created and where you're running will give you a panic early in boot. If you had console access and could boot from media you could fix this, but AWS doesn't provide that. (And if you could boot from media you could just do a regular install without all the shenanigans involved in producing an AMI.)

So, you have to create the image on a system that looks like EC2. Which means using xen.

Fortunately, this road has been travelled before. These instructions are exactly what you need. They're for OpenIndiana, but will apply to any illumos distribution. And they're the process used by the OpenZFS project to do their testing. (I'll also mention that the OpenZFS folks have put a number of fixes back into illumos that improve the EC2 experience for us.) I'm not going to repeat those instructions, that would be boring, so I'll talk about what I had to do or change to make them work for me.

I got one of my spare desktop PCs out and installed Ubuntu 16.04 on it. (I must be spoilt by Tribblix, the Ubuntu install was horrendously slow and very high maintenance.) And then installed xen, rebooted as dom0, and set up the bridge networking. That was my first pothole. There's this thing called systemd that's come along, and it changes the way network configuration is done. Much cussing and googling, but I got it right first time.

Then I discovered that there's a new toolstack here. It's all xl not xm, but otherwise seems the same.

I then tried to start a VM, only to be given a completely meaningless and unhelpful error message. Why tell the user what's wrong when you can just vomit a stack trace? After a bit of head-scratching I worked out that the system didn't actually support hvm mode. If you run xl info and look for virt_caps, it should mention hvm. That's a bit odd, the sticker on the front of the box looks right.

Manufacturers ship hardware with VT-x disabled in the BIOS, it appears. Into the BIOS we go, to find that the relevant settings are greyed out and you need a BIOS password to get into them. Open the box and start looking for jumpers. Fortunately I found a helpful article - the key here was the bit about the jumper being blue, little details like that make all the difference.

OK, so having wiped the BIOS password, gone into the BIOS and enabled VT-x, I go back to xen. Looking at virt_caps now shows hvm, as it should, and my domain starts.

The idea here is that you connect to the console with VNC. Easy enough, but by the time I had got my ssh tunnel set up and started up my VNC client, my VM had gone. I started it again; it starts booting just fine but then issues a few warnings and then a kernel panic. It's all over pretty quick. In order to catch what it said, I then used vnc2flv. Someone asked me about screen recorders a while back, and I suggested they did what they wanted to do in a vnc session and use vnc2flv to record it. But it's the same here. Once I had the session recorded I could watch the movie and pause it to see what errors it was spitting out.

This, I think, is related to illumos bug 7186. It looks like we can't handle the network pr[...]
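
Incidentally, the virt_caps check is the first thing worth running on any new xen box, before debugging anything else (the exact output varies between xen versions):

xl info | grep virt_caps
# you want to see hvm listed, along the lines of:
#   virt_caps              : hvm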



Running illumos on AWS - the first Tribblix AMI

2017-07-31T09:28:18.813+01:00

I've run Tribblix on all sorts of hardware - desktops, servers, even the occasional laptop. I've had success running it on some of the smaller cloud providers that allow you to install from a custom ISO, or iPXE, such as my adventures with Vultr.

However, running on AWS has eluded me. You might wonder why you would want to, but the reality is that AWS is a huge player, with many people turning to it as their default (and often only) option. So giving everyone who uses AWS access to Tribblix would be a good thing, and would also offer an easy route for people who might want to play with Tribblix to do so.

The first thing to realize is that AWS is not so much a single cloud as a set of independent clouds. Each region is independent, and has a different set of capabilities. For example, EFS is only available in a few regions. These differences can affect us.

On AWS, there are two different types of guest. We have pv (the older, paravirtualized) and hvm (the newer, hardware assisted). Any given AMI (Amazon Machine Image) will only run as either a pv or a hvm guest. And some EC2 instance types are pv, others hvm. Newer regions (such as London) are exclusively hvm, so pv isn't an option.

Building an AMI from scratch looked a little daunting, so I looked to see what other illumos distributions might have made AMIs available. If you go to the community AMI page when launching an instance, the only one you'll find is OmniOS. They even have a page explaining how it was done. The snag is that all their images are pv. For my first set of experiments, then, I was operating in the Dublin region.

The OmniOS AMI boots up just fine and works pretty much as you would expect. No problems there. How to get Tribblix running though?

The answer lies in the beauty of ZFS and Boot Environments. The basic approach here is to take a running OmniOS image, create a new Boot Environment, install Tribblix into that Boot Environment, and make the Tribblix Boot Environment the one to boot from next time. Once I've successfully booted the Tribblix image, I can clean up and delete the original OmniOS files.

One of the advantages of Tribblix is that I have my own installer. It's quite a bit simpler than some of the other distros, and thus much easier to mangle to do things in new environments. I decided to use the iPXE image as used in my Vultr experiment, because it was easy and I had it to hand. I then wrote a modified installer script (source here) called img_install that was based on my over_install script used to drop Tribblix into an existing ZFS pool. The difference is that the old over_install was run in the context of a Tribblix Live CD; the new img_install is run in the context of an alternative distro. The other thing in that script is that I don't do any boot loader fiddling - the pv instances have a special pv-grub, which I'm careful not to touch.

(By the way, the same trick will work for other illumos distributions. You just need a source archive of some sort and a script to unpack it. For example, I have a script to unpack some of the ISO images in the tribblix-zones repo, which I use to create alien-root zones. It's the same idea of installing an image in an alternate path.)

So all that was involved was to:

  • Start up an OmniOS instance (a micro instance on the free tier works fine)
  • Run the img_install script to create the alternate BE
  • Reboot, so you boot into Tribblix
  • Delete the old OmniOS BE
  • Finish off the install and apply updates

Then you can do the normal create an image trick on the AWS console, and you have a nice shiny Tribblix AMI.

That all worked out just beautifully. Tribblix runs on EC2 just fine. In the next article, I'll describe how to create a hvm AMI.
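
Stripped of everything img_install actually takes care of, the boot environment dance at the heart of this is roughly (BE name and mountpoint illustrative):

# create and mount a new BE alongside the running OmniOS one
beadm create tribblix
beadm mount tribblix /a
# ... unpack the Tribblix image into /a and fix up its configuration ...
# then make the new BE the one to boot from next time
beadm activate tribblix
reboot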



Mucking around with IPv6 and illumos zones

2017-07-22T12:41:17.986+01:00

The world is running out of IPv4 addresses, and it's time to move to IPv6. I remember that story being told over and over at conferences in the mid 1990s. Yet, here we are in 2017 and, while there has been progress, we're definitely not there yet.

With zones, illumos (and Solaris) give you virtualized application environments (containers is the trendy term - we tend not to use that in the Solarish context because it got polluted by Sun marketing). Those environments (usually) need to be networked, so why not with IPv6.

So here goes with a few notes on the subject.

Shared-IP zones

With zones, the original networking model was a shared-ip stack, where the zone is given a fully configured network that is just a virtual IP configured on an existing interface. All the setup is done in the global zone, which makes it very easy. (By the way, this was the cause of the limit of 8192 zones per system, because you can only have 8192 virtual addresses on a single physical interface.)

And configuring an IPv4 address is just a case of adding a net section to the zone configuration:

add net
set physical=aggr1
set address=172.18.1.172/24
end

It's exactly the same for IPv6, the only interesting issue is what the IPv6 address would be. Let's start with the link-local address - the one that starts with fe80:: - as you will generally need that even if you don't have a routed IPv6 network. For a physical interface, the IPv6 address is usually derived from the MAC address. We can't use that one, because we're sharing the interface and the global zone has already grabbed it. So the convention here is to construct something from the IPv4 address. It's then guaranteed to be unique in a broadcast network, which is all that matters for a link-local address. So all we have to do is convert the IPv4 address to hex, for example with printf:

printf "%x%x:%x%x\n" 172 18 1 172

which gives ac12:1ac, so the link-local address would be configured as:

add net
set physical=aggr1
set address=fe80::ac12:1ac/10
end

You're pretty much done here; if you do that, your global and non-global zones will be able to communicate using IPv6 on the local subnet.

If you had a routable prefix, then the same scheme can be applied. Just put the fragment onto the end of your prefix:

add net
set physical=aggr1
set address=XXXX:XXXX:XXXX:XXXX::ac12:1ac/64
end

Of course, if you're assigned specific IPv6 addresses then you can use those directly. The above scheme is pretty trivial to script, though (and it actually makes it fairly easy to keep your DNS zone files up to date too).

Exclusive-IP zones

For an exclusive-ip zone, you just hand over a network interface to a zone and let it go figure. So it can assign whatever addresses it likes. In particular, the zone can use the normal MAC address scheme to generate its IPv6 link-local address.

Originally in older Solaris, you needed to use a genuine physical interface. Which limited you a little bit, as there are only so many network cards you can jam into a server. OpenSolaris introduced full network virtualization in the form of Crossbow, so any illumos distribution or Solaris 11 can create fully virtualized network stacks and present those to zones in the same way.

In Tribblix, I use zap to manage zones, and it takes care of creating the appropriate vnics and, if appropriate, etherstubs, and wiring things together. I also poke functional /etc/hostname.* and /etc/defaultrouter files into the zone so the networking at least gets configured when the zone boots. Adding IPv6 to the zone is simply a case of creating matching empty /etc/hostname6.* files (one for each vnic) and the IPv6 addresses will get autoconfigured.

There's one wrinkle with exclusive-ip that deserves a whole section, that of restricting the zone to only using addresses that you've set.

Restricting with allowed-address

Remember that an exclusive-ip zone can manage the network interface. S[...]
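
As an illustration of how trivial the shared-IP scheme above is to script, a minimal helper might look like this (the script itself is hypothetical):

#!/bin/sh
# usage: ip4to6 172.18.1.172
# print the link-local IPv6 address derived from an IPv4 address
set -- $(echo "$1" | tr '.' ' ')
printf "fe80::%x%x:%x%x\n" "$1" "$2" "$3" "$4"

Remember to add the /10 prefix length when putting the result into the zone configuration.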



What gets into Tribblix?

2017-07-13T13:27:52.553+01:00

The software available for Tribblix is a bit of an eclectic mix. How do I choose what software to package?

There are actually a number of different reasons why you get a particular package.

The basics


Some packages are just basic, and you expect to find them. Much of the GNU stack comes in this way. And often things like Perl and Python are a foundational requirement for a lot of other tools.

What I want personally


There are a number of areas where I have specific interests - I'm a bit of a magpie when it comes to X11 window managers, for example. And I need to open office documents, so I had to get LibreOffice working. There are a few games or emulators that I like. This also explains why some things might not be present - I have no real interest in video or multimedia, for example, so that's an area with relatively little coverage.

Oh, that looks cool


I'm often interested in new stuff. (Even if it's actually old stuff that's just new to me.) So if I come across a piece of software and think "that looks cool" I'll often try and build and package it. If it works, fine, it ends up in the repository. Even if I might not end up doing anything with it, I've gone to the effort of making a package so it may as well stay there and somebody else might make use of it.

I have a $DAYJOB


Yes, I have a day job (a very good one, thank you very much), and it involves running applications on illumos. If I'm going to evaluate software I'll do it on Tribblix first. Building stuff on Tribblix is much easier than on, say, OmniOS - I have total control of the environment, and many more tools and prerequisite packages to give me a head start. So I can easily screen out any software that simply isn't going to work, and identify any patches or modifications necessary, before heading into the rather more constrained work environment.

Can you make X or Y or Z available


I get requests from users. I'll pretty much always at least try to add the software asked for - the fact that someone's bothered to ask indicates it might be useful, and I might find it interesting as well. This doesn't always work, of course, and I'll have to punt.

I got bored one day


Sometimes I get a bit of free time (no, this doesn't happen very often), and start looking for packages that might be worth adding. Sometimes this involves looking at other systems to see what they ship.

It's a prerequisite for something else


Dependency hell is a fact of life, so a lot of the time is actually spent building prerequisites. This is one reason for speculatively trying things out - it identifies prerequisites, and they're often going to be needed by other packages too. Although what you'll find is a number of packages with no obvious consumers, because the software I wanted didn't actually work. As I mentioned before, though, I'll keep those packages I built, and they might come in useful later.



Running LX zones with Tribblix

2017-07-06T21:38:59.834+01:00

I mentioned a few months ago a little project I had been working on - nicknamed omnitribblix, it's regular Tribblix with the illumos components coming from illumos-omnios (now via OmniOS Community Edition) rather than vanilla illumos-gate.

One of the changes I made in the recent Milestone 20 update was to split out the release packages to give more flexibility. This allowed me to release a micro update to Milestone 20 (imaginatively called m20.1 or update 1), which updates the illumos bits but shares the same main package repository as the main Milestone 20 release.

And the other thing I can now do is build variant releases. So Tribblix has an LX variant!

You can download the omnitribblix ISO image from the Tribblix download page. It installs, operates, and is packaged just like regular Tribblix. If you don't use LX zones, you probably wouldn't notice the difference. (It's versioned as m20lx.1 - the update 1 there means that it's a parallel release to the regular Tribblix Milestone 20 update 1.)

You can also update to the LX variant from either the regular Milestone 20 or Milestone 20 update 1 releases, in the normal way. It's a micro update, or sidegrade perhaps, but uses the same upgrade process as regular upgrades. And, because of the magic of boot environments, if there's a problem you can roll back.

Anyway, once you have omnitribblix installed, how do you create an LX zone? Very easily, in the same way you create and destroy other zones on Tribblix, using the zap utility.

Before you can do that, though, you need a Linux image of some sort to install. I've been using the same images I use under Docker. So, for example, if I want Alpine then I would go:

docker run alpine uname -a

and then get the name of the container:

docker ps -a

and then export that with:

docker export romantic_galileo > alpine.tar

Then copy the alpine.tar file to your omnitribblix system. If you want something a bit richer, then Ubuntu will work. But generally exporting a Docker container like this will work, and the image characteristics will be a good fit for a zone.

And then all you do to create the zone is use zap, specifying that it's an lx brand and telling it where the tarball is:

zap create-zone -z alpine -t lx \
  -x 10.0.2.99 -I /tmp/alpine.tar

and just zlogin to it as normal.

There are constraints around networking - you have to be exclusive-ip (the -x flag) and zap will create (and destroy) the vnic for you automatically. But the networking in the zone won't actually be configured. (While you specify the IP address in the command, that just tells zap how to configure the network plumbing and the vnic.) You'll have to log in to the zone and use the native tools to identify and configure the network, like so:

/native/sbin/ifconfig -a
/native/sbin/ifconfig znic0 inet 10.0.2.99 up
/native/usr/sbin/route add net default 10.0.2.2

And off you go. Sitting on an illumos box with all its goodness, with access to the wide variety of the Linux ecosystem at your fingertips.



Tweaking binaries with elfedit

2017-06-18T17:52:19.553+01:00

On Solaris and illumos, you can inspect shared objects (binaries and libraries) with elfdump. In the most common case, you're simply looking for what shared libraries you're linked against, in which case it's elfdump -d (or, for those of us who were doing this years before elfdump came into existence, dump -Lv). For example:

% elfdump -d /bin/true

Dynamic Section:  .dynamic
     index  tag                value
       [0]  NEEDED            0x1d6               libc.so.1
       [1]  INIT              0x8050d20

and it goes on a bit. But basically you're looking at the NEEDED lines to see which shared libraries you need. (The other field that's generally of interest for a shared library is the SONAME field.)

However, you can go beyond this, and use elfedit to manipulate what's present here. You can essentially replicate the above with:

elfedit -r -e dyn:dump /bin/true

Here the -r flag says read-only (we're just looking), and -e says execute the command that follows, which is dyn:dump - or just show the dynamic section.

If you look around, you'll see that the classic example is to set the runpath (which you might see as RPATH or RUNPATH in the dump output). This was used to fix up binaries that had been built incorrectly, or where you've moved the libraries somewhere other than where the binary normally looks for them. Which might look like:

elfedit -e 'dyn:runpath /my/local/lib' prog

This is the first example in the man page, and the standard example wherever you look. (Note the quotes - that's a single command input to elfedit.)

However, another common case I come across is where libtool has completely mangled the link so the full pathname of the library (at build time, no less) has been embedded in the binary (either in absolute or relative form). In other words, rather than the NEEDED section being

libfoo.so.1

it ends up being

/home/ptribble/build/bar/.libs/libfoo.so.1

With this sort of error, no amount of tinkering with RPATH is going to help the binary find the library. Fortunately, elfedit can help us here too.

First you need to work out which element you want to modify. Back to elfedit again to dump out the structure:

% elfedit -r -e dyn:dump /bin/baz
     index  tag                value
       [0]  POSFLAG_1         0x1                 [ LAZY ]
       [1]  NEEDED            0x8e2               /home/.../libfoo.so.1

It might be further down, of course. But the entry we want to edit is index number 1. We can narrow down the output just to this element by using the -dynndx flag to the dyn:dump command, for example:

elfedit -r -e 'dyn:dump -dynndx 1' /bin/baz

or, equivalently, using dyn:value:

elfedit -r -e 'dyn:value -dynndx 1' /bin/baz

And we can actually set the value as well. This requires the -s flag to set a string, but you end up with:

elfedit -e 'dyn:value -dynndx -s 1 libfoo.so.1' /bin/baz

and then if you use elfdump or elfedit or ldd to look at the binary, it should pick up the library correctly.

This is really very simple [...]
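
To spot binaries with this libtool problem in the first place, a quick scan along these lines works (a sketch - the awk relies on the elfdump output layout shown above, and the script name is made up):

#!/bin/sh
# findbadneeded file... - report NEEDED entries containing a path
# separator, which usually means a build-tree location was embedded
for f in "$@"
do
  elfdump -d "$f" 2>/dev/null | \
    awk -v f="$f" '$2 == "NEEDED" && $NF ~ /\// { print f ": " $NF }'
done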



On Tribblix Milestone 20

2017-06-09T19:46:28.078+01:00

Having released a new update for Tribblix, I thought I would add a little commentary on the progress that's being made and the direction things are going in. This goes beyond the rather dry release notes and list of what's changed.

The big structural change is that the ISO has been built as a single root archive, rather than the old way with a split-off /usr that's lofi-mounted from a compressed image. The original reason for doing this (and I experimented with it a while ago) was to allow installation on systems without drivers for the device that you're booting from. This might be a system with only USB3 ports, or I've had problems with laptops where illumos doesn't recognize the CD drive. The boot loader (and BIOS) load the initial boot archive, so if you don't need to ever talk to the media device again you're in much better shape. While we now have USB3 support, this simplified boot is a good thing in any case, and it allows some neat tricks like iPXE boot.

Another logical change is in the release mechanism itself. I've discussed the Tribblix package repositories before. The snag with the traditional repository layout was that the packages that defined a release were in the main Tribblix repository. So, every time I made a new release I ended up having to create a whole new Tribblix repository. Every time I updated the illumos packages, I needed a new Tribblix repository. Creating a new one isn't too bad; ongoing support for multiple repositories is a lot of unnecessary work.

The way to fix this is to split out the packages (there are 3 of them) that define the properties of a release into their own separate repo. This allows at least two new possibilities:

  • I can release updated illumos packages without spinning a whole new Tribblix release. It would still use the same upgrade mechanism, but the main Tribblix repo is shared and it's a much lighter release process.
  • I could create variants or spins. For example, I could create a variant that has LX (see omnitribblix). This would just have a different set of illumos packages but share everything else. Or I could build a 32-bit or 64-bit only distro.

I haven't yet done either of those things, but it's going to happen.

Behind the scenes I've been gradually working to get more packages - especially those that deliver libraries - built as both 32-bit and 64-bit. Tribblix is fairly clear that it will continue to support 32-bit and 64-bit hardware, at least for a while. (Whereas both OmniOS and OpenIndiana have effectively dropped 32-bit compatibility, mostly by neglect rather than design.) Of course, there is a reasonable amount of software now that's only 64-bit (anything built with go, for example, or OpenJDK 8), but there's a reasonable chance the people using 32-bit hardware aren't necessarily going to want the latest and greatest applications. (This isn't 100% true, by the way - sometimes you have to interoperate with other facilities in the environment.) But eventually we're going to have to make a full 64-bit transition, and it would be good to be ready.

That gives a rough idea of the work that's currently underway. Looking ahead, there's a whole long list of packages that need adding or updating (such is a maintainer's life). The one significant place I have been falling behind is that I haven't updated gcc, so that needs work. And, of course, I'm trying to get SPARC into some sort of reasonable shape. But, overall, Tribblix is now pretty solid and a bit more polish and attention to detail would benefit it greatly.



Installing Tribblix on Vultr using iPXE

2017-06-07T16:13:23.518+01:00

One of the new features in Tribblix 0m20 is that booting and installing using iPXE now works. Here's an example of using this functionality to install a server running Tribblix in the Vultr cloud. A similar mechanism ought to work for any other provider that allows iPXE boot.

I'm assuming you have signed up and logged in, then go to deploy a server.

First choose where you want to deploy the server. I'm in the UK, so London is a good choice.

Then the critical bit, selecting the Server Type. The bit you want here is in a slightly confusing location, under the "Upload ISO" tab. But then select the "iPXE" radio button and put in the value http://pkgs.tribblix.org/m20/ipxe.txt

The other key option is Server Size. As with many providers, there's a simple scale. For testing, an instance with 1G of memory is more than adequate.

Then deploy it. After a few seconds of installing you can then click the link to manage the server, and then view the console, which uses VNC. If you're reasonably quick you get to see the initial iPXE screen, and can see it downloading the original ipxe script we specified. This looks like:

#!ipxe
dhcp
kernel /m20/platform/i86pc/kernel/amd64/unix
initrd /m20/platform/i86pc/boot_archive
boot

which just says to set up the network using dhcp (this might have already been done, but if you're booting off an ipxe iso it may not have been, so we do it anyway), then download the kernel and the boot archive, then boot from what you've just downloaded.

The kernel and the boot archive are on the iso; I've just unpacked them on the server (so the URL given above for the ipxe script will be reasonably permanent for anybody to use). The only slight tweak I've had to make is that the original boot archive is actually gzip compressed and iPXE can't handle that, so it's been uncompressed. The boot archive also now contains the /usr file system as well, rather than it being split off as before. While I'm sure you could mangle the system to download it and sort things out, it's so much easier to put it inside the boot archive.

Then you get into the normal installer, so log in as jack, su to root, and see what disk(s) are available using the new diskinfo tool. Then you can install Tribblix to that disk.

Don't bother adding additional overlays at this point. It won't work - and you'll get an error about not being able to install overlays (you'll get the error anyway because the installer always tries to add some packages that aren't needed in the live environment). This will be fixed in a future update, but it's relatively harmless.

The other thing you should do before the installation is to change the passwords for root and jack. If you change them before running the installer then the change will propagate to the installed system (because all it's doing is a copy). You really don't want the system to boot up wide open to the internet with the default (and well known) passwords.

Once the (pretty quick) install finishes, it's just like a normal install, other than the missing overlays. Then just reboot and you'll soon see the new loader, followed by the system booting.

Due to the missing overlays, you'll get an error about the intrd service failing. You'll have to log in (ssh will work at this point) and then add at least the base overlay:

zap install-overlay base

Plus whatever other overlays you might want. Then you can clear the intrd service and you're good to go.
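
Putting that post-install fixup in one place, the first login after the reboot goes something like this (the second overlay is just an example of adding more):

zap install-overlay base
zap install-overlay develop
# the intrd failure was a consequence of the missing overlays
svcadm clear intrd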



Tribblix memory requirements

2017-06-02T20:18:46.038+01:00

Compared to the other illumos distributions, Tribblix has lower memory requirements.

I'm not talking about crazy stunts like running in 48M; here I'm talking about running a fully fledged system.

I've been doing a bit of testing of the upcoming release, which includes running the install under a range of configurations. The test here is to boot the ISO image in VirtualBox with a range of memory sizes and then install the kitchen sink.
  • The live image won't boot at all on a 256M system
  • The live image will boot on a 512M system, but installing to zfs will fail
  • However, installing to ufs works on a 512M system
  • With 768M, installation to zfs is rather slow
  • With 1G or more, you're fine
The upcoming release is going to be built slightly differently, in that it's no longer a split-off /usr configuration. (I discussed how that worked and those strange zlib files some time ago.) The latest OmniOS is a single image; SmartOS likewise. It's just so much easier to construct, and far more reliable.

That change explains the 256M failure - the ramdisk is about 300M, so it simply won't fit. It's likely to have an impact on the 512M case too - in the old scenario you only paged in the bits of the /usr filesystem as and if you needed them, now it's locked into memory.

On a limited memory system there's a way to make things a bit easier. Simply install the base (no additional overlays) from the installer, then add the rest of the overlays and packages later. The point here is that running from disk doesn't lock up anywhere near as much memory as the full OS being resident in RAM does. And some of the packages in the kitchen sink are rather large, which causes problems.

Once you've got Tribblix installed, how well does it cope? Surprisingly well, to be honest. The Xfce desktop runs quite well in either 512M or 768M of memory. I can run firefox on the 768M system without too many problems (given the way it consumes memory, probably not for a long intensive browsing session), while firefox on a 512M system does run, but it's clearly starting to grind. Java applications work, some smaller ones at least. You need to be realistic in your expectations, but the point is that smaller systems do work.

The most limited systems would tend to be older, possibly 32-bit hardware. I could build a 32-bit only image which would be quite a bit smaller - maybe only two-thirds the size. (And if you really wanted to you could get it even smaller - but then you're in the realms of building custom images using mvi or the like.)

However, the aim of keeping Tribblix viable on smallish systems isn't just to allow the use of old hardware, beneficial though that is. If you're running a service on a cloud or hosting provider then being able to use a 1G server instead of a 2G server will halve your costs, and that's a very good thing to be able to do.



Tribblix SPARC progress

2017-05-29T14:59:25.973+01:00

Tribblix is one of the relatively few illumos distributions that runs on both SPARC and x86 hardware.

There are valid reasons for the lack of SPARC support in other distributions. For those backed by commercial entities, it makes no sense to support SPARC as they don't have paying customers to foot the bill. Which leaves SPARC support firmly in the hobbyist realm.

Even in Tribblix, SPARC support has lagged the x86 version somewhat. Again, for entirely predictable reasons. While I do have SPARC hardware, it's relatively slow, noisy, power hungry, and heat-producing compared to my regular x86 boxes. And my day to day use is my x86 workstation, so that drives a lot of the desktop work. But SPARC development of Tribblix hasn't stopped. Far from it, it's just naturally slower.

The current download ISO image at this time is still Milestone 16. Just to clarify the versioning here - that means it was built from exactly the same illumos commit as the corresponding x86 release. Because it took a little longer to get ready, the userland packages (such as they were) tended to be a bit newer.

There have been 3 more Tribblix releases on x86 since then. Over the winter (when it was cold and the heat output from the T5140 I use as a build server was a good thing) I tried building updated illumos versions. The T5140 I'm using to do the builds is running a cobbled-together frankendistro of bits of Tribblix, bits of OpenSXCE, some random bits from other people working on SPARC, and a whole lot of elbow grease. I managed to build illumos at the m17 and m18 release points, but m19 was a step too far (some of the native stuff assumes that the host OS isn't terribly antiquated). What this means is that I need to replace that by a current system, and get a properly self-hosting illumos build.

That modernizes the underlying illumos components a bit. What about the rest of the system? The primary effort there was to replace the old core components that had been borrowed from OpenSXCE while bootstrapping the distribution in the first place with native packages (which are then up to date and match the x86 build). Some of the components here are pretty crucial - zlib and libxml2, for instance. At one point I messed up libxml2 slightly - not enough to kill SMF (which would be a big worry) but enough to stop zones working (which, apart from indicating that I had broken it, also left me without an important test mechanism). Rebuild everything enough times and the problem eventually cleared.

I also had a go at getting my SunBlade 1500 workstation working. It's not terribly quick, but it's quiet enough and sufficiently low power that I can have it running without negatively impacting the home office. That was a bit of a struggle; the bge network driver currently in illumos doesn't work - I assume I'm seeing bug 7746 here, but the solution - to use an older version of the driver - works well enough. With that box available I not only have more testing available but also a lightweight machine that I can use to keep the package backlog under control.

Graphics on SPARC is an interesting problem. OK, so I don't expect this to be a priority, but it would be nice to have something that worked. The first problem I found (a while ago) was that some of the binary graphics drivers wouldn't work at all. For example, the m64 driver (which is what might drive the graphics in my SunBlade 2000) uses hat_getkpfnum, which was removed from illumos courtesy of bug 536. Graphics drivers that load often simply don't work, and getting an X server to start is a bit of a nightmare. After far too much manual fiddling I did manage to get a twm desktop running on the aforementioned [...]



OmniTribblix

2017-04-29T11:47:52.963+01:00

In Tribblix, it's a basic principle that I ship upstream software unmodified. I don't impose my own views on installation layout, nor do I customize it. Generally, I apply patches only to make stuff compile.

This means that what you see in Tribblix is exactly what the upstream author intended, and not some distro-specific bastardization of it.

It also makes my life easier, I don't have to maintain patches, and updating software is much easier if it's unmodified.

In particular, I use an absolutely vanilla illumos-gate. (For a long time it differed only in that I had the fix for 5188 applied, relevant because Tribblix actually uses SVR4 packaging, but now that's integrated I don't even need to do that.)

Again, this makes my life easier. (When you're maintaining a distro on your own in your spare time, making decisions that simplify your job is essential.)

But it also has another benefit: because I have no "special" features that I've added, I'm not tied to one particular version or variant or commit of illumos. Any version of illumos-gate will do just fine. When it comes time to make a release, I just clone the gate, build, and go.

What I could do, then, is build an instance of Tribblix atop some other fork of the gate. For example, illumos-omnios.

I did just that: built the gate (it needed a couple of changes to Makefiles because perl and snmp are packaged slightly differently in OmniOS than in Tribblix), created packages, built an ISO, and booted and installed it in VirtualBox.

As expected, it just works.

But just demonstrating that it works isn't really the reason I wanted to do this. What I'm really after is the LX brand, which has been integrated into current OmniOS.

Installing an LX zone requires a Linux image. The original (Joyent) work was built around their own deployment mechanism, using ZFS images; as soon as it was available in OmniOS, the first thing I did was use tarballs, which OmniOS now supports. The easiest way to create a Linux image is to create a Docker container the way you like it, and then export it to a tarball, as sketched below. I did that for Alpine and installed a zone based on that.
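
The image-creation step, sketched as commands. The container name, image tag, and paths here are illustrative, and the install flag follows OmniOS's documented usage for the lx brand, so treat the details as assumptions rather than gospel:

docker create --name lx-src alpine:latest
docker export lx-src | gzip > /var/tmp/alpine.tar.gz
docker rm lx-src

and then, with an lx-branded zone lx1 already configured:

zoneadm -z lx1 install -s /var/tmp/alpine.tar.gz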

Then you can do very simple things like:

# zlogin lx1 /bin/uname -a 
Linux lx1 4.4 BrandZ virtual linux x86_64 Linux

It's an attractive idea to simply use this as the base for the next Tribblix release. However, that requires illumos-omnios to be supported in the long term, which is currently at risk.




Noisy Tribblix

2017-04-12T20:10:37.371+01:00

I've had a couple of Tribblix users ask me why audio doesn't work.

This was something I had noticed myself, and the reason was not that audio was in some way broken, but that the permissions on the audio devices were wrong - owned and only writeable by root.

Now, I only wanted to actually get any audio out on fairly rare occasions, so a quick chown wasn't that much of an imposition. But it obviously needed fixing properly.

My assumption here is that most desktop users will be logging in through the SLiM login manager. So all I need to do is fix the permissions just before it calls setuid() to the logged in user, and then reset them once the user is done.

Now, I could have made up a bunch of chowns myself, or written a helper. There's actually code in SLiM to call ConsoleKit - but I don't have ConsoleKit, and don't really see the need to maintain a port of it just for this.

But illumos already has the capability to do this, and the normal login mechanisms use it. There's code in libdevinfo that sets the permissions according to the rules laid out in the /etc/logindevperm file. So the code is really just a call to di_devperm_login() and di_devperm_logout(), and all is well.

This also fixed another irritating bug - I can now eject memory sticks as myself, without needing to be root.

The next thing that happens, of course, is that it doesn't take very long to realise that Twitter has a lot of videos that play automatically. So I'm sitting there and I can hear either the internal loudspeaker or my headphones warbling away.

So the next thing I need is a way to shut the thing up. Historically, I used the old CDE sdtaudiocontrol, which was pretty good. (In general, I detested CDE as a desktop; the mailer and calendar were decent enough for their time, and the audio control was the only other thing I used much.) I use Xfce as my desktop; it used to have xfce4-mixer, but that's now unmaintained and deprecated (and I removed it as part of the migration from gstreamer-0.10 to gstreamer1). Which pretty much leaves the command line audio utilities in illumos, specifically audioctl. I've added the package so users who update will automatically get that as well.

The command

audioctl set-control volume 0

silences things, while

audioctl set-control volume 75

puts the volume back to normal. I've created aliases mute and unmute for those (sketched below). A more sophisticated approach would be to save the volume and restore it afterwards, but this is enough for now.[...]
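
Those aliases are one-liners. For the save-and-restore variant, audioctl can snapshot the whole control state; a minimal sketch, assuming the save-controls and load-controls subcommands behave as documented (the volume levels above are illustrative):

alias mute='audioctl set-control volume 0'
alias unmute='audioctl set-control volume 75'

audioctl save-controls /tmp/audio.state
audioctl set-control volume 0

and later, to put everything back the way it was:

audioctl load-controls /tmp/audio.state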



How old are illumos man pages?

2017-03-12T14:17:55.765+00:00

I've recently been looking at improving the state of the illumos man pages.

One thing you'll notice is that the date on some of the man pages is old - really old: some of them are dated 1990. That presumably means that they haven't been modified in any meaningful sense for a quarter of a century.

(By date here I mean the date displayed by the man command. Which isn't necessarily the date somebody last touched it, but should correspond to the last meaningful change.)
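
As a rough illustration of how such a distribution can be pulled out of the source tree - this assumes an illumos-gate checkout, and only handles pages using the mdoc .Dd macro (older pages keep their date in a .TH line instead), so treat it as a sketch rather than the exact method used here:

cd illumos-gate/usr/src/man
grep -rh '^\.Dd' . | awk '{print $NF}' | sort -n | uniq -c

Each .Dd line ends with a year, so this counts man pages by the year of their last meaningful change.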

The distribution of dates looks like this:

[Chart: distribution of illumos man page dates by year]

As you can see, the dates go all the way back to 1990. There's not just the odd one or two either; there are a decent number of man pages that come from the 1990s.

There are some obvious features in the chart above.

There's a noticeable spike in 1996, which of course significantly predates illumos or OpenSolaris. It's not entirely obvious why there should be a spike, but 100 of those man pages are related to libcurses.

I suspect the dip in 2005 is a result of the launch of Solaris 10, when everyone had a bit of a breather before development kicked into full gear again.

Then there's a drop in 2010. In fact, there's just a single page for 2010. That's when the OpenSolaris project closed, so there was little work done at that point. Also, our man pages were only integrated into illumos-gate in early 2011 (prior to which they were kept separate), and it's taken a while for man page updates to pick up again.

Of course, one reason the man pages are so old is that the software they're documenting is old. That's not necessarily a bad thing, if it ain't broke don't fix it as they say, but there is a certain amount of total rubbish that we ought to clear out.



Creating a Tribblix package repository

2017-02-02T21:11:43.914+00:00

I've previously described how Tribblix packages are built.

The output of that process should be a zap file, which is an SVR4 package in filesystem format, zipped up. The file naming convention is

PKG_NAME.VERSION.zap

where PKG_NAME is the SVR4 name of the package (you can define aliases to be more user-friendly), and VERSION has to match the installed version as shown, for example, by 'pkginfo -x'.

You can install those packages directly. There's a little helper /usr/lib/zap/instzap that will automate the process of unpacking the zap file and running pkgadd on it, or you can do it by hand.

You don't have to use my tooling. If you've got a scheme for building SVR4 packages already, then you could simply convert those.

However, what you would really want to do is stick the packages in a repository somewhere, so they can be accessed using the normal zap commands.

In Tribblix, a repository is just a web-accessible location that contains the zap files and minimal metadata. The metadata is the catalog and a list of aliases.

The aliases file contains lines with two fields separated by a vertical pipe "|". The first field is the friendly alias, the second the real name of the package. If you want multiple aliases, just add multiple lines to the file.

The catalog file contains lines with 5 fields separated by a vertical pipe "|". The first field is the package name, the second the current version, the third a space-separated list of packages this package depends on, the fourth the size of the file in bytes, and the fifth an MD5 checksum of the file. There's a trailing "|" in this case to terminate the line. The size and checksum are used to verify that the download was successful and didn't corrupt the file. If you want to be sure that it's actually the package it claims to be, then packages can also be signed.

Here's an example line in the catalog:

TRIBabook|0.5.6.0|TRIBreadline|66923|d98f77cfb3e92ee6495e3902cc46486f|

So the TRIBabook package has a current version of 0.5.6.0, depends on the TRIBreadline package, and has the given size and checksum. It's in the file called TRIBabook.0.5.6.0.zap. (A sketch of generating such a line is at the end of this post.)

The build repo has scripts that create the catalog and aliases files, which I use for convenience. They do make some assumptions about my package build pipeline, so it might be easier to manage the files by hand.

So, having got a nice place on the web that has your packages and a catalog and some aliases, how does Tribblix know to use it?

Assuming your repository is called myrepo, you need to add a file /etc/zap/repositories/myrepo.repo, containing something like the following (which is the configuration for the main tribblix repo in the milestone 18 release):

NAME=tribblix
DESC=Tribblix packages
URL=http://pkgs.tribblix.org/tribblix-m18/
SIGNED=tribblix

If you aren't signing packages (and how to add the keys to the client is an exercise for the reader) then omit the SIGNED line. The NAME in this case would be myrepo, the DESC is whatever you make it, and the URL points at the directory containing the files.

That's almost it. The other thing you need to do is add the repository to the list of repos that zap will search, which is held in the /etc/zap/repo.list file. By default, this contains

100 tribblix
200 illumos
300 oi

That's a simple list, and the number is a priority; lower numbers have higher priority - they're searched first.

(What the priority scheme means is that if you have a package with the same name in multiple repos, the one in the highest priority repo is the one that will be used. For example, the ssh packages used to come from illumos[...]
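
Because the catalog format is so simple, an entry can be generated with standard tools. A minimal sketch using the TRIBabook example from above (digest is the illumos checksum utility; the dependency list is whatever your package actually needs):

f=TRIBabook.0.5.6.0.zap
size=`wc -c < $f | tr -d ' '`
md5=`digest -a md5 $f`
echo "TRIBabook|0.5.6.0|TRIBreadline|${size}|${md5}|" >> catalog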



Package versions in Tribblix

2017-01-16T20:53:19.109+00:00

All packages in Tribblix are versioned. If you look at the pages on the package repository you can see the current version of each package in the repo. On an installed system the pkginfo -x command will give you the package description and version.

As Tribblix is created from different sources, the meaning of the package version can vary.

For illumos packages, the version string matches the Tribblix release. For example, "0.18.0" indicates the Milestone 18 (0m18) release.

For packages inherited from another distro, the version matches in some way the distro release I got the packages from. For example, the OpenIndiana packages were (at this time) cut from the oi151a9 release, and have a version string "0.9o".

For packages I build directly from source, the version string is usually the upstream version, with a build number appended. Initially the build number is 0, then it increments. If the upstream version is updated, the build number goes back to zero. So it's reasonably obvious what version of a package is installed. For example, abiword is version 2.8.6, so the first time it was built the package version was 2.8.6.0. Over time the package has needed to be rebuilt, so it's now up to version 2.8.6.4.

The sharp-eyed will notice that the illumos packages have a build number in them. This hasn't yet been used; it's there just in case.

The scheme is reasonably flexible. For example, OpenSSL has letters in its releases - like 1.0.2j - which I could keep verbatim, but in practice I convert the letter to a numeric sub-version, hence 1.0.2.10.

There are some packages for which I originally forgot to add the build number. That OpenSSL package is an example, but there are others. I've tended not to correct those as it disturbs the flow; I will if it ever becomes convenient.

Some releases have a date; this is just converted to numerical form.

One thing that should be obvious is that the scheme doesn't guarantee that package versions are numeric. They're just strings; it just happens that most packages have version numbers that are numeric or can easily be represented as such.

Also, package versions don't necessarily increase; there is no sense of ordering built into the versioning. For example (this does happen) there's an upstream version 1.2, which leads to package versions 1.2.0, 1.2.1, 1.2.2, etc. Then there's an upstream 1.2.1, which is packaged as version 1.2.1.0, which is lower than 1.2.2. And sometimes upstreams try a major version bump, then backtrack.

However, package management in Tribblix ascribes no meaning to the version numbers. Its only test for currency is this: does the version installed match the version in the repository catalog? If they're the same, then you're up to date. If not, then apply the version from the repo. (A sketch of this check follows at the end of this post.)

This then makes it easy to roll back errant packages. All I have to do is put the old version back in the catalog. Anyone who has applied the broken version will get a version mismatch and the older version will get installed whenever they update.

(This simplistic approach only works if I haven't built anything against the newer version of the package I want to roll back. But then, all I have to do is roll all those dependent packages back as well.)

Life's a little more complicated if you might want multiple versions of an application installed. In that case you have to have different packages. For example, I have separate packages for Python 2.7 and 3.6, and there might be 2 corresponding packages for any modules. I used to use multiple packages more extensively, sometimes even f[...]
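
That currency test is simple enough to sketch with standard SVR4 tooling (the package name is illustrative, pkgparam reports the installed version, and catalog here is a local copy of the repository catalog):

pkg=TRIBabook
inst=`pkgparam $pkg VERSION`
avail=`awk -F'|' -v p=$pkg '$1 == p {print $2}' catalog`
[ "$inst" = "$avail" ] || echo "$pkg: installed $inst, repo has $avail"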



Minimal illumos zones

2016-12-27T19:15:26.141+00:00

Zones, meet MVI. MVI, meet Zones.

In Tribblix, zones can be the traditional Solaris sparse-root or whole-root style, or variations such as partial-root or alien-root. There's also the option to boot a blank zone - one in which nothing (or as close to nothing as possible) is running.

In parallel development, minimal viable illumos allows you to boot illumos in 48M of RAM, or to build single purpose bootable images.

So what happens if you combine these strands of thought? Minimal illumos zones, that's what.

The general idea here is that you can use the (new) zmvix.sh script in mvi to build a tarball containing a filesystem image. This image is designed for use in zones, so contains none of the kernel components. And there's no point building an ISO image, as it never needs to be bootable of itself.

The alien-root brand in Tribblix was originally designed to build a zone from an installation ISO. The minimal zone is a similar concept, although quite a bit simpler. Unpacking a tarball is far more direct than dissecting a bootable ISO. Furthermore, it's not necessary to undo the live media customizations present on an installation image. So the zone installer just has a simple branch to the tarball unpacker or the iso unpacker depending on filename.

The whole premise of mvi is that it's minimal. However, what counts as the bare minimum depends on context. For example, a zone whose networking is provided via a shared-ip stack has no need for networking tools, as all the networking is configured for it by the global zone. So that's a major potential simplification.

On the other hand, getting zlogin working was a bit of a challenge. The first problem is that you need getent to be present in the zone. This is defined by the user_cmd element of the zone brand's config.xml file. So my zmvix.sh script explicitly adds /usr/bin/getent to the image. That's enough to get zlogin -S to work.

A full zlogin is a bit more work. That calls /usr/bin/login, which has a bunch more dependencies, including a number of pam modules. The list of files needed a bit of trial and error to obtain. So you can make a full zlogin work, but you don't need to.

While I was doing this I had a look through the zlogin source, and to say it's a massive kludge is a bit of an understatement. And when I read comments like:

It's truly amazing that there is no library function in OpenSolaris to do this for us.

then I get alarmed. There is truly weird stuff going on here, and I'm clearly not supposed to understand it.

The result of all this: if you create an image from mvi with the command

./zmvix.sh nonet node

then you end up with an 11M image file, which I can use to create a zone with

zap create-zone -z zmvi -t alien \
-I /var/tmp/zmvi.tar.gz \
-i 192.168.1.234

and if you point a browser at the zone's IP address, port 8000, you get back the page from the node server (see the sketch at the end of this post).

You can do this yourself if you check out mvi and are running a fully updated Tribblix m18.

In all, there are 4 processes associated with the zone. There's zsched, init, and the console shell, plus node. That's it.

Of course, this isn't the only way to do it. Another option would be to use the partial-root zone installer and get it to construct the zone's filesystem image the same way that mvi does, bypassing the tarball creation and unpacking.[...]
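
Checking the result from the global zone is a one-liner, using the address from the example above (assuming you have curl to hand):

curl http://192.168.1.234:8000/

which should return the page served by the node process inside the zone.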



Tribblix and the new illumos loader

2016-12-04T20:53:56.944+00:00

Recently, a new boot loader was added to illumos, which will in time replace the old and venerable grub that we've been using for about a decade. I've been looking at how this will impact Tribblix.

The boot loader's arrival was heralded long in advance. I actually released Tribblix milestone 18 when I did to ensure I didn't have to deal with any loader issues. Not that I was expecting any issues, but just in case.

The first step in looking at the impact of the new loader was to build a current copy of illumos. I had a couple of issues due to recent illumos changes. The first was that the transition to Python 2.7 didn't work with my copy of python (I need to build a dual 32/64 bit installation), so I used the old copy of python 2.6. The second was that the loader wants /usr/sfw/bin/gstrip, which I've never had, but a quick symlink set that straight.

The loader is a new package. The first thing I tried was to build an ISO exactly as I always have. This ISO knows nothing about the new loader, doesn't have the loader present, and uses grub just as it always has. If you pretend the new loader doesn't exist, everything just works the way it did before. That's encouraging as a fallback position.

Next step was to add the package for the new loader, and persuade the ISO to boot from it. This was very easy; you just need to change the path to the boot image when calling mkisofs (a fuller invocation is sketched at the end of this post). For grub, it was

-b boot/grub/stage2_eltorito

and for the new loader it becomes

-b boot/cdboot

That should be it, but it then tripped up on a Tribblix customization. The loader needs to know where the kernel and the boot archive are. The defaults are reasonable, but use $ISADIR to pick up a 32 or 64-bit image as required. On live media, Tribblix has a single merged boot archive, so I need to override the boot_archive_name to not use $ISADIR. So I create a file /boot/loader.conf.local that contains

boot_archive_load="YES"
boot_archive_type="rootfs"
boot_archive_name="/platform/i86pc/boot_archive"
boot_archive.hash_load="NO"
boot_archive.hash_type="hash"
boot_archive.hash_name="/platform/i86pc/${ISADIR}/boot_archive.hash"

and then make sure that I delete that file on the installed image, where things will look like a regular system again.

Thinking about this, it would have been more sensible to drop a file into /boot/conf.d, which is another location that the loader uses for customization. I use this for something else: I create a file /boot/conf.d/chaindisk containing

chain_disk="disk0:"

and the loader menu will have a "boot from hard disk" entry, which I think you do need on media. Again, this gets deleted from the installed system where it doesn't make any sense.

Something else you can do is tweak the branding. I've played with replacing the illumos name on the boot screen with Tribblix (look at the ascii art in /boot/forth/brand-illumos.4th for example).

To make the installed system bootable used to involve messing with installgrub; now bootadm can manage it for you. That's just

/sbin/bootadm install-bootloader -M -P rpool

and it should handle pools with multiple drives correctly.

The only other thing the installer needs to do, as far as I can tell, is initialize the list of boot environments. This is similar to grub, and involves putting 2 lines into /rpool/boot/menu.lst, for example

title Tribblix 0.19
bootfs rpool/ROOT/tribblix

and there you are. Some relatively simple changes and Tribblix is ready to use the new loader.

Well, almost. This needs to be packaged up and polished, and I still ne[...]
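
For reference, a minimal sketch of an ISO build using the new loader. The output name, volume ID, proto directory, and most of the options here are illustrative; the loader-specific part is just -b boot/cdboot with -no-emul-boot:

mkisofs -o tribblix.iso -N -l -R -U -allow-multidot \
    -V "Tribblix" -b boot/cdboot -no-emul-boot -boot-load-size 4 proto/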



zfs receive oddity

2016-10-09T21:03:13.088+01:00

Every so often, even a system as good as zfs will throw you a curveball. This one threw me for a while, and here's a simplified example.

All I'm trying to do here is replicate one file system. So I create it, and touch a file so I know it's made it.

zfs create rpool/t1
touch /rpool/t1/1

OK, snapshot it and send it.

zfs snapshot rpool/t1@t1s1
zfs send rpool/t1@t1s1 | zfs recv rpool/t2

Create another file, and create a snapshot at both source and destination.

touch /rpool/t1/2
zfs snapshot rpool/t1@t1s2
zfs snapshot rpool/t2@t2s2

And now send an incremental stream from the original.

zfs send -i rpool/t1@t1s1 rpool/t1@t1s2 | zfs recv -F rpool/t2

That works; the whole point of the -F flag is to discard any subsequent changes. (You'll usually need this if the file system is mounted at the receiver, because even access time updates count as updates that will need to be discarded.) It will roll back rpool/t2 to the original @t1s1 snapshot, discarding the local @t2s2 snapshot, then update the rpool/t2 file system to the @t1s2 snapshot.

So far so good. Now a minor variation, starting again from scratch. I create it, and touch a file so I know it's made it.

zfs create rpool/t1
touch /rpool/t1/1

OK, snapshot it and send it.

zfs snapshot rpool/t1@s1
zfs send rpool/t1@s1 | zfs recv rpool/t2

Create another file, and create a snapshot at both source and destination.

touch /rpool/t1/2
zfs snapshot rpool/t1@s2
zfs snapshot rpool/t2@s2

And now send the incremental stream just like last time.

zfs send -i rpool/t1@s1 rpool/t1@s2 | zfs recv -F rpool/t2

Kaboom. This fails, reporting:

cannot restore to rpool/t2@s2: destination already exists

What? The problem is hinted at in the zfs man page, where the description of -F says:

If receiving an incremental replication stream (for example, one generated by zfs send -R [-i|-I]), destroy snapshots and file systems that do not exist on the sending side.

The problem, then, is that zfs won't destroy the @s2 snapshot that exists at the receiver, because a snapshot of the same name exists on the source. It's not the same snapshot, of course, but it has the same name. This prevents the rollback, and the receive fails.

Snapshot name collisions are pretty common. We have an automatic snapshot regime, so pretty much every file system we have has a daily snapshot that embeds the date, and being automatic, they all have the same name.

What this means in practice is that if you have snapshots created on the receiving side, you'll have to explicitly roll the file system back to the snapshot you sent to previously, to avoid hitting name collisions (see the sketch below).

I think this behaviour is wrong, although I'm not quite confident enough to call it a bug. The point is that on the receiving side, any snapshots created after the one that was sent are irrelevant - it shouldn't matter what their names are, and I'm not at all sure why zfs even bothers checking the names of snapshots that ought to be deleted.[...]
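
The explicit rollback workaround, using the names from the failing example above:

zfs rollback -r rpool/t2@s1
zfs send -i rpool/t1@s1 rpool/t1@s2 | zfs recv rpool/t2

The -r flag destroys the receiver's more recent snapshots, including the colliding @s2, after which the incremental applies cleanly - and the -F flag is no longer needed if nothing else has touched the file system in the meantime.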



Cats versus Petals

2016-10-05T10:31:52.590+01:00

It's become common to talk about Pets versus Cattle as the "new way" of thinking about servers.

Of course, "the new way" isn't really new - many IT shops in the mid 1990s had fully automated, reproducible, and disposable infrastructure. It's just the term that has recently become trendy, and I don't think the analogy is necessarily right.

In the original analogy, the claim was that a Pet is precious, so you care for and feed it specially. If it's sick, you nurse it back to health. Whereas if one of your herd of Cattle gets sick, you take it out back and shoot it. This framing is based purely on emotional attachment, and makes little business sense. The truth is more that most Pets have little financial value, whereas Cattle are intrinsically valuable. Whether sick Cattle are nursed back to health should be a pure business decision based on the value of a healthy animal compared to the cost of treating it.

Currently, I think a more appropriate analogy would be Cats versus Petals.

Let me explain.

A Cat system has a mind of its own. In fact, it isn't at all clear whether you own the system or the system owns you. Cat systems tend to be solitary and not integrate or interoperate well with others. If you have many Cat systems, they will tend to wish to go their own ways.

In contrast, Petals will be small, simple systems. You will have many, and they will be the same. While a Petal may have some value of its own, the true beauty only becomes visible when petals are put together into larger units - flowers, for example. Different flowers are made up of different types of petals.

One point here is that if you're thinking about Pets and Cattle, you're still thinking of individual animals. With Petals, the role of holistic thinking and orchestration in producing a larger object (the flower, or even the garden) becomes clear.

In terms of terminology, your business is a garden; the services you provide are flowers; they are constructed from containers as the petals via an orchestration service that provides the stems and branches. Your job is to ensure good soil, water and light, prune, remove pests and weeds - not to create each individual Petal by hand.

If you're still herding Cats, it's time to stop and tend gardens instead.