Martin MC Brown

Distributions and standardization

By Martin MC Brown
October 03, 2005 10:50 AM EDT
If there's one aspect of Linux that has led to its popularity, it is the ability of a suitably enthused individual to produce their own distribution. It has spawned thousands of different solutions and, in turn, has led to the creation of numerous tools and products that we all find useful; the RPM package management system was introduced to help install the packages that made up the Red Hat system. Today, most software is distributed in RPM format, even if your system isn't necessarily Red Hat based. The distribution model of Linux has also spawned companies: Red Hat, SuSE and others would not exist without Linux.

What made all of this possible? Well, the free and open source nature, of course. Because we can use, modify, combine and redistribute different products together, we can easily produce a distribution that contains the elements we want.

Linux is, as I've explained before, not really an OS, just the kernel that allows other bits to work. The Linux operating system is really a distribution of the Linux kernel and all the rest of the software that makes it work: the compiler, file system utilities, shells, user interfaces and so on. Collectively (and technically incorrectly) we call this collection 'Linux'. (While we're on the topic, I totally agree (again) with Eric Boutilier's recent comment that Linux is another Unix, although it's easy to see why the media in particular get confused, something I commented on last week.)

This incorrect labeling leads to a series of other problems. First up is the statement that Linux is one of the big four operating systems: Windows, Mac OS X, Unix and Linux. It isn't; Linux is part of the big three: Windows, Mac OS X, Unix. Of course, we could go one step further here and suggest that, since Mac OS X is itself based on another Unix (FreeBSD), really there are just the big two, Windows and Unix. The counter to that is that I can't (yet) take an application designed for Mac OS X and run it, verbatim, on Linux. Or Solaris. Or BSD.
Even if I was using x86 hardware that wouldn't be possible. So while we can lump them all together with a generic term, we still need to be specific when we talk about compatibility between Unix variants. I'm getting off my topic here into a separate thread, however; let's return to the original point...

Another aspect of the incorrect labeling (remember the original thread about Linux not being an OS, even though that's how we refer to it) is that the distributions covered by that 'global' description are not necessarily compatible with one another. I've talked about this before, but the fundamental issue is that for all the flexibility offered by distributions within the Linux space, we have also complicated the deployment of software. That's something that is changing with the Linux Standard Base, another topic I've discussed before.

In reality, of course, using the term Linux is a convenience. If we said 'Red Hat Enterprise Linux is one of the few thousand main operating systems' it somehow isn't quite as snappy a statement, and it fails to convey that, although there *are* thousands of distributions using the Linux kernel, they are all based around the same suite of tools and functionality.

Despite all these issues (i.e. confusing terms, massive distribution spread, and occasional incompatibility), distributions are a good thing. They show that Linux is about choice. I think it's highly likely that, longer term, the myriad choices will be combined and concentrated into a smaller number of distributions with wider appeal. We may even see distributions being designed to target specific areas while, through the Linux Standard Base, retaining compatibility.

I love the fact that I can show people Linux from a LiveCD like Kubuntu, but I can continue to use Gentoo. They are both distributions using the Linux kernel, and they support the same range of software, although I might have to compile it myself.
They also, through the customization that makes a distribution, have the ability to be optimized for the work they do. For a programmer, few distributions compare with the completely source-based approach of the Gentoo distribution. Similarly, Kubuntu is an amazing desktop-driven OS, Linux or otherwise, that I could easily give to my mother and have her sending email and browsing websites, even though she has been a Windows user for years.

Actually, while we're on the topic of Windows, I find it interesting that Longhorn will be available in a number of different versions. Reports range from seven to thirteen, all aimed at different groups of users. Before it was one size fits all; now we're looking at customized versions for different groups of potential users. With Linux, that's a facility we've had for years, assuming you were willing to look for it, but I find it interesting that Microsoft no longer feel that the one-size-fits-all approach is appropriate.

But I'm digressing from the point again. Back to distributions, Linux and the confusing use of Linux to refer to thousands of operating systems that share a common base.

What is now interesting is the launch of OpenSolaris. Here we have the public, source-based launch of an operating system with possibly the greatest history of commercial development and deployment, and an amazing existing customer base that uses it in everything from the network servers that help run the Internet to the massive servers which produce your credit card statements. Solaris has an amazing history in terms of where it has been used, who it has been used by, and what that legacy means to its users. After years of Solaris being a commercial operating system (and an expensive one for the desktop user, although much more cost effective at a server level), Sun just released the source code more or less freely (depending on how you want to read the CDDL and how you plan to use Solaris; I'm not, in this piece, going to touch that topic).
Solaris, compared to most OSes, is old, but while in many situations you may consider that age makes it unattractive, from a business perspective that history makes it a wise choice. Linux is still the new kid on the block, despite being just ten years younger than components in the Solaris operating system. I'm sure that much of the popularity of Linux can be attributed to the fact that people sense and class Linux as 'new', and assume it must therefore be better than the 'old' operating systems of Unix and Windows.

Of course, Linux is, as Eric pointed out, just another flavour of Unix. It may be an independent one without the SVR4 or BSD lineage, but the basic structure of the kernel, filesystems, file layout, interface, shells, applications and most of the components that make us use it have their history in Unix. At thirteen years old, Linux still has teething troubles, even on the commodity hardware it was designed for (see my recent review for more details).

Going back to Solaris, it has 23 years of serious commercial development, and that means a dedicated team of programmers, not a group of enthusiastic volunteers. That translates as 23 years of optimizing and improving the OS. Years of trials and tests to determine the most sensible layout of files, components and applications. Ultimately, 23 years of one group of programmers and project managers coming up with an agreed standard about where everything should be, what should be there, and how it should all work together.

In short, Solaris is, by definition, a Unix operating system with a standard set of rules for how it works. And it has years of history to back up the decisions about how things should work.

The Linux Standard Base is an interesting project, because it is trying to standardize these decisions (file locations, library compatibility and so on) among dozens of existing, well-used operating systems.
Each of them may be a Linux distribution, but each of them may well have its own idea of how to do things. Each distribution will have its own idea of what is right, and that is going to cause friction. Not wanting to put a dampener on the proceedings, but care will need to be taken to ensure that the crusade to standardize Linux doesn't end up removing the very customization that makes the different distributions so attractive.

Back, again, to Solaris, and we're talking about an OS which has already been through the standardization process. Many of the problems that the LSB project is trying to address have already been decided. Now Solaris is available in source form, as OpenSolaris, and is open to the sort of customization previously available only with Linux and, to a less popular extent, BSD.

OpenSolaris is now beginning to spawn the idea of distributions. First we had SchilliX, a LiveCD distribution which lets you run Solaris entirely from CD. Now we have another LiveCD distribution from Moinak Ghosh called BeleniX. If we follow the Linux model, different distributions may lead to differences in the approach, configuration and compatibility of the core operating system. So if OpenSolaris starts to become available in different distributions, does that mean we're going to see the same problems that I've described in Linux-based distributions?

When I wrote about standardizing Linux, Eric asked what I thought about the issues of standardization in OpenSolaris. So with the release of these distributions, are we going to see the same fragmentation of the original (Solaris) base and the same issues that we find with Linux distributions: incompatibilities and inconsistencies?

It is too early to answer that conclusively (we only have two distributions; well, three if you include the original), but there is a fundamental difference between OpenSolaris and Linux, and it goes back to my earlier descriptions of Linux.
Linux is just a kernel which, when combined with other FOSS tools, can be released as a distribution. It is the differences between distributions, not the Linux kernel, which often cause problems.

By contrast, OpenSolaris is an operating system. To be fair, in these early stages OpenSolaris is primarily the kernel, along with the libraries, networking and core commands that make up the base of the Solaris OS. Longer term, more of the commands, libraries and components of the Solaris OS will be released. This is not a set of disparate components put together to make a distribution that just happens to include the Linux kernel and which, as a whole, makes a Linux-based OS. OpenSolaris (and Solaris, on which it is based) is an operating system that includes all of the components that are normally required 'add-ons' within a Linux distribution.

With Linux, we have potentially thousands of distributions where, through the Linux Standard Base, we are going to try to produce a group of standards that makes software compatible with any Linux-based distribution. That's going to be a mammoth task. With OpenSolaris we are going the other way: OpenSolaris already has the standards, or at least it should have once more of the code has been released and we have to rely less on the various FOSS tools that provide some of the functionality missing from the OpenSolaris release. Longer term, OpenSolaris is going to have the standards that the Linux Standard Base is trying so hard to achieve. That's going to give OpenSolaris a massive advantage when it comes to releasing software and providing compatibility among OpenSolaris-based distributions.

Hopefully that answers the core of Eric's question, but it isn't the full answer to the issue of standardization. You see, OpenSolaris-based distributions are still going to suffer from the issues of installing FOSS software and where that ends up.
If you want to run an Apache httpd web server on your OpenSolaris-based distribution, there is nothing stopping you from installing the entire application and support files in /etc.

What we need next is standardization of the applications that we run on these operating systems: Linux-based, OpenSolaris-based or indeed any other variety of Unix we choose, including Mac OS X, Solaris, AIX and the numerous BSD releases. That type of standardization goes beyond the work that the Linux Standard Base will achieve and far beyond the work and history that Sun has put into Solaris. I think there's plenty of scope for the OpenSolaris team to put work into coming up with standards for all the applications you might install and then adjusting the source code and installation rules of these applications. Done right, even if you do download and install Perl from source, under OpenSolaris it installs in the 'right' place. This already happens for some applications under Mac OS X; I see no reason that the principles shouldn't be extended to OpenSolaris, or indeed Linux.

Postscript

I nominate two new terms into the consciousness so that we can differentiate between distributions:
  • Linux-based - used to refer to OS distributions based on Linux
  • OpenSolaris-based - used to refer to OS distributions based on OpenSolaris
It won't have been me that used them first, but I will do my best to continue using them in my future posts.