The Open Source Acid Test

Kragen Sitaker kragen-discuss@kragen.dnaco.net
Thu, 11 Feb 1999 20:01:34 -0500 (EST)


[_Computer_ editors: feel free to abridge this letter for publication.
I would be happy to flesh it out and reorganize it into a full-length
article for publication if you feel that would be helpful.]

[Rob: do you want to publish this as an editorial?]

I read Ted Lewis's article, _The Open Source Acid Test_, on your web pages.

I was appalled that an organ of a prestigious international society
like the IEEE would publish such error-riddled, poorly researched,
deliberately deceptive nonsense.  It's as if the _New England Journal
of Medicine_ had published a case study of a zombie animated by
voodoo!

The author did not cite sources for any of his dubious statistics, so
they are hard to disprove.  Given the remarkable lack of factual
accuracy in the rest of the article, I doubt that those statistics have
any basis in fact.

To begin with the most obvious errors:

- Linus Torvalds's name is not Linus Torvold.

- Applix, Tower Technology, and NewMonics do not sell open-source software.

- There is no such company as "Walnut Creek Stackware".  www.cdrom.com
	belongs to Walnut Creek CDROM.  There is no such company as
	"Tower Tech JVM".  www.twr.com belongs to Tower Technology,
	which sells a (non-open-source) JVM.  There is no such web site
	as www.debian.com.

- www.python.org is operated by the Python Software Association, not
	CNRI, although it is currently hosted on CNRI's network.

- Several of the "commercial enterprises" listed in Table 1 are not
	commercial enterprises at all.  www.hungry.com, www.python.org,
	and www.debian.org are all operated by nonprofit
	organizations.  The Corporation for National Research
	Initiatives, which was incorrectly listed as operating
	www.python.org, is actually a not-for-profit research
	organization.

- It is absurd to say that Unix was the foundation for Hewlett-Packard
	and IBM, as Lewis does in his introductory paragraph.  Both
	companies had been established for more than thirty years when
	the first line of Unix was written.

- On page 126, Lewis claims that the open-source community admits that
	its organizational structure is weak.  The evidence he adduces
	is a quote from a document published on www.opensource.org.
	What he doesn't tell you is that the document is *a leaked
	internal Microsoft memo*.  Unless Lewis missed the 115
	references to Microsoft in this document and also failed to
	read the introductory paragraphs, the only reasonable
	conclusion is that he is being deliberately deceptive.

- On page 125, Lewis claims that "Currently, Linux's installed base
	numbers 7.5 million".  As usual, he cites no source.  The most
	widely cited source for such figures is Robert Young's paper,
	"Sizing the Linux Market" -- available on the Web at
	http://www.redhat.com/redhat/linuxmarket.html -- which uses
	eight different data sources to arrive at an estimate of
	between five and ten million Linux users.  That paper, however,
	is dated March 1998.  If the number of Linux users had
	continued to double yearly in 1998, as it did from roughly 1993
	to 1998, it would now be between ten and twenty million.

- On page 128, Lewis says, "Windows NT market share smothers all Unix
	dialects combined".  According to International Data
	Corporation's Server Operating Environment report, Unix and
	Linux together had 34.6% of the server market in 1998, while
	Windows NT had 36%.  See
	http://www.news.com/News/Item/0,4,30027,00.html?st.cn.nws.rl.ne
	for more information.
	The number of Linux server shipments IDC actually tallied in
	1998 was only three-quarters of a million; that suggests that
	if you also counted people installing multiple servers from the
	same CD or installing from Internet downloads, you would find
	that Linux's server market share is much greater than Windows
	NT's.

- Lewis remarks, "With few exceptions, open source software has never
	crossed the chasm into the mainstream without first becoming a
	commercial product sold by a commercial enterprise." 
	Does he think that Linux is not a commercial product sold by
	commercial enterprises?  If he does, there are literally dozens
	of "exceptions" to this statement -- Perl, Apache, sendmail,
	BIND, Linux, Tcl/Tk, Berkeley DB, Samba, the X Window System,
	FORTH, GNU Emacs, and trn, for example.  Many of these became
	popular before they were commercially sold at all.

- Lewis misstates the business case for Linux and "its open source
	software cousins".  According to Eric Raymond -- whom Lewis
	quotes extensively elsewhere in this article -- a much more
	compelling business case is founded on the better quality of
	the software, choice of suppliers, choice of support and
	maintenance, and freedom from legal exposure and license tracking.
	More details are available at
	<URL:http://www.opensource.org/for-buyers.html>.

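As a rough check on the installed-base item above, the extrapolation
can be sketched in a few lines of Python.  The helper name
project_users and the 11-month interval (March 1998 to February 1999)
are illustrative assumptions, not figures from Young's paper; the only
inputs taken from it are the five-to-ten-million range and its date.

```python
# Estimated range of Linux users in March 1998, per "Sizing the Linux Market".
LOW_1998 = 5_000_000
HIGH_1998 = 10_000_000


def project_users(users, years, yearly_factor=2.0):
    """Project a user count forward, assuming it multiplies by
    yearly_factor every year (2.0 means doubling yearly)."""
    return users * yearly_factor ** years


# One full year of doubling gives the ten-to-twenty-million range.
low_full_year = project_users(LOW_1998, 1)
high_full_year = project_users(HIGH_1998, 1)

# Assumption: only 11 months elapsed between March 1998 and February 1999.
low_11_months = project_users(LOW_1998, 11 / 12)
high_11_months = project_users(HIGH_1998, 11 / 12)

print(f"{low_full_year:,.0f} to {high_full_year:,.0f} after 12 months")
print(f"{low_11_months:,.0f} to {high_11_months:,.0f} after 11 months")
```

Even the low end of the shorter projection comes out above Lewis's 7.5
million, provided the doubling assumption holds.
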
So far, these minor factual errors merely indicate that the author
knows very little about the topic he writes about and is deliberately
trying to mislead his readers; they do not directly undermine his
conclusions.  However, as I shall show, each of his supporting
arguments rests on incorrect facts and leads to a faulty conclusion.

One of the author's major contentions is that as Open Source software
adds more features and becomes more comparable to proprietary software,
it will lose many of its advantages.  He cites as examples Linux's
supposed lack of video card support, wireless LAN support, and "a good
selection of productivity software"; he also claims that Unix contains 10
million lines of code, while Linux contains only 1.5 million.  On page
126, he says, "Maintenance and support grow more complex, and costs
increase due to a scarcity of talented programmers.  Success leads to
features, and feature creep leads to bloated software."

With regard to video card support, it is true that the Linux kernel
does not have video card support in it.  That facility is provided by
video drivers in other software; nearly all graphical software
available for Linux uses X11 for access to those video drivers.
Open-source X11 drivers for most video cards are available from
www.xfree86.org; the list of supported cards there currently contains
555 different kinds of video cards, many of which include numerous
individual models.

For those few cards for which XFree86 support is not available,
proprietary X11 drivers are available from Xi Graphics and Metro-Link.

With XFree86, Linux's video card support is better than that of either
Windows 98 or Windows NT, and considerably more extensive than that of
any Unix that does not use XFree86.

To claim that Linux lacks video card support is simply laughable.

With regard to wireless LAN support, it is true that many of the recent
wireless LAN products do not currently have support in Linux.  However,
Linux has had support for packet-radio wireless networking and several
kinds of LANs for years, and has supported several wireless LAN
products since at least late 1997, including most of the popular
ones:

    Lucent Wavelan       DEC RoamAbout DS       Lucent Wavelan IEEE
    Netwave Airsurfer    Xircom Netwave         Proxim RangeLan2
    Proxim Symphony      DEC RoamAbout FH       Aironet ARLAN
    Raytheon Raylink     BreezeCom BreezeNet

This information is readily available on the Web in the Linux Wireless
LAN Howto.

With regard to productivity software, there are several office suites
available for Linux, and there have been for several years.  ApplixWare
and StarOffice are the two most common.

With regard to the size of Linux: the utilities tested in the
failure-rate study are the standard set of Unix utilities: awk, grep,
wc, and so forth.  (The latest report on that study is entitled "Fuzz
Revisited: A Re-examination of the Reliability of Unix Utilities and
Services", and is available on the Web at
ftp://grilled.cs.wisc.edu/technical_papers/fuzz-revisited.ps.Z; the
quote used on page 125 appears to be from the original paper, which I
cannot find on the Web.)  These utilities have a standard set of
functionality common across all Unix systems, except that the GNU
utilities tend to have a great deal of extra functionality included.
If the GNU utilities really are only one-sixth the size of the
corresponding utilities on a Unix system, yet provide much more
functionality, and still have one-third to one-sixth of the failure
rate, that is not an indictment of free software but rather a
vindication of it -- which is why this study is linked to from the
Free Software Foundation's Web pages.  If anything, the comparison is
biased in favor of the less-featureful proprietary software, and that
software still came out way behind.

(From my own experience, I know that frequently, the best workaround
for a bug in a Unix utility is to install the GNU version.)

Lewis's claim that this represents "a single-point estimate of defect
rate" is incorrect.  The paper includes detailed results of the tests
on 82 different utilities, along with aggregate statistics by operating
system.  63 of these utilities were available either from GNU or from
Linux, and were tested in this study.

With regard to the lines-of-code figure: it is not easy to measure the
number of lines of code that constitute "Linux", because it is not easy
to define what constitutes "Linux" -- or, for that matter, "Unix"
either.

If we mean just the kernel, http://www.base.com/gordoni/os-sizes.html
has some figures for the sizes of several OS kernels in 1994.  SunOS
5.2's kernel is listed as containing 680,000 lines of code, while
SunOS 5.0's kernel is listed as containing 560,000 lines of code.  If
the rate of increase per version remained constant (doubtful, because
5.0 and 5.1 weren't really finished products) then the latest SunOS
(the one that's the kernel of just-released Solaris 7) would contain
1,280,000 lines of code.
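
The extrapolation above can be reproduced with a small Python sketch.
It assumes strictly linear growth and that the kernel of Solaris 7 is
SunOS 5.7, five releases after 5.2; the function name and the two
readings of "per version" are illustrative.  Treating the whole
difference between 5.0 and 5.2 as the growth of a single version gives
the 1,280,000 figure; spreading it over two releases (5.1 and 5.2)
gives a smaller estimate.

```python
# Kernel sizes quoted in the text, in lines of code.
SUNOS_5_0 = 560_000
SUNOS_5_2 = 680_000

# Assumption: Solaris 7's kernel is SunOS 5.7, five releases after 5.2.
RELEASES_AFTER_5_2 = 5


def extrapolate(base, growth_per_release, releases):
    """Linear extrapolation: the same number of lines is added per release."""
    return base + growth_per_release * releases


growth_5_0_to_5_2 = SUNOS_5_2 - SUNOS_5_0  # 120,000 lines

# Reading A: 5.0 -> 5.2 counts as one version step.
estimate_a = extrapolate(SUNOS_5_2, growth_5_0_to_5_2, RELEASES_AFTER_5_2)

# Reading B: 5.0 -> 5.1 -> 5.2 counts as two steps of 60,000 lines each.
estimate_b = extrapolate(SUNOS_5_2, growth_5_0_to_5_2 // 2, RELEASES_AFTER_5_2)

print(f"Reading A: {estimate_a:,} lines")  # 1,280,000
print(f"Reading B: {estimate_b:,} lines")  # 980,000
```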

By comparison, the source code of the 2.2.1 Linux kernel totals
1,676,155 lines of code, including comments and blank lines, counting
only .c, .h, and .S (assembly) files.
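
A count of this kind can be reproduced with a short script.  The
following is a minimal Python sketch of the method described, not the
command actually used to produce the figure above; the function name
count_source_lines is hypothetical.

```python
import os

# Suffixes named in the text: C sources, headers, and assembly files.
# The match is case-sensitive, so ".S" does not match ".s".
SUFFIXES = (".c", ".h", ".S")


def count_source_lines(root, suffixes=SUFFIXES):
    """Return the total number of lines, comments and blank lines
    included, in all files under root whose names end in suffixes."""
    total = 0
    # os.walk does not follow symbolic links to directories by default,
    # so a linked directory is not counted twice.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symbolic links and the like
            # Read as bytes so unusual encodings cannot raise an error.
            with open(path, "rb") as handle:
                total += sum(1 for _ in handle)
    return total
```

Calling count_source_lines on the top directory of an unpacked kernel
tree should give a comparable total.  A final line with no trailing
newline is counted here, so the result can differ slightly from what
wc -l reports.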

The Linux project's source code has already reached a size at which we
would "expect Linux defect densities to get worse".  They haven't.

On page 125, Lewis cites Apache as an example of support diminishing
when "the hype wears off", saying "it is currently supported by fewer
than 20 core members" -- implying that the "cast of thousands" is a
thing of the past.  The truth is that the core Apache team has never
been larger than 20 people, and they *still* receive contributions from
many people outside the group.  He also says that "Apache is losing the
performance battle against Microsoft's IIS."  But Apache has never been
intended to be the fastest HTTP server around -- it's already more than
fast enough to saturate a T1 when running on a puny machine, so its
developers have been concentrating on things like adding more features
and making it more reliable.
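
To give a sense of scale for the T1 remark, here is a
back-of-the-envelope Python sketch.  A T1 line carries 1.544 megabits
per second; the 10,000-byte average response size is purely an assumed
figure for illustration, and protocol overhead is ignored.

```python
# Line rate of a T1 circuit, in bits per second.
T1_BITS_PER_SECOND = 1_544_000


def responses_per_second(avg_response_bytes, link_bps=T1_BITS_PER_SECOND):
    """Number of responses of the given size needed each second to fill
    the link, ignoring TCP/IP and HTTP header overhead."""
    return link_bps / (avg_response_bytes * 8)


# Assumption: an average response of 10,000 bytes (hypothetical figure).
rate = responses_per_second(10_000)
print(f"about {rate:.1f} responses per second fill a T1")
```

Under those assumptions, a server needs to sustain only about 19
responses per second to fill the line.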

On page 128, Lewis says, "The concept of free software is a frequently
practiced strategy of the weak".  While free-as-in-price giveaways are
common -- Microsoft's Internet Explorer strategy is a perfect example
-- they are not related to open-source software, and their patterns of
success and failure have little relevance for us here.

-- 
<kragen@pobox.com>       Kragen Sitaker     <http://www.pobox.com/~kragen/>
Computers are the tools of the devil. It is as simple as that. There is no
monotheism strong enough that it cannot be shaken by Unix or any Microsoft
product. The devil is real. He lives inside C programs. -- philg@mit.edu