Followup to "reclaiming the Oxford English Dictionary for the public"

Kragen Sitaker kragen at
Thu Mar 16 03:37:01 EST 2006

In October, I wrote about how it would be nice for the first-edition
OED to be publicly available:

At this point I have scanned volumes 1 (A-B), 2 (C), 3 (D-E), 4 (F-G),
5 (H-K) (Paul Nguyen did the work), and parts of volume 6 (L and M,
but not yet N).  I hope to finish the 10 volumes by the end of the
week.  The volumes I have scanned so far are available in raw form
online, which is unfortunately not very practical to download.  Soon,
more practical versions of these books should be available; their home
pages are  (still incomplete)

In November, I wrote some software to make it possible to do word
lookups without relying on OCR.  I posted the software at

It's currently running at

It needs some user interface improvements.

Here are some thoughts on monetary estimates of the value of this
project:

Online access to the current edition costs US$295 per year and is
currently available to about 30 million people, for a total value of
$8.8 billion per year.  (Most of those people pay much less because
they're part of an institutional licensing scheme.)
( lists the pricing;
says, "Thousands of libraries throughout the English-speaking world
and beyond have access to the online edition - giving more than 30
million people around the world the chance to explore 'the world's
greatest dictionary'.")

The first edition that I am putting online is inferior to the current
edition in several ways: only half of it will be available worldwide
at first due to copyright law, which will create uncertainty about
whether you can find the definition of a particular word and will have
a disproportionate effect on its value; at first, only page images
rather than ASCII text will be available, as we haven't managed to OCR
it yet, and even when we do, a great deal of proofreading work will be
needed; and it's quite outdated.

If we discount the $295 per year by a yearly factor of 1.1, which is
extremely generous, we get a total of $3059 for the next 30 years,
counting the first year's payment at full value.  Adding it up to
infinity, we get $3245.  If we use a more reasonable (i.e. closer to
unity) discount factor, we get a larger value.
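
As a rough check on those two figures, here is a minimal sketch of the
sum.  The function names are made up for illustration, and it assumes
the first payment is made today and is not discounted, which is the
reading that reproduces the numbers above.

```python
# Present value of a yearly payment, discounted by a constant yearly factor.
# Assumption (not stated in the text): the first payment happens now,
# so year k is divided by factor**k for k = 0, 1, 2, ...

def present_value(payment, factor, years):
    """Sum of payment / factor**k for k = 0 .. years-1."""
    return sum(payment / factor**k for k in range(years))

def perpetuity_value(payment, factor):
    """Limit of the same geometric series as years goes to infinity."""
    return payment * factor / (factor - 1)

thirty_years = present_value(295, 1.1, 30)
forever = perpetuity_value(295, 1.1)
print(round(thirty_years), round(forever))  # prints 3059 3245
```

Passing a factor closer to 1, such as 1.05, to perpetuity_value gives a
larger total, which is the point of the last sentence above.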

Suppose we estimate the value of having access to the public-domain
part of the OED by reference to the version that Oxford has for sale,
discounted by:
- a factor of 6 to account for the fact that the people who have
  bothered to buy access at $295 per year are those who are unusually
  devoted to words;
- a factor of 3 to account for its incompleteness;
- a factor of 2 to account for its being out of date;
- a factor of 2 to account for getting page images instead of ASCII
  text.

This brings the total value of the public-domain portion down to $45
per person, or $4.09 per year per person.  Approximately 99.55% of the
world's population, or about 6.5 billion people, currently doesn't
have access to the OED.
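
The per-person figures can be reproduced with a short sketch like the
one below.  The dictionary keys are just labels I made up for the four
discounts in the list; the numbers are the ones given above.

```python
from math import prod

# The four discount factors from the list above (labels are illustrative).
discounts = {
    "buyers are unusually devoted to words": 6,
    "incompleteness": 3,
    "out of date": 2,
    "page images instead of ASCII text": 2,
}

combined = prod(discounts.values())   # 6 * 3 * 2 * 2 = 72

per_year = 295 / combined             # yearly value per person
lifetime = 3245 / combined            # discounted-to-infinity value per person

print(combined, round(per_year, 2), round(lifetime, 2))
```

The yearly figure comes out to about $4.10, which is the $4.09 above
truncated rather than rounded.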

This values the public-domain version at $26.6 billion per year, or
$293 billion overall.  (If you pick a lower discount rate, the $293
billion number becomes much larger.)  That means that every page I
scan, out of the fifteen thousand or so, produces about $19.5 million
of value for the world; that's about $9.8 billion an hour.  My hourly
wages have usually been less.
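
Here is a minimal sketch of the arithmetic behind those totals.  The
scanning rate is not stated anywhere above; the last line only backs
out the rate that the $9.8 billion an hour figure implies, so treat it
as an inference rather than a measurement.

```python
# World totals, using the per-person values derived from the discounts.
people_without_access = 6.5e9        # about 99.55% of the world's population
per_year = 295 / 72                  # dollars per person per year
lifetime = 3245 / 72                 # dollars per person, discounted to infinity

yearly_total = people_without_access * per_year
overall_total = people_without_access * lifetime

pages = 15_000                       # "fifteen thousand or so"
per_page = overall_total / pages

# Inferred, not stated: pages per hour implied by $9.8 billion an hour.
implied_pages_per_hour = 9.8e9 / per_page

print(f"{yearly_total / 1e9:.1f} billion dollars per year")
print(f"{overall_total / 1e9:.0f} billion dollars overall")
print(f"{per_page / 1e6:.1f} million dollars per page")
print(f"{implied_pages_per_hour:.0f} pages per hour implied")
```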
