Donald Knuth is probably the greatest of living computer scientists and
an important contributor to the open source (he authored Tex). See
Portraits of Open
Source Pioneers-- the chapter 2 is devoted to
Donald Knuth. It also
contains additional information
about this classic book and a collection of Donald Knuth
interviews
All three volumes of The Art of Computer Programming (TAOCP), are classic.
Each is IMHO a book that every CS student should try to study re-implementing
example by example. Not many will succeed to finish even half of one volume,
but if you do please buy all three of them :-).
I think the most important is to study Vol 1. It gives enough exposition
to the Donald Knuth style and brilliant thinking. It is the level of thinking
of the author that represents the main value of the book: you instantly
understand the the book was written by a great scientist and it does not
matter much that now the contents of most chapters can be significantly
improved using more modern sources. After all Vol 1 is more then a 30 years
old book (it is older then Unix) and as such it should be outdated (we all
believe in progress, don't we)...
Please note that parts of Vol. 1 on of TAOCP looks completely out of touch
with reality especially MIX assembler.
Actually MIX assembler was outdated even when the book was first published
and more reflects unique Knuth's background with IBM 650, not so much the
state of hardware development in late 60th, the period when IBM/360 was
the king of the hill.
Now IBM 650, a 1,966 lb machine that consumed almost 30 Kw of electricity
looks more like a primitive calculator than a real computer: typical installation
has the memory of just 10,000 decimal digits ( 1,000 words; 10 digit per
word).
It's really sad that Knuth did not adopt System 360 architecture and PL/360
assembler (Wirth's structured assembler for S/360) for his books but we
can do nothing about it.
But actually the statement above is not true: this is a book about timeless
truths, not the book about the resent CS fashion like Java or you name it
:-). It actually can serve as a perfect antidote agaisnt any current CS
fashion.
And Knuth does provide pseudocode with his natural language algorithm description.
The problem with a "language-like" pseudocode is that the set of control
structures is fixed and may not reflect the needs of a particular algorithms
(branching out of loop is a common problem that is not addressed by structured
programming well) and the subsequent languages advances. For example Perl
has an interesting set of control structures that is superior to C. But
even it can be improved. That's why assembler language is preferable: it
never obscures "natural" control structures for each algorithms, that one
day can be mapped into some new elegant language construct. Also as one
review noted "sometimes high level languages with all their abstractions
make things look more complex than they need be.".
Each volume is very difficult to read; you really need to work your way
thru each chapter by reimplementing the examples that Knuth gives in your
favorite language (assembler might help but it is not essential).
Mathematical considerations as for average and worst running time a particular
algorithm can be largely ignored during the first couple of years of study
of this book. Actually most mathematics in Vol. 1 can (and probably should)
be initially completely ignored. See Softpanorama Classic computer books
for more information.
On the negative side this is an overpriced book. To save money you can buy
one of the first editions: there is not that much difference in content
to justify the differences in price.
Actually the differences are so minor that are almost unnoticeable. Knuth
did an excellent work the first time he published each volume and for a
significant improvement we probably need another century and another person.
Here is one of the more interesting (and not so sycophantic :-) reviews
of the book on Amazon
This book forms a natural sequel to the material on information structures
in Chapter 2 of Volume 1, because it adds the concept of linearly ordered
data to the other basic structural ideas.
The title "Sorting and Searching" may sound as if this book is only
for those systems programmers who are concerned with the preparation
of general-purpose sorting routines or applications to information retrieval.
But in fact the area of sorting and searching provides an ideal framework
for discussing a wide variety of important general issues:
Indeed, I believe that virtually every important aspect of programming
arises somewhere in the context of sorting or searching!
This volume comprises Chapters 5 and 6 of the complete series. Chapter
5 is concerned with sorting into order; this is a large subject that
has been divided chiefly into two parts, internal sorting and external
sorting. There also are supplementary sections, which develop auxiliary
theories about permutations (Section 5.1) and about optimum techniques
for sorting (Section 5.3). Chapter 6 deals with the problem of searching
for specified items in tables or files; this is subdivided into methods
that search sequentially, or by comparison of keys, or by digital properties,
or by hashing, and then the more difficult problem of secondary key
retrieval is considered. There searching related to sorting is a surprising
amount of interplay between both chapters, with strong analogies tying
the topics together. Two important varieties of information structures
are also discussed, in addition to those considered in Chapter 2, namely
priority queues (Section 5.2.3) and linear lists represented as balanced
trees (Section 6.2.3).
Like Volumes 1 and 2, this book includes a lot of material that does
not appear in other publications. Many people have kindly written to
me about their ideas, or spoken to me about them, and I hope that I
have not distorted the material too badly when I have presented it in
my own words.
I have not had time to search the patent literature systematically;
indeed, I decry the current tendency to seek patents on algorithms (see
Section 5.4.5). If somebody sends me a copy of a relevant patent not
presently cited in this book, I will dutifully refer to it in future
editions. However, I want to encourage people to continue the
centuries-old mathematical tradition of putting newly discovered algorithms
into the public domain. There are better ways to earn a living than
to prevent other people from making use of one's contributions to computer
science.
Before I retired from teaching, I used this book as a text for a
student's second course in data structures, at the junior-to-graduate
level, omitting most of the mathematical material. I also used the mathematical
portions of this book as the basis for graduate-level courses in the
analysis of algorithms, emphasizing especially Sections 5.1, 5.2.2,
6.3, and 6.4. A graduate-level course on concrete computational complexity
could also be based on Sections 5.3, and 5.4.4, together with Sections
4.3.3, 4.6.3, and 4.6.4 of Volume 2.
For the most part this book is self-contained, except for occasional
discussions relating to the MIX computer explained in Volume 1. Appendix
B MIX computer contains a summary of the mathematical notations used,
some of which are a little different from those found in traditional
mathematics books.
Andrew Binstock and Donald Knuth converse on the success of open
source, the problem with multicore architecture, the disappointing lack
of interest in literate programming, the menace of reusable code, and
that urban legend about winning a programming contest with a single
compilation.
Andrew Binstock: You are one of the fathers of the open-source
revolution, even if you aren’t widely heralded as such. You previously
have stated that you released
TeX as open source because of the problem of proprietary
implementations at the time, and to invite corrections to the code—both
of which are key drivers for open-source projects today. Have you been
surprised by the success of open source since that time?
Donald Knuth: The success of open source code is perhaps the only
thing in the computer field that hasn’t surprised me during
the past several decades. But it still hasn’t reached its full potential;
I believe that open-source programs will begin to be completely dominant
as the economy moves more and more from products towards services, and
as more and more volunteers arise to improve the code.
For example, open-source code can produce
thousands of binaries, tuned perfectly to the configurations of individual
users, whereas commercial software usually will exist in only a few
versions. A generic binary executable file must include
things like inefficient "sync" instructions that are totally inappropriate
for many installations; such wastage goes away when the source code
is highly configurable. This should be a huge win for open source.
Yet I think that a few programs, such as Adobe Photoshop, will always
be superior to competitors like the Gimp—for some reason, I really don’t
know why! I’m quite willing to pay good
money for really good software,
if I believe that it has been produced by the best programmers.
Remember, though, that my opinion on economic questions is highly
suspect, since I’m just an educator and scientist. I understand almost
nothing about the marketplace.
Andrew: A story states that you once entered a programming
contest at Stanford (I believe) and you submitted the winning entry,
which worked correctly after a single compilation. Is this
story true? In that vein, today’s developers frequently build programs
writing small code increments followed by immediate compilation and
the creation and running of unit tests. What are your thoughts on this
approach to software development?
Donald: The story you heard is typical of legends that are based
on only a small kernel of truth. Here’s what actually happened:
John McCarthy decided in 1971 to have a Memorial Day Programming
Race. All of the contestants except me worked at his AI Lab up in the
hills above Stanford, using the WAITS time-sharing system; I was down
on the main campus, where the only computer available to me was a mainframe
for which I had to punch cards and submit them for processing in batch
mode. I used Wirth’s
ALGOL W system (the predecessor of Pascal). My program didn’t
work the first time, but fortunately I could use Ed Satterthwaite’s
excellent offline debugging system for ALGOL W, so I needed only two
runs. Meanwhile, the folks using WAITS couldn’t get enough machine cycles
because their machine was so overloaded. (I think that the second-place
finisher, using that "modern" approach, came in about an hour after
I had submitted the winning entry with old-fangled methods.) It wasn’t
a fair contest.
As to your real question, the idea of immediate compilation and "unit
tests" appeals to me only rarely, when I’m feeling my way in a totally
unknown environment and need feedback about what works and what doesn’t.
Otherwise, lots of time is wasted on activities that I simply never
need to perform or even think about. Nothing needs to be "mocked up."
Andrew: One of the emerging problems for developers, especially
client-side developers, is changing their thinking to write programs
in terms of threads. This concern, driven by the advent of inexpensive
multicore PCs, surely will require that many algorithms be recast for
multithreading, or at least to be thread-safe. So far, much of the work
you’ve published for Volume 4 of
The Art of Computer Programming (TAOCP) doesn’t
seem to touch on this dimension. Do you expect to enter into problems
of concurrency and parallel programming in upcoming work, especially
since it would seem to be a natural fit with the combinatorial topics
you’re currently working on?
Donald: The field of combinatorial algorithms is so vast that I’ll
be lucky to pack its sequential aspects into three or four
physical volumes, and I don’t think the sequential methods are ever
going to be unimportant. Conversely, the half-life of parallel techniques
is very short, because hardware changes rapidly and each new machine
needs a somewhat different approach. So I decided long ago to stick
to what I know best. Other people understand parallel machines much
better than I do; programmers should listen to them, not me, for guidance
on how to deal with simultaneity.
Andrew: Vendors of multicore processors have expressed frustration
at the difficulty of moving developers to this model. As a former professor,
what thoughts do you have on this transition and how to make it happen?
Is it a question of proper tools, such as better native support for
concurrency in languages, or of execution frameworks? Or are there other
solutions?
Donald: I don’t want to duck your question entirely. I might as well
flame a bit about my personal unhappiness with the current trend toward
multicore architecture. To me, it looks more or less like the hardware
designers have run out of ideas, and that they’re trying to pass the
blame for the future demise of Moore’s Law to the software writers by
giving us machines that work faster only on a few key benchmarks! I
won’t be surprised at all if the whole multithreading idea turns out
to be a flop, worse than the "Titanium"
approach that was supposed to be so terrific—until it turned out that
the wished-for compilers were basically impossible to write.
Let me put it this way: During the past 50 years, I’ve written well
over a thousand programs, many of which have substantial size. I can’t
think of even five of those programs that would have been enhanced
noticeably by parallelism or multithreading. Surely, for example, multiple
processors are no help to TeX.[1]
How many programmers do you know who are enthusiastic about these
promised machines of the future? I hear almost nothing but grief from
software people, although the hardware folks in our department assure
me that I’m wrong.
I know that important applications for parallelism exist—rendering
graphics, breaking codes, scanning images, simulating physical and biological
processes, etc. But all these applications require dedicated code and
special-purpose techniques, which will need to be changed substantially
every few years.
Even if I knew enough about such methods to write about them in
TAOCP, my time would be largely wasted, because soon there
would be little reason for anybody to read those parts. (Similarly,
when I prepare the third edition of
Volume 3 I plan to rip out much of the material about how to sort
on magnetic tapes. That stuff was once one of the hottest topics in
the whole software field, but now it largely wastes paper when the book
is printed.)
The machine I use today has dual processors. I get to use them both
only when I’m running two independent jobs at the same time; that’s
nice, but it happens only a few minutes every week. If I had four processors,
or eight, or more, I still wouldn’t be any better off, considering the
kind of work I do—even though I’m using my computer almost every day
during most of the day. So why should I be so happy about the future
that hardware vendors promise? They think a magic bullet will come along
to make multicores speed up my kind of work; I think it’s a pipe dream.
(No—that’s the wrong metaphor! "Pipelines" actually work for me, but
threads don’t. Maybe the word I want is "bubble.")
From the opposite point of view, I do grant that web browsing probably
will get better with multicores. I’ve been talking about my technical
work, however, not recreation. I also admit that I haven’t got many
bright ideas about what I wish hardware designers would provide instead
of multicores, now that they’ve begun to hit a wall with respect to
sequential computation. (But my
MMIX
design contains several ideas that would substantially improve the current
performance of the kinds of programs that concern me most—at the cost
of incompatibility with legacy x86 programs.)
Andrew: One of the few projects of yours that hasn’t been
embraced by a widespread community is
literate programming.
What are your thoughts about why literate programming didn’t catch on?
And is there anything you’d have done differently in retrospect regarding
literate programming?
Donald: Literate programming is a very personal thing. I think it’s
terrific, but that might well be because I’m a very strange person.
It has tens of thousands of fans, but not millions.
In my experience, software created with literate programming has
turned out to be significantly better than software developed in more
traditional ways. Yet ordinary software is usually okay—I’d give it
a grade of C (or maybe C++), but not F; hence, the traditional methods
stay with us. Since they’re understood by a vast community of programmers,
most people have no big incentive to change, just as I’m not motivated
to learn Esperanto even though it might be preferable to English and
German and French and Russian (if everybody switched).
Jon Bentley
probably hit the nail on the head when he once was asked why literate
programming hasn’t taken the whole world by storm. He observed that
a small percentage of the world’s population is good at programming,
and a small percentage is good at writing; apparently I am asking everybody
to be in both subsets.
Yet to me, literate programming is certainly the most important thing
that came out of the TeX project. Not only has it enabled me to write
and maintain programs faster and more reliably than ever before, and
been one of my greatest sources of joy since the 1980s—it has actually
been indispensable at times. Some of my major programs, such
as the MMIX meta-simulator, could not have been written with any other
methodology that I’ve ever heard of. The complexity was simply too daunting
for my limited brain to handle; without literate programming, the whole
enterprise would have flopped miserably.
If people do discover nice ways to use the newfangled multithreaded
machines, I would expect the discovery to come from people who routinely
use literate programming. Literate programming is what you need to rise
above the ordinary level of achievement. But I don’t believe in forcing
ideas on anybody. If literate programming isn’t your style, please forget
it and do what you like. If nobody likes it but me, let it die.
On a positive note, I’ve been pleased to discover that the conventions
of CWEB are already standard equipment within preinstalled software
such as Makefiles, when I get off-the-shelf Linux these days.
Andrew: In
Fascicle 1 of Volume 1, you reintroduced the MMIX computer,
which is the 64-bit upgrade to the venerable MIX machine comp-sci students
have come to know over many years. You previously described MMIX in
great detail in
MMIXware.
I’ve read portions of both books, but can’t tell whether the Fascicle
updates or changes anything that appeared in MMIXware, or whether it’s
a pure synopsis. Could you clarify?
Donald: Volume 1 Fascicle 1 is a programmer’s introduction, which
includes instructive exercises and such things. The MMIXware book is
a detailed reference manual, somewhat terse and dry, plus a bunch of
literate programs that describe prototype software for people to build
upon. Both books define the same computer (once the errata to MMIXware
are incorporated from my website). For most readers of TAOCP,
the first fascicle contains everything about MMIX that they’ll ever
need or want to know.
I should point out, however, that MMIX isn’t a single machine; it’s
an architecture with almost unlimited varieties of implementations,
depending on different choices of functional units, different pipeline
configurations, different approaches to multiple-instruction-issue,
different ways to do branch prediction, different cache sizes, different
strategies for cache replacement, different bus speeds, etc. Some instructions
and/or registers can be emulated with software on "cheaper" versions
of the hardware. And so on. It’s a test bed, all simulatable with my
meta-simulator, even though advanced versions would be impossible to
build effectively until another five years go by (and then we could
ask for even further advances just by advancing the meta-simulator specs
another notch).
Suppose you want to know if five separate multiplier units and/or
three-way instruction issuing would speed up a given MMIX program. Or
maybe the instruction and/or data cache could be made larger or smaller
or more associative. Just fire up the meta-simulator and see what happens.
Andrew: As I suspect you don’t use unit testing with MMIXAL,
could you step me through how you go about making sure that your code
works correctly under a wide variety of conditions and inputs? If you
have a specific work routine around verification, could you describe
it?
Donald: Most examples of machine language code in TAOCP
appear in Volumes 1-3; by the time we get to Volume 4, such low-level
detail is largely unnecessary and we can work safely at a higher level
of abstraction. Thus, I’ve needed to write only a dozen or so MMIX programs
while preparing the opening parts of Volume 4, and they’re all pretty
much toy programs—nothing substantial. For little things like that,
I just use informal verification methods, based on the theory that I’ve
written up for the book, together with the MMIXAL assembler and MMIX
simulator that are readily available on the Net (and described in full
detail in the MMIXware book).
That simulator includes debugging features like the ones I found
so useful in Ed Satterthwaite’s system for ALGOL W, mentioned earlier.
I always feel quite confident after checking a program with those tools.
Andrew: Despite its formulation many years ago, TeX is still
thriving, primarily as the foundation for
LaTeX. While
TeX has been effectively frozen at your request, are there features
that you would want to change or add to it, if you had the time and
bandwidth? If so, what are the major items you add/change?
Donald: I believe changes to TeX would cause much more harm than
good. Other people who want other features are creating their own systems,
and I’ve always encouraged further development—except that nobody should
give their program the same name as mine. I want to take permanent responsibility
for TeX and Metafont,
and for all the nitty-gritty things that affect existing documents that
rely on my work, such as the precise dimensions of characters in the
Computer Modern fonts.
Andrew: One of the little-discussed aspects of software development
is how to do design work on software in a completely new domain. You
were faced with this issue when you undertook TeX: No prior art was
available to you as source code, and it was a domain in which you weren’t
an expert. How did you approach the design, and how long did it take
before you were comfortable entering into the coding portion?
Donald: That’s another good question! I’ve discussed the answer in
great detail in Chapter 10 of my book
Literate Programming, together with Chapters 1 and 2 of my book
Digital Typography. I think that anybody who is really interested
in this topic will enjoy reading those chapters. (See also Digital
Typography Chapters 24 and 25 for the complete first and second
drafts of my initial design of TeX in 1977.)
Andrew: The books on TeX and the program itself show a clear
concern for limiting memory usage—an important problem for systems of
that era. Today, the concern for memory usage in programs has more to
do with cache sizes. As someone who has designed a processor in software,
the issues of cache-aware and
cache-oblivious
algorithms surely must have crossed your radar screen. Is
the role of processor caches on algorithm design something that you
expect to cover, even if indirectly, in your upcoming work?
Donald: I mentioned earlier that MMIX provides a test bed for many
varieties of cache. And it’s a software-implemented machine, so we can
perform experiments that will be repeatable even a hundred years from
now. Certainly the next editions of Volumes 1-3 will discuss the behavior
of various basic algorithms with respect to different cache parameters.
In Volume 4 so far, I count about a dozen references to cache memory
and cache-friendly approaches (not to mention a "memo cache," which
is a different but related idea in software).
Andrew: What set of tools do you use today for writing
TAOCP? Do you use TeX? LaTeX? CWEB? Word processor? And what
do you use for the coding?
Donald: My general working style is to write everything first with
pencil and paper, sitting beside a big wastebasket. Then I use Emacs
to enter the text into my machine, using the conventions of TeX. I use
tex, dvips, and gv to see the results, which appear on my screen almost
instantaneously these days. I check my math with Mathematica.
I program every algorithm that’s discussed (so that I can thoroughly
understand it) using CWEB, which works splendidly with the GDB debugger.
I make the illustrations with
MetaPost (or, in
rare cases, on a Mac with Adobe Photoshop or Illustrator). I have some
homemade tools, like my own spell-checker for TeX and CWEB within Emacs.
I designed my own bitmap font for use with Emacs, because I hate the
way the ASCII apostrophe and the left open quote have morphed into independent
symbols that no longer match each other visually. I have special Emacs
modes to help me classify all the tens of thousands of papers and notes
in my files, and special Emacs keyboard shortcuts that make bookwriting
a little bit like playing an organ. I prefer
rxvt to xterm for terminal
input. Since last December, I’ve been using a file backup system called
backupfs, which
meets my need beautifully to archive the daily state of every file.
According to the current directories on my machine, I’ve written
68 different CWEB programs so far this year. There were about 100 in
2007, 90 in 2006, 100 in 2005, 90 in 2004, etc. Furthermore, CWEB has
an extremely convenient "change file" mechanism, with which I can rapidly
create multiple versions and variations on a theme; so far in 2008 I’ve
made 73 variations on those 68 themes. (Some of the variations are quite
short, only a few bytes; others are 5KB or more. Some of the CWEB programs
are quite substantial, like the 55-page BDD package that I completed
in January.) Thus, you can see how important literate programming is
in my life.
I currently use Ubuntu Linux,
on a standalone laptop—it has no Internet connection. I occasionally
carry flash memory drives between this machine and the Macs that I use
for network surfing and graphics; but I trust my family jewels only
to Linux. Incidentally, with Linux I much prefer the keyboard focus
that I can get with classic
FVWM to the GNOME and
KDE environments that other people seem to like better. To each his
own.
Andrew: You state in the preface of
Fascicle 0 of Volume 4 of TAOCP that Volume 4 surely
will comprise three volumes and possibly more. It’s clear from the text
that you’re really enjoying writing on this topic. Given that, what
is your confidence in the note posted on the TAOCP website
that Volume 5 will see light of day by 2015?
Donald: If you check the Wayback Machine for previous incarnations
of that web page, you will see that the number 2015 has not been constant.
You’re certainly correct that I’m having a ball writing up this material,
because I keep running into fascinating facts that simply can’t be left
out—even though more than half of my notes don’t make the final cut.
Precise time estimates are impossible, because I can’t tell until
getting deep into each section how much of the stuff in my files is
going to be really fundamental and how much of it is going to be irrelevant
to my book or too advanced. A lot of the recent literature is academic
one-upmanship of limited interest to me; authors these days often introduce
arcane methods that outperform the simpler techniques only when the
problem size exceeds the number of protons in the universe. Such algorithms
could never be important in a real computer application. I read hundreds
of such papers to see if they might contain nuggets for programmers,
but most of them wind up getting short shrift.
From a scheduling standpoint, all I know at present is that I must
someday digest a huge amount of material that I’ve been collecting and
filing for 45 years. I gain important time by working in batch mode:
I don’t read a paper in depth until I can deal with dozens of others
on the same topic during the same week. When I finally am ready to read
what has been collected about a topic, I might find out that I can zoom
ahead because most of it is eminently forgettable for my purposes. On
the other hand, I might discover that it’s fundamental and deserves
weeks of study; then I’d have to edit my website and push that number
2015 closer to infinity.
Andrew: In late 2006, you were diagnosed with prostate cancer.
How is your health today?
Donald: Naturally, the cancer will be a serious concern. I have superb
doctors. At the moment I feel as healthy as ever, modulo being 70 years
old. Words flow freely as I write TAOCP and as I write the
literate programs that precede drafts of TAOCP. I wake up in
the morning with ideas that please me, and some of those ideas actually
please me also later in the day when I’ve entered them into my computer.
On the other hand, I willingly put myself in God’s hands with respect
to how much more I’ll be able to do before cancer or heart disease or
senility or whatever strikes. If I should unexpectedly die tomorrow,
I’ll have no reason to complain, because my life has been incredibly
blessed. Conversely, as long as I’m able to write about computer science,
I intend to do my best to organize and expound upon the tens of thousands
of technical papers that I’ve collected and made notes on since 1962.
Andrew: On your website, you mention that the
Peoples Archive
recently made a series of videos in which you reflect on your past life.
In segment 93, "Advice to Young People," you advise that people shouldn’t
do something simply because it’s trendy. As we know all too well, software
development is as subject to fads as any other discipline. Can you give
some examples that are currently in vogue, which developers shouldn’t
adopt simply because they’re currently popular or because that’s the
way they’re currently done? Would you care to identify important examples
of this outside of software development?
Donald: Hmm. That question is almost contradictory, because I’m basically
advising young people to listen to themselves rather than to others,
and I’m one of the others. Almost every biography of every person whom
you would like to emulate will say that he or she did many things against
the "conventional wisdom" of the day.
Still, I hate to duck your questions even though I also hate to offend
other people’s sensibilities—given that software methodology has always
been akin to religion. With the caveat that there’s no reason anybody
should care about the opinions of a computer scientist/mathematician
like me regarding software development, let me just say that almost
everything I’ve ever heard associated with the term "extreme
programming" sounds like exactly the wrong way to go...with one
exception. The exception is the idea of working in teams and reading
each other’s code. That idea is crucial, and it might even mask out
all the terrible aspects of extreme programming that alarm me.
I also must confess to a strong bias against the fashion for reusable
code. To me, "re-editable code" is much, much better than an untouchable
black box or toolkit. I could go on and on about this. If you’re totally
convinced that reusable code is wonderful, I probably won’t be able
to sway you anyway, but you’ll never convince me that reusable code
isn’t mostly a menace.
Here’s a question that you may well have meant to ask: Why is the
new book called Volume 4 Fascicle 0, instead of Volume 4 Fascicle 1?
The answer is that computer programmers will understand that I wasn’t
ready to begin writing Volume 4 of TAOCP at its true beginning
point, because we know that the initialization of a program can’t be
written until the program itself takes shape. So I started in 2005 with
Volume 4 Fascicle 2, after which came Fascicles 3 and 4. (Think of
Star Wars, which began with Episode 4.)