DMR R.I.P.


It's possible that you don't know who Dennis Ritchie was. Even if you do, you should read some of his obituaries, and think about the ways in which he changed the world: Steve Lohr, "Dennis Ritchie, Trailblazer in Digital Era, Dies at 70", NYT; Elizabeth Flock, "Dennis Ritchie, father of C programming language and Unix, dies at 70", Washington Post; Cade Metz, "Dennis Ritchie: The Shoulders Steve Jobs Stood On", Wired News; Mark Memmott, "Dennis Ritchie, C Programmer And Unix Co-Creator, Has Died", NPR.

Johnny Truant, commenting on that last piece, contributed a tribute that Dennis would have appreciated:

#include "stdio.h"
int main(void)
{
printf("goodbye, world\n");
return 0;
}

(Though everyone who knew Dennis, or who knows what he did for the world, would object to that return value.)

During my 15 years at Bell Labs — 1975-1990 — Dennis Ritchie's office was around the corner from mine, and we interacted from time to time in ways that were always productive (at least for me), and often amusing. Here's a small example of how I remember him.

The old-time Unix file system (or at least the file systems in Version 6 and Version 7, my first exposure to Unix) limited file names to 14 characters. This limitation existed partly because computer memory was a precious resource; partly because interacting with programs and editing files on an ASR-33 was slow; but mainly, I think, because what Richard Gabriel later called "the New Jersey approach" took the view that "Simplicity is the most important consideration in a design".

The Unix culture favored short identifiers in general: programs like ed, cd, ls, cat, cc, sed, su; directory names like bin, lib, etc, dev; userids like dmr, bwk, mvm. Against this background, 14 characters is a long name; and as Brian Kernighan put it in Unix for Beginners (1979), "14 characters … is enough to be descriptive". In order to have arbitrary-length file names, you'd need to add another layer of indirection to the file-system data structures; and as Richard Gabriel later wrote about the Bell Labs Unix ethos, "All reasonably expected cases should be covered. Completeness can be sacrificed in favor of any other quality. In fact, completeness must be sacrificed whenever implementation simplicity is jeopardized."

However, Unix escaped from Bell Labs — that's part of the story of how Dennis Ritchie helped change the world — and folks in Berkeley had looser (or at least different) moral standards. By 1982 or so, the Berkeley flavor of Unix had developed a file system with arbitrary (or at least much longer) possible file names. So one day, someone sent me a tar tape that had been made on a Berkeley system. And because they'd had the bad taste to take extensive advantage of those longer-than-14-character file names,  my attempt to un-tar the tape was a disaster.

Specifically, as I recall, the overlong file names were simply silently truncated; aside from often concealing their identity and purpose, this caused later files with the same initial 14 letters to overwrite earlier ones.

So I went around the corner to discuss this problem with my colleagues in the Unix research department. Someone patiently explained to me why the 14-character limit was, on balance, a Good Thing. Someone else — certainly not Dennis — may even have suggested that tar's silent truncation of file names was the Right Thing to Do. Some inconclusive theological controversy ensued.

After talking it over with Dennis, I concluded that re-writing the V7 file system would be too much trouble, as well as a violation of local cultural norms, but that modifying tar would be both fairly easy and culturally acceptable. So I got the source code, and hacked tar so that when it encountered over-long file names, it mapped them into 14-character versions guaranteed to be unique, at least insofar as 14 alphanumeric characters permitted, and at the end it wrote out a file giving the table of correspondences between the original file names and the new ones.
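
For readers who would like to see the idea in code, here is a minimal sketch in C. It is not the original hack, whose source is not reproduced here: the function names (truncate_name, map_name, log_mapping), the scheme of ten leading characters plus a four-digit serial number, and the tab-separated table format are all assumptions made for illustration. It also handles only a single path component, and does not check whether a generated name clashes with a genuinely short name in the same directory.

```c
#include <stdio.h>
#include <string.h>

#define V7_NAME_MAX 14  /* longest name a V7 directory entry can hold */

/* Roughly what unmodified tar did: keep the first 14 characters only. */
void truncate_name(const char *name, char out[V7_NAME_MAX + 1])
{
    strncpy(out, name, V7_NAME_MAX);
    out[V7_NAME_MAX] = '\0';
}

/*
 * Hypothetical replacement scheme (an assumption, not the original code):
 * names that fit are copied unchanged; over-long names keep their first
 * 10 characters and gain a 4-digit serial number, so two over-long names
 * never map to the same result until the serial wraps at 10000.
 * Returns 1 if the name was changed, 0 otherwise.
 */
int map_name(const char *name, unsigned *serial, char out[V7_NAME_MAX + 1])
{
    if (strlen(name) <= V7_NAME_MAX) {
        strcpy(out, name);
        return 0;
    }
    snprintf(out, V7_NAME_MAX + 1, "%.10s%04u", name, *serial % 10000u);
    (*serial)++;
    return 1;
}

/* Append one "new name <TAB> original name" line to the table file. */
void log_mapping(FILE *table, const char *new_name, const char *original)
{
    fprintf(table, "%s\t%s\n", new_name, original);
}
```

Under these assumptions, documentation_part1.txt and documentation_part2.txt, which plain truncation would both turn into documentation_, come out as documentat0000 and documentat0001, and each renaming can be appended to the table file for later reference.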

This allowed me to get at whatever it was that was on that foreign tape, so I was satisfied. I sent the code around by email to some people that I thought might be able to use it, with a brief note explaining what it was good for, and expressing the hope that this solution would be acceptable "even to those stalwart puritans in the unix research department". Dennis wrote back that it was indeed acceptable, signing his response "Stal".

And for some time after that, he continued to use that nickname in private email to me.


36 Comments »

  1. C said,

    October 14, 2011 @ 8:02 am

    > Though everyone who knew Dennis, or who knows what he did for the world, would object to that return value.

    The zero return value refers to a Unix convention to use a zero return value to indicate success and non-zero values for error (so that it's possible to distinguish between different types of errors). From that perspective I saw the "return 0;" as actually quite appropriate and respectful.

    [(myl) It's true that a zero return value conventionally indicates successful process completion; but though death may be normal, it's not "success", in my opinion; and death at 70 is way too early, these days. Still, I take your point -- I was interpreting the metaphor in a less contextually appropriate way.]

  2. Rob said,

    October 14, 2011 @ 8:46 am

    Way back when I worked at a software retailer (80s, not so long ago really), we carried a pretty strong selection of programming books. Invariably we would get questions way above the level that anyone should expect of a teenage retail worker (as in, if I knew the answer to that, I'd have your programming job instead of working for just over minimum wage in a mall). One way I could manage to sound a little bit smart was to recommend Kernighan and Ritchie's book.

  3. Jonathan Badger said,

    October 14, 2011 @ 8:55 am

    In the late 1980s and early 1990s, when I was an undergraduate spending too much time on Usenet, Dennis would show up on various C and UNIX-related threads. It was always touching that he thought it worth his while to participate and correct misconceptions people had about the inner workings of either.

  4. Nicholas Waller said,

    October 14, 2011 @ 10:42 am

    In the 80s I worked for Prentice Hall, the publishers of Kernighan and Ritchie's C Programming Language. That must have been one of the more profitable books in history: slimline and one-colour and cheap to print, but selling tons (I am almost certain a million copies by its 10th anniversary and the 2nd edition in 1988) at a good price (good from the publisher's point of view, and presumably the authors').

    We were told, not sure if it is true and possibly myl might know, that the authors originally reckoned they might sell 5000 copies at most.

  5. David said,

    October 14, 2011 @ 11:09 am

    As human lifetimes go, I think his was a success.

  6. Spell Me Jeff said,

    October 14, 2011 @ 11:19 am

    Thanks for the story, Mark. I program as a hobby. Lately, it's been web stuff, and of course C is the grandfather of JavaScript. C was also the first serious language I learned to program with, so it has a special place in my own view of the universe. A copy of K&R sits on a shelf about 10 feet from where I'm sitting.

  7. Spell Me Jeff said,

    October 14, 2011 @ 11:31 am

    It's timely, but odd, that so many tributes to DMR should adduce Steve Jobs. iOS may owe a debt to C, but for the longest time the de facto Mac programming environment was Pascal. I used C to code for the Mac in the early 90s, and all the tech manuals assumed you were using Pascal. The languages are similar enough that making changes was not much of a hassle, but it still annoyed me.

  8. Allen said,

    October 14, 2011 @ 11:32 am

    That's a great anecdote!

    The 14 char limit, by the way, wasn't in the inodes–it was in the directory entry, which mapped the names of files to inodes. It was just a list of 2-byte inode numbers followed by 14 byte char strings. 4.2BSD extended the max file length to 255 characters by modifying the dirent structure. I didn't get involved in UNIX & C until the late 80s, via SunOS, but I remember encountering code that, to traverse directories, opened the directory directly and read it expecting that exact structure. System V later fixed those problems by banning direct dir reads and forcing use of readdir(2). I also remember having to be careful when writing code that would be ported to older systems, to use filenames 12 chars or less, because of additional chars added when the file was checked into an SCCS (1 char) or RCS archive (2 chars). Also, I think, the filesystem name-to-inode cache in BSD, and in the early System V's, worked optimally for files under 14 characters.
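
    The layout described above can be sketched in C as follows. This is an illustration only: the names dir_entry16, NAME_LEN, set_name and get_name are made up for the example rather than taken from any system header, and a fixed-width uint16_t stands in for the 2-byte inode number.

    ```c
    #include <stdint.h>
    #include <string.h>

    #define NAME_LEN 14  /* bytes reserved for the name in each entry */

    /* One 16-byte directory entry: 2-byte inode number, then 14 name bytes. */
    struct dir_entry16 {
        uint16_t d_ino;             /* inode number the name refers to */
        char     d_name[NAME_LEN];  /* NUL-padded; no terminator if 14 long */
    };

    /* Store a name in an entry; anything past 14 bytes is silently dropped. */
    void set_name(struct dir_entry16 *e, const char *name)
    {
        size_t n = strlen(name);
        if (n > NAME_LEN)
            n = NAME_LEN;
        memset(e->d_name, 0, NAME_LEN);
        memcpy(e->d_name, name, n);
    }

    /* Copy an entry's name out into a NUL-terminated buffer. */
    void get_name(const struct dir_entry16 *e, char out[NAME_LEN + 1])
    {
        memcpy(out, e->d_name, NAME_LEN);
        out[NAME_LEN] = '\0';
    }
    ```

    Code that read directories directly depended on exactly this fixed 16-byte record, which is why it broke once the entry format changed.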

  9. dev/ said,

    October 14, 2011 @ 11:49 am

    Thanks for that, Mark!

    @Spell Me Jeff

    I think that the Steve Jobs references are more about Mac OS X being a UNIX-based operating system than about the C language. The introduction of OS X marked a rather major milestone in the success story of Apple in the last decade.

    At least that's the way I read it.

  10. Josh said,

    October 14, 2011 @ 12:17 pm

    Ritchie's "The C Programming Language" was my first computer book, acquired when I was about 12 years old and is in no small part responsible for my career and whatever success I have achieved up to this point, almost 20 years later.

    May he now rest in a place where all his strings are NULL terminated, and all his malloc's are free'd.

  11. Theo Vosse said,

    October 14, 2011 @ 1:03 pm

    Thanks for a great obit.

  12. Dan T. said,

    October 14, 2011 @ 1:21 pm

    Dealing reasonably with long filenames in archives from elsewhere is the sort of thing that fits in the "be liberal in what you accept and conservative in what you put out" adage, popular with Internet engineers. If you ideologically favor short names, your systems should only use them in anything they create, but you still need to cope with noncompliant input without data loss.

    Personally, I find filenames with spaces and punctuation (other than the standard dots, dashes, and underscores) to be an abomination, but computer users these days seem to think differently.

    [(myl) In the 1970s, the main issue with filesystems (and one of the most important innovations in Unix) was the question of how many fundamentally different sorts of storage objects there should be. Unix had basically just one: a "file" consisting of a linear string of bytes. All files are created (and read and opened and closed, etc.) equal, at least as far as the operating system is concerned. This was definitely not true in most of the other operating systems of that time, where (for example) it might be almost impossible for a Fortran program to write a file that the Fortran compiler could read.]

  13. Steve Kass said,

    October 14, 2011 @ 1:51 pm

    The many tributes to Dennis written in C are touching, but they would be even more touching (and not cringeworthy) if they were made to render in K&R style. White space will getcha every time.

    [(myl) The only way to do that in *&%$!# WordPress -- which firmly believes that all line-initial space characters should be deleted -- is to use some non-space characters which are given the same display color as the background. In my opinion, that would be an even greater insult to Dennis than the unindented alternative.]

  14. Dunx said,

    October 14, 2011 @ 2:33 pm

    Thank you for a very touching remembrance, Mark.

    I find your anecdote particularly poignant since I did more or less the same thing. I had a tar source archive and couldn't expand it on my Acorn Archimedes because the filesystem it used – ADFS – had even stricter limits on filenames than the UNIX versions you used at the time: 11 character filenames, and no more than 72 entries in a directory.

    I called my effort tarbaby.

    I'll have to see if I still have the source for that.

  15. Theodore said,

    October 14, 2011 @ 3:18 pm

    A great story about someone who's name I know but never knew anything about.

    Also, an odd coincidence that we just lost another who could rightly be called "The Shoulders Steve Jobs Stood On", at least in the mobile phone sense: Robert Galvin.

  16. Paul Kuriakose said,

    October 14, 2011 @ 3:25 pm

    May I offer a humble correction:

    #include "stdio.h"
    int main(void)
    {
    printf("Hello Heaven & Hello Immortality\n");
    return 0;
    }

  17. Theodore said,

    October 14, 2011 @ 3:28 pm

    yikes, whose.

  18. Spell Me Jeff said,

    October 14, 2011 @ 3:44 pm

    @dev/ said,

    I think I understood it all right. It just seemed historically weird to pair DMR's very early accomplishments with Steve Jobs's very late accomplishments. Though maybe it's fair to pair what might be their most significant accomplishments.

    But all that's probably overthinking it. I'm sure the authors just leapt to the first hook that looked promising.

  19. Brian said,

    October 14, 2011 @ 5:14 pm

    I myself preferred this form:

    http://www.muppetlabs.com/~breadbox/rip-dmr.html

    (It's a little strange how easy it is to anthropomorphize, not just a computer program, but a computer language.)

  20. Carl Offner said,

    October 14, 2011 @ 8:19 pm

    I most definitely don't want to take away anything from Dennis Ritchie — he was an authentic pioneer in computer languages. I would like to say, though — as someone who learned languages roughly in the order Basic — Fortran — Pascal — C, that by the time I read K&R, I was at least unconsciously irritated by the way they seemed to take some pleasure in writing C code (usually based on pointer arithmetic, God help us) that was a few characters shorter than the straightforward way of writing it, but rather harder to understand. (And this was not an algorithmic issue — it was one of coding.) A bit later, when I became a compiler writer, I was heavily indoctrinated with the notion — which I still strongly believe — that it is the programmer's job to write clear code, and it is the compiler's job to make it run as efficiently as possible. From that point of view, those parts of K&R really served no useful purpose at all. But hey, really good compilers were still in the future at that point, and I'm sure that K&R most likely thought of C as being much closer to assembly language than we would think C is now.

  21. ErikF said,

    October 14, 2011 @ 8:26 pm

    The C language itself changed quite radically during the 30-odd years that it has been in existence, too. Originally the "hello world" program was:

    main()
    {
    printf("hello, world\n");
    return 0;
    }

    and that was it. You can't even get this to compile today unless your compiler supports "traditional" mode. It's a lot of fun to look at old code and have to decipher it (sort of like reading Shakespeare or Chaucer!)

    I remember reading my dad's first edition copy of the book and being impressed that the book was very approachable, unlike the other programming textbooks that I looked at. The authors didn't make the usual mistake of throwing all sorts of jargon into the book just to sound smart; I as a 12-year old had few problems understanding what the authors were trying to do, even though I didn't quite understand everything then (UNIX? wc? What are those?)

    Thanks, Mr. Ritchie (and Mr. Thompson), for your contributions. I don't know what programming would be today without C!

  22. Chris Hennes said,

    October 14, 2011 @ 8:50 pm

    Might I suggest "return EXIT_SUCCESS;" (if you include stdlib.h, as well, of course)

  23. Marcos said,

    October 14, 2011 @ 8:55 pm

    Didn't realize you'd worked with the Unix team, Mark. Thanks for the anecdote. I never got to meet dmr personally, unfortunately.

    Thought this was fitting: http://i.imgur.com/Q1mAf.png

    Rest in peace, Mr. Ritchie.

  24. Jonathan Lundell said,

    October 14, 2011 @ 8:59 pm

    The 16-byte allocation for a directory entry also implied a 16-bit allocation for an inode number, so a limit of 64K files per file system. Not too worrisome in the days of 10MB disk drives, but painful later on. Better than the insanely short-sighted FAT12 file system, at least.

    Pace Mr Offner, it seems to me that K&R was establishing C idioms rather than obscurity for its own sake. I'd criticize C more for being so parochially tied to the PDP-11 instruction set.

  25. Jason said,

    October 15, 2011 @ 2:08 am

    > Specifically, as I recall, the overlong file names were simply silently truncated; aside from often concealing their identity and purpose, this caused later files with the same initial 14 letters to overwrite earlier ones.

    Behaviour typical of Unix programmers. If an unexpected condition caused the computer, suddenly and without warning, to explode with the force of 2000 pounds of TNT, doubtless the stalwarts in the Unix department would argue this is the Right Thing to Do, and the user's responsibility to avoid that condition.

  26. uwe said,

    October 15, 2011 @ 8:04 am

    @Josh, strings are not NULL-terminated. NULL is a null pointer constant. The null character is abbreviated NUL. Thus, either "null-terminated" or "NUL-terminated".

  27. Joshua Bowles said,

    October 15, 2011 @ 9:13 am

    As a relatively young developer (and linguist) I am often reminded of how young the field of programming languages (and file systems) is. When you look at languages such as Erlang, Clojure, and even what are considered standard scripting languages such as Ruby or Python, it is amazing the speed and variety of evolution in programming languages (not to mention OSX, Linux, and mobile OSs). Ritchie was one of the great second-generation computer scientists (second generation to the first generation of Turing, Church, von Neumann, etc… ).

  28. KWillets said,

    October 15, 2011 @ 11:55 am

    I prefer to think of it as

    printf( "Hello, %s", *++world );

  29. Stal was his name too! « Entertaining Research said,

    October 15, 2011 @ 2:03 pm

    [...] a C programmer and GNU/LINUX fan, it would be wrong for me not to post this story about Dennis Ritchie that Mark Lieberman has shared: The Unix culture favored short identifiers in general: programs like ed, cd, ls, cat, cc, sed, [...]

  30. bks said,

    October 15, 2011 @ 8:01 pm

    ErikF is incorrect. The line:

    return 0;

    does not appear in either the first or second edition of K&R. I have them both on my desk right now.

    –bks

  31. Nick Lamb said,

    October 16, 2011 @ 3:34 am

    bks, that's true about the "Hello, world" example from section 1.1 but K&R, even in its second edition, does use return 0; from a main function which hasn't been declared to return anything in particular — an exercise of "old-style" function declaration which would seem as strange to a modern C programmer as Shakespeare's "wherefore art though Romeo?" seems to a present day English speaker.

    On page 76 of my copy, in section 4.3, while defining the main function of a toy reverse polish calculator, the last statement is return 0;

    It's perhaps more important/interesting that ErikF believes (wrongly and presumably without having tried) that this won't compile, showing that even in computer languages where we have an absolute standard to test against there may arise superstitions among the language's users about what is permitted or not, a frequent topic of Language Log.

  32. Nick Lamb said,

    October 16, 2011 @ 3:35 am

    Over-confidence through automatic spellcheck strikes again, that should of course read "thou" not "though"

  33. KWillets said,

    October 16, 2011 @ 2:20 pm

    Implicit declarations (functions without return type specified) generate warnings in C99 at least.

    The semantics of the return value of main were always a bit iffy. On unix, they were well-defined, but I believe K&R had some doubts about whether other operating systems would or should follow the convention.

    Ironically, Kernighan is cited as the author of this program, not Ritchie.

  34. Yosemite Semite said,

    October 16, 2011 @ 8:12 pm

    @myl

    My version of WordPress has (which it calls "preformatted" in its dropdown list). It has the desired behavior — sort of. It doesn't eat leading spaces, but can't do tabs, and the line spacing is weird.

  35. Yosemite Semite said,

    October 16, 2011 @ 8:14 pm

    Oops, I actually included the tag without doing the escape thing, so it got eaten. So it should read:

    My version of WordPress has <pre> …

  36. John Cowan said,

    October 17, 2011 @ 5:40 pm

    In the Brian Raiter version (which prints "goodbye, dad") the "return 0" is fitting: it is the C language itself which is saying farewell to its progenitor and then proclaiming itself a success.
