Language Log

Happy Web Day!

November 12, 2009 @ 12:11 pm · Filed by Ben Zimmer under Language and technology, Words words words

In my latest Word Routes column on the Visual Thesaurus, I consider the enormous linguistic impact of an internal memorandum published at the European Organization for Nuclear Research (CERN) on November 12, 1990. The memo, by Tim Berners-Lee and Robert Cailliau, was entitled "WorldWideWeb: Proposal for a HyperText Project," and needless to say, we've all been webified ever since. Read all about it here.

18 Comments

Dan T. said,

November 12, 2009 @ 4:10 pm

Interesting… that HTML-ified version of the memo in question was last modified in 1991 (according to the page info visible through Firefox), and the code is invalid in many ways under current HTML standards, including lack of quotes around attributes containing non-name characters.
William Lockwood said,

November 12, 2009 @ 5:18 pm

Wow, no kidding! Note the use of uppercase tags and the lack of a head/body separation, as well as the rather XML-esqe DL/DT tags.

It's hard for me to imagine three or four machines for browsing the Web costing $50,000.
Richard Howland-Bolton said,

November 13, 2009 @ 8:13 am

Wouldn't that be in Swiss francs or Euros?
(And what was the rate of exchange in 1991)
Popup said,

November 13, 2009 @ 9:38 am

Yup it mentions CHF – Swiss Francs. At the time I think they were worth a little less than a dollar. (About $0.69 if I read my tables right). Still CHF 50k == USD34k.

Computers were more expensive in those days. TBL himself was playing with a NeXTcube, which at the time retailed for closer to $10'000, and unless I'm mistaken CERN had quite a lot of Sun stations in those days.
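As a quick check of the conversion above, here is a minimal sketch. It assumes the roughly $0.69 per Swiss franc rate quoted in the comment, which is the commenter's reading of historical tables and not a verified figure; the names `CHF_BUDGET` and `USD_PER_CHF` are illustrative only.

```python
from decimal import Decimal

# Assumption: rate of about 0.69 USD per CHF, as quoted in the comment above.
USD_PER_CHF = Decimal("0.69")
CHF_BUDGET = Decimal("50000")


def chf_to_usd(amount_chf: Decimal, rate: Decimal = USD_PER_CHF) -> Decimal:
    """Convert Swiss francs to US dollars at the given rate."""
    return amount_chf * rate


usd = chf_to_usd(CHF_BUDGET)
print(f"CHF {CHF_BUDGET} is about USD {usd}")
```

At that assumed rate the result is 34,500 dollars, in line with the "USD34k" estimate.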
Frans said,

November 13, 2009 @ 10:11 am

@Dan T.:

[…] the code is invalid in many ways under current HTML standards, including lack of quotes around attributes containing non-name characters.

What "many ways" would that be, exactly? Except for the lack of a DTD and the issue you just mentioned, there is little wrong with it.

@William Lockwood:
Uppercase tags are perfectly valid HTML. The HTML, HEAD, and BODY start and end tags can be left out completely. The LI element, among others, does not require an end tag. HTML is not XML (XHTML is), but SGML.

To illustrate, the following will be valid regardless of the HTML DTD chosen. For this example I used HTML5, which should indubitably be the most recent HTML standard.

<!doctype HTML><TITLE>An example of valid HTML</TITLE><H1>This is valid HTML</H1>

Btw, what do you mean by "XML-esqe DL/DT tags"?
Frans said,

November 13, 2009 @ 10:14 am

Excusez le spam, please; WordPress seems to have mangled the output above, which I especially made to work just right with the preview. Perhaps this will come out better. If not, would someone please delete this message?
<!doctype HTML> <TITLE>An example of valid HTML</TITLE> <H1>This is valid HTML</H1>
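The point about optional tags can be poked at with a small sketch using Python's standard `html.parser` module. This is a tokenizer, not a validator, so it says nothing about conformance to any DTD; it only shows that the snippet above is read as a doctype declaration followed by TITLE and H1 elements even though HTML, HEAD, and BODY tags are omitted. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record the doctype declaration and every start tag encountered."""

    def __init__(self):
        super().__init__()
        self.doctype = None
        self.start_tags = []

    def handle_decl(self, decl):
        # Receives the text between "<!" and ">", e.g. "doctype HTML".
        self.doctype = decl

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lowercase, whatever the source case.
        self.start_tags.append(tag)


def collect_tags(markup):
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.doctype, parser.start_tags


SAMPLE = (
    "<!doctype HTML> <TITLE>An example of valid HTML</TITLE> "
    "<H1>This is valid HTML</H1>"
)

doctype, tags = collect_tags(SAMPLE)
print(doctype, tags)
```

Uppercase tag names are folded to lowercase by the parser, which matches the observation that tag case is not significant in HTML.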
Ralph Hickok said,

November 13, 2009 @ 10:17 am

Hmmmm …. November 12 is also the anniversary of professional football. On that date in 1892, William "Pudge" Heffelfinger was paid $500 to play a game for the Allegheny Athletic Association football team.

Coincidence?

……………

Yes.
Peter Taylor said,

November 13, 2009 @ 8:25 pm

@Frans: 12 errors when verified against HTML 4.01 Transitional. Three of them you've already accounted for, but there are two errors generated by <NEXTID 9>, and problems (probably) caused by the character encoding being unspecified, although possibly simply illegal characters (the ESC in line 365 looks hard to explain as a character encoding issue).
Dan T. said,

November 13, 2009 @ 8:43 pm

There are some euro signs among the authors' names (at least as displayed to me in Firefox under Windows Vista); I'm not sure what character that was intended to be.
Frans said,

November 14, 2009 @ 8:11 am

NEXTID was deprecated in HTML 2.0 or so; personally I'd never heard of it. I did say "little," not nothing. There are deprecated NAME attributes, and characters of which I really have no idea what they were supposed to convey or what encoding was intended. The main issue here, compared to HTML 2.0 and later, is that it's clearly inspired by SGML, but couldn't be correctly parsed by an SGML parser. So the way I see it, there are only three real problems, but depending on your interpretation you could say there are a few more.

- No real SGML like HTML 2.0+
  - No DTD
  - Character encoding presumably not proper for SGML (might be if it were specified)
  - Unexpected ESC character (it definitely seems senseless, but I don't think it makes a difference in validity?)
  - Unquoted "name" characters
  - (Document ends without closing tags that would have to be closed nowadays)
- Deprecated element: NEXTID
- Deprecated attribute: NAME (note, only deprecated in this particular context)

It's really not a bad score compared to this very site, google.com, etc., and then I'm talking about relatively decent pages.

I suppose it was mainly the sort of combined "older is worse" impression coupled with the "many ways" that I was objecting to. The first might just be my misinterpretation, but I object to the latter regardless (substitute "few" instead and I completely agree). If anyone thinks the page is that bad, just take a look at microsoft.com. ;)
Frans said,

November 14, 2009 @ 8:16 am

The above should be read as an unordered list, with a nested list under "No real SGML like HTML 2.0+"

Darned preview acting like such tags are allowed.
Dan T. said,

November 14, 2009 @ 12:27 pm

It also follows the obsolete convention of using the P tag as a paragraph separator rather than a container element, something that was deprecated in 1994 with HTML 2.0, but took a decade or so to be knocked out of the head of many Web developers who learned from (bad) example of existing HTML code when the Web first became popular around 1995, creating bad examples for later waves of Web developers who developed lots of ingrained bad habits and then flooded forums and newsgroups with plaintive cries about "Why aren't my CSS rules for paragraphs operating on the first paragraph of my pages?"
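The separator-versus-container distinction can be illustrated with a rough sketch built on Python's standard `html.parser`. It is a deliberate simplification, not a model of how any browser builds its document tree: it only tracks whether each run of text appears after an opening P tag. The class name, function name, and sample strings are made up for illustration.

```python
from html.parser import HTMLParser


class ParagraphTracker(HTMLParser):
    """Record each text chunk and whether it follows an open P tag."""

    def __init__(self):
        super().__init__()
        self.in_p = False
        self.chunks = []  # list of (text, inside_p) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append((text, self.in_p))


def classify(markup):
    tracker = ParagraphTracker()
    tracker.feed(markup)
    tracker.close()
    return tracker.chunks


# P used as a separator between paragraphs (the old convention).
SEPARATOR_STYLE = "First paragraph.<P>Second paragraph.<P>Third paragraph."
# P used as a container around each paragraph.
CONTAINER_STYLE = "<P>First paragraph.</P><P>Second paragraph.</P>"

print(classify(SEPARATOR_STYLE))
print(classify(CONTAINER_STYLE))
```

In the separator style the first chunk of text is never preceded by a P tag, so a style rule that targets P elements has nothing to match for it, which is the complaint quoted above.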
Frans said,

November 14, 2009 @ 1:07 pm

I did think that looked a tad odd, but the idea of using P tags to separate paragraphs never entered my mind. I interpreted it as a line that would be more appropriately marked up with some other element, and therefore wasn't marked up at all. In absence of anything better, a P element wouldn't be a bad choice though, I'd say. Besides, HTML still doesn't have some kind of summary element, so a P element, possibly with a class="summary" attribute, would probably still be the best choice today. Nice catch.
Adam said,

November 14, 2009 @ 4:23 pm

One of my sigs, from Stoll's _Silicon Snake Oil_:

Classical Greek lent itself to the promulgation of a rich culture,
indeed, to Western civilization. Computer languages bring us
doorbells that chime with thirty-two tunes, alt.sex.bestiality, and
Tetris clones.
Sili said,

November 14, 2009 @ 5:33 pm

Classical Greek lent itself to the promulgation of a rich culture

Such as alt.sex.bestiality?
Adam said,

November 14, 2009 @ 5:46 pm

@Sili: They called it "zoophilia", which sounds much better. ;-)
webula said,

November 15, 2009 @ 2:57 pm

Read the spec, everybody. Closing tags are optional for most elements, some don't even allow them. (This is HTML, not XHTML.) Omitting DOCTYPE, HTML, HEAD and BODY is perfectly legal. (The only reason DOCTYPE is still in HTML is IE<=7 quirksmode.) Attributes don't have to be quoted in most cases. The way DL, DD and DT are used is still best practice.
Dan T. said,

November 15, 2009 @ 8:49 pm

Actually, several browsers including Firefox have different standards and quirks modes based on doctypes, but that's not the reason that declaration exists; it's part of the standards for indicating which DTD the HTML follows.
