Around the water cooler
Gene Buckley quoted this sentence from James R. Glenn, "The Sound Recordings of John P. Harrington: A Report on Their Disposition and State of Preservation", Anthropological Linguistics 33(4): 357-366, 1991:
[NAA] also anticipates that, once data editing is complete, information about both the Harrington sound recordings and photographs will be available on INTERNET, to which the Smithsonian recently subscribed.
Gene noted that the use of INTERNET with no article is an interesting relic of 1991 usage, and observed that for him, ARPANET never made the transition to usage with "the" (or, apparently, to lower case).
I remarked idly that "on INTERNET" is like "on Facebook", "on Google", or "on Language Log", and that when some elderly politician talks about looking something up "on the Google", you know that he doesn't quite Get It.
Geoff Pullum observed that
So Google, Language Log, and cyberspace are like Amsterdam and Vanuatu, while the Internet is like the Hague and the Solomon Islands. As I put it back in 2007 "Language Log is strong".
Now, my default hypothesis is that this is a genuinely arbitrary syntactic distinction. There's no explanation; the functionalists who (doubtless) will run around in circles trying to find a subtle semantic link between all strong proper names, and a subtle distinction between them and weak proper names, will be wasting their time. "The Internet" is a weak proper name, so the definite article is obligatory. End of story.
The question is whether anyone can propose anything to be said about this topic that can enliven my (admittedly rather dull) description and give it some semantic rationale or explanatory oomph. I am betting (though only a modest amount) that the answer is no.
Comments are nevertheless open.
[Further LL discussion can be found here, here, and here. And here.]
[And even if the distinction is an arbitrary one, there may be something to be learned by documenting the process by which recent words changed category, as internet apparently did.]
alex boulton said,
March 10, 2010 @ 11:00 am
You can use it without the determiner in French: je l'ai trouvé sur internet. But I don't know what that tells us.
Army1987 said,
March 10, 2010 @ 11:29 am
In Italian it's always without the article, and always with a capital I. On the other hand, it appears to have switched grammatical gender (from feminine to masculine) sometime in the last decade or so, though I'm not entirely sure of that.
Has anyone ever noticed a correlation between the "strength" of proper names and whether they contain a common noun? My impression is that usually they take no article if they don't (e.g. Italy), and an article if they do (e.g. the United States).
[(myl) There are certainly exceptions in both directions: "The Hague" and "The Azores", but "Northern Liberties" and "Lost River".]
Army1987 said,
March 10, 2010 @ 11:33 am
I seem to remember a film where a CIA member was asked "Why don't you use the article before CIA?" and replied "Do you use any article before God?"
Daan said,
March 10, 2010 @ 11:34 am
And in Dutch, both op het internet 'on the internet' and op internet 'on [the] internet' work, as does op Google 'on Google'. But not *op de Google 'on the Google'.
John said,
March 10, 2010 @ 11:39 am
@Army1987:
A few cases where your rule-of-thumb doesn't work: anything of the form "X Mountain," "X Hill," or "X Valley" (where X is a variable); King of Prussia (a city in Pennsylvania).
But your rule does seem to cover many common cases (e.g. "X Island," "X Peninsula," and others). For King of Prussia, at least, there's a plausible functionalist explanation for it being an exception (wouldn't want to confuse the city with Frederick II).
John said,
March 10, 2010 @ 11:43 am
Also: it seems that British-style names of public houses usually consist of common noun phrases and are usually weak, e.g. "The Fox and Hound," "The Greene King," etc.
Supposedly the city "King of Prussia" was named after an old pub whose name somehow transferred to the entire town. I wonder if in the transition it went from weak to strong.
[(myl) That's apparently what happened to Lost River, West Virginia, which Wikipedia describes as "an unincorporated community on the Lost River in eastern Hardy County, West Virginia". On the other hand, it's not clear whether Turkey Bone, West Virginia, was named after anything once spoken of as "the turkey bone".]
Mr. Shiny & New said,
March 10, 2010 @ 11:46 am
One thing that may explain the particular usage of "The Internet" is that at some point computer industry people used the word "internet" to describe certain kinds of networks but "The Internet" was THE internet, the big one, that everyone was connected to, whereas the other networks were just "AN internet".
http://en.wikipedia.org/wiki/Internet_capitalization_conventions
NW said,
March 10, 2010 @ 11:47 am
Interestingly, Solomon Islands, like Maldives, is officially strong. It seems to make it more like a single entity rather than a geographical collection. Of course neither they nor The Gambia consistently use the right strength even in official text.
Mike said,
March 10, 2010 @ 11:50 am
Back when the internet started, there were all sorts of named networks like Arpanet, Bitnet, and usenet. These received their names from their creators and sponsors. The internet was seen as something tying them together, not as a network in itself. The term was descriptive, like "the highway", and not a name at all.
Rob Malouf said,
March 10, 2010 @ 11:51 am
I vaguely remember it always being "the Internet". I was a VAX/VMS system manager in the late 80's and there were lots of networks we had to deal with: ARPAnet, CSnet, BITNET, HEPnet, UUCP, etc., plus the various physical networks like DECnet and Ethernet. The Internet wasn't exactly a new network; it was a scheme to connect and regularize a couple of the existing networks. But then by 1991 I was deeply embedded in Montague Grammar and wasn't paying close attention to computer networks, so I might have this wrong.
As an aside, I've always wondered what would happen if we went back to having to manually route email messages. Lots less spam, I'll bet.
greg said,
March 10, 2010 @ 11:51 am
One thing that may explain the particular usage of "The Internet" is that at some point computer industry people used the word "internet" to describe certain kinds of networks but "The Internet" was THE internet, the big one, that everyone was connected to, whereas the other networks were just "AN internet".
http://en.wikipedia.org/wiki/Internet_capitalization_conventions
Interesting that the article only refers to "the ARPANET" and not just to "ARPANET"
peter said,
March 10, 2010 @ 11:53 am
Washington insiders talk of "CIA", while most everyone else in the world calls it "the CIA".
Ginger Yellow said,
March 10, 2010 @ 11:53 am
I have memories of "internet" often being capitalised in the (British) press for quite some time after the word was in common usage. Pretty much everyone seemed to switch 7-10 years ago. Someone with better corpus searching tools than I have could find out when exactly it happened.
MattF said,
March 10, 2010 @ 11:54 am
There's this.
MattF said,
March 10, 2010 @ 11:55 am
No, I mean here's this.
Ellen K. said,
March 10, 2010 @ 11:55 am
With "internet", I suspect many people think of it as a common noun, thus the "the", just as we'd say "the table", "the door", etc, when we have one specific table or door that is clearly being referenced.
Circeus said,
March 10, 2010 @ 12:09 pm
@Alex Boulton: Ah, but you're forgetting the huge amount of peeving over which preposition to use between "sur" and "dans", and the fine line between the Web and the Internet. I can't be arsed to remember which word takes capitals either. OQLF note: (my translation) "The preposition sur is very frequent, even though dans has the advantage of removing the ambiguity found in such expressions as trouver un renseignement sur Internet, where it's not clear whether what is meant is "about the Internet" or "using the Internet"."
This is a red flag, of course. Classical example of ambiguity virtually never present in context (after all, the same ambiguity is found with English "on" and nobody ever complains about it).
D. Sky Onosson said,
March 10, 2010 @ 12:17 pm
As an aside on placename naming conventions re: Turkey Bone, W. Va., there are also quite a number of similar examples of "common noun phrases" used as placenames in Canada without an article, among them Moose Jaw, Medicine Hat, and the not-exactly-common Head-Smashed-In Buffalo Jump. Interestingly the first two names were derived in different ways: Moose Jaw is apparently simply an anglicized approximation of a Cree phrase meaning "warm place by the river", while Medicine Hat is a translation of the Blackfoot name for the location. As for Head-Smashed-In, you can read about that yourselves…
jfruh said,
March 10, 2010 @ 12:42 pm
Speaking of geographic names with this feature, I seem to recall that "the Ukraine" became just plain "Ukraine" right around the time that the USSR broke up. I suppose this is some legacy of the fact that the name was originally in Russian/Ukrainian/Ruthenian a common noun meaning "borderland"? Wikipedia notes that the shift happened but doesn't offer even the usual half-baked Wikisplanations as to why. It does mention a few other countries that this phenomenon happened to, including Sudan/the Sudan, where the shift, as near as I can recall, has happened more slowly.
Matty said,
March 10, 2010 @ 12:47 pm
My personal recollection is that, previously the idea was there were many internets. "The internet" as we know it, was first called the "world wide web". It was too fine a technical distinction to explain to lay people that while the www was an internet, internet had other meanings (several intranets and/or independent servers hooked together). Whether it's because the internet is simply shorter and easier to say than WWW, or because it's more inclusive ("the internet" generally referring to all internet type systems, whereas WWW being more specifically the central dns accessible server), it seems to have won out in usage.
greg said,
March 10, 2010 @ 12:55 pm
d sky onosson – and of course, there is always Bat Cave, NC, named after a local mountain, named because of a local cave, named because bats lived there. so very original ;-)
jfruh – I believe that the shift came about because the new Ukrainian government either legislated or requested media agencies to refer to the nation without the preceding "the". this was partially done to emphasize Ukraine's distinctness and independence as opposed to the Soviet "the Ukraine" references which suggested that it was simply a general area that was subordinate to the greater whole and not an area with an actual national identity.
Army1987 said,
March 10, 2010 @ 1:02 pm
@Matty: No, the WWW is a narrower concept. Simplifying, it's what you access using a web browser, whereas the Internet also comprises e-mail, Usenet, file-sharing networks, instant messengers, and all that.
Mark Gould said,
March 10, 2010 @ 1:05 pm
@Matty, I am afraid your recollection is erroneous in at least one important respect. WWW is one protocol available on the internet. Others include SMTP (e-mail), gopher, NNTP (Usenet news — also available on other networks). The internet existed before Tim Berners-Lee invented WWW.
Peter Harvey said,
March 10, 2010 @ 1:08 pm
In Spanish you find something 'en Internet'. In English it's a thing. In Spanish it's a place.
Dan T. said,
March 10, 2010 @ 1:16 pm
On the other hand, the World Wide Web is capable of encompassing other protocols besides its native HTTP; there are URI schemes such as "mailto:", "news:", "telnet:", and so on. However, those things, and the Internet, predated the birth of the Web.
Licia said,
March 10, 2010 @ 1:32 pm
As mentioned by Army1987, in Italian Internet is always without any article but Facebook is often preceded by a determinative article, e.g. "sul Facebook" is becoming more and more common.
Andrew (not the same one) said,
March 10, 2010 @ 1:47 pm
I tend to agree with Mike and Ellen K that 'the internet' in common usage is not a name. I think it's interesting that the quoted article refers to 'INTERNET to which the Smithsonian recently subscribed'. At that time it was a service one could subscribe to, one among other such services – now, since the other nets have been swallowed up, the internet is just the whole collection of inter-computer connections. (I'm not disputing Prof. Pullum's claim that there is no principled distinction between strong and weak names; it's just that I don't think 'the internet' illustrates this point.)
Regarding the internet and the Web – while those who have pointed out the distinction are in principle absolutely right, I think the distinction is fading, because other internet protocols are nowadays increasingly accessed through the Web, so whatever one does on the net one also does on the Web, and there is no real point keeping them apart.
jfruh said,
March 10, 2010 @ 1:53 pm
@Andrew: I think it's interesting that the quoted article refers to 'INTERNET to which the Smithsonian recently subscribed'. At that time it was a service one could subscribe to, one among other such services – now, since the other nets have been swallowed up, the internet is just the whole collection of inter-computer connections.
I'm not sure if that's quite the case. It may just be that "subscribe" was the best verb the author or contemporary users could come up with to describe a novel act, that of getting the Smithsonian's computers connected to this new network. Today you might still say that you "subscribe" to AOL or to Verizon's DSL service or something, which provides a connection to the Internet; at the time the article was written, before there were intermediary ISPs, a government agency like the Smithsonian might have paid for a direct connection to the Internet backbone, which might be reasonably described as "subscribing to the Internet" (or "to INTERNET").
Boris said,
March 10, 2010 @ 1:53 pm
@Peter,
I'm not sure you can conclusively say whether the internet is a "thing" or a "place" in English (I can't speak for Spanish). While you are on the internet like you would be on the bus, you log in and out like you might enter or leave a store (you can also log on and off, but that just adds to the ambiguity). The tongue-in-cheek "I'm from the Internet" seems to rely on the Internet being a place (then again "I'm from the CIA" is also perfectly normal. Is the CIA a place? Certainly not. Is it a thing? Define thing). The internet is also everywhere, so it can't really be a place or a thing. Or is it?
Matty said,
March 10, 2010 @ 1:57 pm
I did say the internet is more inclusive than the web, I just didn't want to get into too many specifics (because the discussion is about general usage of the common term).
When the average person says "the internet" they are not considering differences between news:, WWW, telnet servers, torrents, ftps, etc.
I pointed to the world wide web as an ancestor of "the internet" as a common term, not to conflate the meaning of the two terms. Before the web, there was not a very popular awareness of internet technologies. Dialing into a telnet server and so on were niche activities using more technical and exact distinctions than are allowed by reference to some all-encompassing internet.
Anyway, referring to the internet (and somewhat similarly, the web) in the definite article denotes a general rather than specific or technical reference. Conversely, google, facebook, language log, etc. have retained their uniqueness, and don't need an article to qualify them.
Brett said,
March 10, 2010 @ 3:14 pm
@Mark Gould- Actually, Tim Berners-Lee thought Web URLs should start with "http://web."
Mark F said,
March 10, 2010 @ 3:21 pm
The Methodist Home for Children used to be an orphanage in Raleigh, NC. When they changed their model to that of a community-based agency, they changed the name to just Methodist Home for Children.
I think Geoff is right that which proper nouns are weak is determined by accidents of history, but I'm not sure that needs to be the end of the story. Using the article is certainly more characteristic of common nouns, so you could look at the kinds of accidents of history that can lead to the "The" being there. I suspect there's a regularization pressure towards The-dropping in place names, for instance, and there's also the phenomenon of people making a conscious decision to drop the The to make the name sound more like a name. Are heavily-used weak proper nouns more likely to stay weak?
Ellen K. said,
March 10, 2010 @ 3:33 pm
@Matty. The Internet predates the World Wide Web by a couple decades or so.
Matty said,
March 10, 2010 @ 3:44 pm
You sure, by my distinction? ARPANET etc, predate world wide web by a long time, but I would think they are only "an internet", and there were many such internets at the time.
I would agree that today's general usage of "the internet" would also refer to such historical networks, since it's an overarching term… "The internet" existed back then, I just never heard it referred to as such, and didn't think the phrase was in usage… the whole premise of this language log article being that the phrase didn't standardize until approximately the 90s?
Mark P said,
March 10, 2010 @ 5:42 pm
SIPRNet (Secret Internet Protocol Router Network) doesn't take the article. It's verbed as sipr: "I'll sipr that to you." When wikipedia spells out the name, it uses "the."
Quite a few government agencies with acronyms don't take an article when referred to by inmates.
Doug Cutting said,
March 10, 2010 @ 6:24 pm
Freeway names are stronger in northern California than southern, where one hears of "the five" and "the two ten". Speaking of freeway names, is my dad, a N. CA native, the only one who calls it "a hunnert 'n one"?
Simon Musgrave said,
March 10, 2010 @ 6:47 pm
Should we posit a category of 'super-strong' proper nouns for the ones which insist on having 'The' with a capital? In my neck of the woods, The University of Melbourne likes this style, leading to some laughter from people at other institutions in town, and the organisation now known as Opera Australia was formerly The Australian Opera to the extent that when I worked for them in the 1980s we used TAO as the abbreviation.
Do people actually have a perception that a capitalised article adds prestige somehow?
Ellen K. said,
March 10, 2010 @ 6:49 pm
Okay, so the Internet is older than the term "internet". Still, the term internet goes back to 1985 or 1986, before the web, and I don't know the history of it coming into common usage, but I recall it always including email, which was, back then, mostly separate from the web.
Fred said,
March 10, 2010 @ 6:52 pm
Well, if you travel back and forth from Manhattan to the Bronx enough times you'll understand… Um, how to get around New York City, and if you're mentally quick, you might get to understand a new language, but you probably won't understand more about definite articles. Or even articles on linguistics, from Internet.
Brian said,
March 10, 2010 @ 8:13 pm
No. The Internet is not a destination. It is a means. You don't get in your car because you want to visit the interstate; you get on the interstate because you want to go to California.
David said,
March 10, 2010 @ 8:46 pm
In 1994, Swedish Prime Minister Carl Bildt sent an e-mail to Bill Clinton and his Vietnamese colleague, to congratulate them on the lifting of the US trade embargo against Vietnam. These were the first e-mails ever sent between heads of government. In his response (which at some point could be found on http://www.bildt.net but not anymore), the Vietnamese president (or his translator) consistently wrote INTERNET in block letters.
In Swedish, incidentally, "Internet" is treated as a proper name. Halfhearted attempts by language planners to get people to say "internät", which would require the definite form "internätet" when you talk about "the Internet", have not come to fruition. (Unlike Danish, where this has already happened, it would require the lengthening of the final vowel and a change from accent 1 to accent 2.) The same thing with Facebook, though there the hypocoristic "Fejan" (from "feja", old Stockholm slang for "face", borrowed from English) is always in the definite.
Julie said,
March 10, 2010 @ 9:30 pm
Doug: If your father's from north of the Bay Area, it's likely that people in his town just call it "the highway," or maybe "the freeway." That leaves a lot of room for individual usage on those occasions when it's necessary to be more specific. I grew up on 1, not 101 (both proper names), and certainly it was just "the highway (common noun)." Most northern California towns are on a single or dominant highway, so that's not ambiguous.
In cities, there are local names, like San Francisco's Bayshore Freeway. All that said, I know it as "one-oh-one"….but I wouldn't put it past my father to express it exactly as yours does.
Jean said,
March 10, 2010 @ 10:46 pm
Would some examples from 1989 be interesting? In January, 1989, Bryan Koch, who was then president of the ACM, wrote about "computers attached to Internet"; in June, EH Spafford, discussing the Internet worm released the previous year wrote "…the Internet was attacked…".
[(myl) From the Chicago Tribune, Jan 23, 1990: "The worm he designed immobilized an estimated 6000 computers linked to Internet, including ones at NASA, military facilities and major universities."
Or Vin McLellan, "Data Network to Use Code to Insure Privacy", NYT 3/20/1989:
]
Garrett Wollman said,
March 11, 2010 @ 12:25 am
@myl: I'm not sure that the press clippings are especially relevant, except perhaps as illustration that the reporters (or their copyeditors) did not understand the distinction that was being made for "the Internet". Certainly the examples you cite show signs of failing to comprehend more than just the name of the network. (That link is broken, by the way, so I can't easily check the original article. Your quotation makes me wonder just WTF the reporter could be talking about.)
In Internet Experiment Note #2 (IEN 2), Jon Postel writes "An analogy may be drawn between the internet situation and the ARPANET." This is from 1977; the Network Working Group was working on the transition from NCP (the Network Control Protocol, the underlying network-layer protocol of the original ARPANET) to something newer which would work across networks. In this document, one of the foundational documents in the history of the Internet, Postel makes the argument that an "internet protocol" should be interposed between the link layer and the end-to-end services provided by TCP. This idea developed into the Internet Protocol, proper noun, or IP; by January, 1983, TCP and IP together replaced NCP on the ARPANET.
In IEN 32, John Davidson introduces the term "catenet" for the general class of networks-of-networks ("internet" was and is a more restrictive class of networks-of-networks that follow Dave Clark's famous [but described later] "hourglass model"). I note that OED2 does not note it (I shall have to send Margot Charlton an email about it.)
John Cowan said,
March 11, 2010 @ 1:10 am
United States of America is also officially strong, but is weak in ordinary speech and writing.
My view is that weak names are applied ab origine to bodies of water, to regions of indefinite extent, and to names (like the North Island in N.Z.) that were originally merely descriptive. Weak names, however, may be transferred to other kinds of things, and the connection may be lost.
stormboy said,
March 11, 2010 @ 8:37 am
@John: "A few cases where your rule-of-thumb doesn't work: anything of the form "X Mountain," "X Hill," or "X Valley" (where X is a variable); King of Prussia (a city in Pennsylvania)."
How about 'the Hunter Valley' in NSW, Australia? To me it sounds odd without the definite article. Or is that not what you meant?
Lane said,
March 11, 2010 @ 9:27 am
I agree with Geoff that the distinction (Vanuatu vs the Solomons) probably arises for no reason. I'm wondering though if we then think about the strong versus weak nouns differently.
Countries that have historically been "the _____" have, it seems, tried to ditch the "the". Some of them were colonized, or quasi – The Gambia, The Yemen, The Ukraine. All seek to be known simply as Gambia, Yemen and Ukraine these days. It's like they see the "the" as demeaning.
Interestingly, Arabic does about a half-and-half mix that one has to learn by rote: al-Yemen, al-Iraq, al-Urdun (Jordan), al-Jaza'ir (Algeria), al-Magreb (Morocco), al-Sa'udia, al-Kuwait, al-Sudan, but Suriya, Lubnan, Misr (Egypt), Libya, Dubai, Tunis, Oman, Bahrain…. No idea why.
Army1987 said,
March 11, 2010 @ 11:07 am
@Licia: I had only heard that when there also was a possessive, e.g. il mio facebook meaning "my Facebook account" or "my Facebook profile". (Where are you from? I have the impression that proper names with article are more common in northern Italy than here.)
Johanne D said,
March 11, 2010 @ 11:09 am
As EN-FR translators, we had to determine this sort of thing and I remember there was a clear explanation as to why we should use Internet (proper noun) rather than l'Internet. It's the first I hear of strong and weak proper nouns – I don't know if that distinction exists in French. Here is what l'Office québécois de la langue française has to say on the matter of capitalization, and by extension the definite article:
"c'est une simple question de point de vue : si l'on considère Internet comme une entité unique (nom propre), on choisit la majuscule, et si l'on considère l'internet comme un média parmi d'autres (la télévision, la radio, la presse, etc.), on choisit la minuscule (nom commun)." http://www.olf.gouv.qc.ca/ressources/bibliotheque/dictionnaires/internet/fiches/8361867.html
Army1987 said,
March 11, 2010 @ 4:28 pm
Don't you say le Québec with an article but Paris with no article?
stormboy said,
March 11, 2010 @ 6:25 pm
@Army1987: "Don't you say le Québec with an article but Paris with no article?"
With the article, "Québec" refers to the province; without, it refers to the city. Paris doesn't take the definite article.
Maureen said,
March 12, 2010 @ 11:29 am
I always thought "the Internet" was like "the ocean". You get on it and go places, and there are lots of islands in it, but it's certainly a place. And I still capitalize it, and probably always will.
KevinM said,
March 12, 2010 @ 3:18 pm
In high school, we learned about "The Gambia." One student asked, "Are the people who live there The The Gambians?"
Peter Harvey said,
March 12, 2010 @ 4:40 pm
The diplomatic Conventions that were signed in The Hague at the end of the 19th century are known as the Hague Conventions not the The Hague Conventions. At least, that is the, errr, convention.
Aaron Davies said,
March 12, 2010 @ 11:30 pm
i dug into this last time it came up, and found a military standard from the early 80's that used "internet", lowercase, in the sense of "a connection between networks", and only used "the internet" to mean "the internet currently under discussion". i would assume that this is the sense used by postel in IEN 2.
der said,
March 13, 2010 @ 8:36 am
Slightly off-topic, but related: what is Apple trying to achieve with their article-less iPhone policy? (As in "use iPhone to read your mail".) They seem to be assuming that there is some semantic effect? Or maybe they just want to be different.
Will Thompson said,
March 14, 2010 @ 2:03 pm
When Google announced that they might quit China, I noticed that English language statements from Chinese spokespeople consistently used “Internet” without ”the”. This article, for instance, contains the statement “China has tried creating a favorable environment for Internet”.
sharon said,
March 15, 2010 @ 4:17 pm
I should look this up on Separated by a Common Language, where I'm pretty sure I've seen it discussed – "hospital" in British English vs "the hospital" in American English. Eg, we'd say "my mum's in hospital" and Americans would say "she's in the hospital".
Scott said,
April 7, 2010 @ 10:22 pm
ARPAnet, UUNET, BITNET is/were specific things, like I-5 or Route 66.
The internet is a collection of things, like the highway system. The only thing that holds it together is an agreement about addressing and routing packets. Those packets could go anywhere, over any kind of network, translated and mangled and repackaged any way one might dream up. And there could certainly be more than one, but since the whole point of the Internet Protocol is to connect dissimilar networks, it tends to have that effect almost universally.
The Internet is, to my mind, an anomalous capitalization.