Archive for Languages

How many languages?

From a Globe and Mail story about the census in India (hat tip to Michael Kaan):

India concluded its national census this week, having tallied up some 1.2 billion souls, and the last night of counting focused on homeless people – of whom there are an estimated 150,000 in Delhi alone. Getting them into the count was just one in an array of staggering challenges: how to enumerate in the dozen areas under control of various armed rebel movements, and in the 572 tiny islands that make up Andaman and Nicobar; how to train 2.5 million enumerators and handle answers in 6,661 languages.

Whoa! 6,661 languages? The Ethnologue site says it has information about the 6,909 "known living languages" in the world, and lists only 438 living languages for India (for comparison: it lists 176 living languages for the United States, 86 for Canada, and 12 for the United Kingdom).

But if you look at the entries in the Ethnologue, you'll see that most languages have alternative names (sometimes a lot of them) and most languages have recognized dialects listed (sometimes a lot of them). That's probably enough to inflate the language count by more than one order of magnitude. (It's also true that "immigrant languages" — for India, the site mentions Armenian, Burushaski, Judeo-Iraqi Arabic, Northern Pashto, Uighur, Walungge, and Western Farsi — aren't included in the count, but they're probably a small contribution to the problems of the national census of India.)
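As a rough check on the "order of magnitude" remark, here is a small sketch using only the two figures quoted above. It assumes, purely for illustration, that the census figure of 6,661 arises entirely from alternative names and dialects of the 438 languages that the Ethnologue lists for India; the census may well be counting in some other way. The variable names are made up for this example.

```python
import math

# Figures quoted in the post; nothing else is assumed about the data.
census_languages = 6661   # language count reported in the Globe and Mail story
ethnologue_india = 438    # living languages the Ethnologue lists for India

# Average number of distinct labels (names plus dialects) per listed language
# that would be needed to reach the census figure, under the assumption above.
ratio = census_languages / ethnologue_india

# The same inflation factor expressed in orders of magnitude (base 10).
orders = math.log10(ratio)

print(f"labels needed per listed language: {ratio:.1f}")
print(f"inflation in orders of magnitude: {orders:.2f}")
```

On this assumption, each listed language would need roughly fifteen names or dialects on average, which is a little over one order of magnitude.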

So it all depends on how you count.

Comments (11)

New search service for language resources

It has just become a whole lot easier to search the world's language archives.  The new OLAC Language Resource Catalog contains descriptions of over 100,000 language resources from over 40 language archives worldwide.

This catalog, developed by the Open Language Archives Community (OLAC), provides access to a wealth of information about thousands of languages, including details of text collections, audio recordings, dictionaries, and software, sourced from dozens of digital and traditional archives.

OLAC is an international partnership of institutions and individuals who are creating a worldwide virtual library of language resources by: (i) developing consensus on best current practice for the digital archiving of language resources, and (ii) developing a network of interoperating repositories and services for housing and accessing such resources.  The OLAC Language Resource Catalog was developed by staff at the Linguistic Data Consortium, the University of Pennsylvania Libraries, the Graduate Institute of Applied Linguistics, and the University of Melbourne.  The primary sponsor is the National Science Foundation.

Comments (2)

Sinitic and Tibetic

In a discussion we were having about the Tibetan evidential particle yin, Nathan Hill sent me an article by Nicholas Tournadre entitled "Arguments against the Concept of 'Conjunct' / 'Disjunct' in Tibetan" from Chomolangma, Demawend und Kasbek, Festschrift für Roland Bielmeier (2008), 281-308.  As I started reading through the article with the hope of finding how yin functions as a sort of equational verb or copula, I was caught up short by some preliminary remarks about the classification of Tibetan that Tournadre makes at the beginning of his paper.

Based on his 20 years of field work throughout the Tibetan language area and on the existing literature, Tournadre estimates that there are 220 "Tibetan dialects" derived from Old Tibetan and currently distributed across five countries:  China, India, Bhutan, Nepal, and Pakistan.  In a forthcoming work, Tournadre states that these "dialects" may be classed within 25 "dialect groups," i.e., groups that do not permit mutual intelligibility.  According to Tournadre, the notion of "dialect group" is equivalent to the notion of "language," but does not entail standardization.  Consequently, says Tournadre, if the concept of standardization is set aside, it would be more appropriate to speak of 25 languages derived from Old Tibetan rather than 25 "dialect groups."

Read the rest of this entry »

Comments (32)

In the NYT

In the January 25 New York Times, two items that caught my eye:

First, a front-page piece on the Tohono O'odham Nation of southern Arizona: "In Drug War, Tribe Feels Invaded by Both Sides" (by Erik Eckholm). The tribe is pressed by drug smugglers and by federal agents, a combination that has made their lives difficult indeed.

Linguists will recognize the group as the people formerly known as the Papago (a name given them by unfriendly outsiders), whose (Uto-Aztecan) language is familiar to linguists through the work of the late Ken Hale and his student Ofelia Zepeda. Reading about the trials of the Tohono O'odham is like hearing distressing news about an old friend.

Read the rest of this entry »

Comments off


What language is this?

[Audio clip]

Hint: it's one that you know.

Read the rest of this entry »

Comments (21)

How many spoken languages? How many computer languages?

Jeff Shaumeyer wrote recently on my Facebook wall to report that

In another facebook conversation a friend said "I read that there have now been more programming languages than spoken languages of all time." Is this even remotely possible?

Mike Geis immediately fixed on one problem with the claim, the problem of counting languages, whether you're counting human languages (spoken or signed) or computer languages. While we were contemplating these well-known issues (sources that attempt to put a number on human languages give a range, things like "5,000 to 10,000", and the number of languages listed in the Ethnologue goes up with each edition; the 15th edition has 6,912 entries, but a new edition will be out soon, and it's bound to have more), Jeff posted that

the friend discovered that he dramatically misremembered the result he was paraphrasing.

(whew!) but returned to the original claim, saying,

Just by orders of magnitude I found it incredible that more computer languages/dialects could have been created in the last hundred years than the total of spoken languages/dialects that had ever been.

Here's the second problem: "of all time" in Jeff's first message, "that had ever been" in the second.

Read the rest of this entry »

Comments off