Archive for Languages

Too much Victor Mair

I've been reading way too much Victor Mair. In the restaurant of my hotel in London I just saw an English girl wearing a T-shirt on which it said this:


And I immediately thought, who is Ho Pe?

Read the rest of this entry »

Comments off

English and Mandarin juxtaposed

Comments (23)

No word for "privacy" in Russian?

Reader and fan Will Thompson wrote to Mark Liberman, who passed his letter on to me, about a recent article by Ellen Barry in The New York Times, discussing a book by the Russian political analyst Nikolai V. Zlobin in which he explains weird/different American cultural norms to Russians.

Will notes that towards the end, the reviewer states:

He [Zlobin] devotes many pages to privacy, a word that does not exist in the Russian language[.]

And Will is suspicious of that claim.

Read the rest of this entry »

Comments (49)

High school language exams in students' native languages

High school principals in the UK are discovering that immigrants can be a very useful resource for them. Schools are rated according to the number of passes their students obtain in the General Certificate of Secondary Education (GCSE). There are 619,000 immigrants from Poland now living in the country, and Polish is available as a GCSE examination subject.

Polish is now the 5th most popular language to take at GCSE level. And 95% of those taking it gain one of the top 3 of the 9 grades (a much higher percentage than for languages like French or Spanish). Moreover, 97% of those who take Polish score worse on the English Language exam. The inference to draw is clear, and very probably true: schools are pushing Polish native speakers to take the exam, because it pushes up the school's GCSE rating.

Read the rest of this entry »

Comments (37)

How many languages?

From a Globe and Mail story about the census in India (hat tip to Michael Kaan):

India concluded its national census this week, having tallied up some 1.2 billion souls, and the last night of counting focused on homeless people – of whom there are an estimated 150,000 in Delhi alone. Getting them into the count was just one in an array of staggering challenges: how to enumerate in the dozen areas under control of various armed rebel movements, and in the 572 tiny islands that make up Andaman and Nicobar; how to train 2.5 million enumerators and handle answers in 6,661 languages.

Whoa! 6,661 languages? The Ethnologue site says it has information about the 6,909 "known living languages" in the world, and lists only 438 living languages for India (for comparison: it lists 176 living languages for the United States, 86 for Canada, and 12 for the United Kingdom).

But if you look at the entries in the Ethnologue, you'll see that most languages have alternative names (sometimes a lot of them) and most languages have recognized dialects listed (sometimes a lot of them). That's probably enough to inflate the language count by more than one order of magnitude. (It's also true that "immigrant languages" — for India, the site mentions Armenian, Burushaski, Judeo-Iraqi Arabic, Northern Pashto, Uighur, Walungge, and Western Farsi — aren't included in the count, but they're probably a small contribution to the problems of the national census of India.)

So it all depends on how you count.

Comments (11)

New search service for language resources

It has just become a whole lot easier to search the world's language archives.  The new OLAC Language Resource Catalog contains descriptions of over 100,000 language resources from over 40 language archives worldwide.

This catalog, developed by the Open Language Archives Community (OLAC), provides access to a wealth of information about thousands of languages, including details of text collections, audio recordings, dictionaries, and software, sourced from dozens of digital and traditional archives.

OLAC is an international partnership of institutions and individuals who are creating a worldwide virtual library of language resources by: (i) developing consensus on best current practice for the digital archiving of language resources, and (ii) developing a network of interoperating repositories and services for housing and accessing such resources.  The OLAC Language Resource Catalog was developed by staff at the Linguistic Data Consortium, the University of Pennsylvania Libraries, the Graduate Institute of Applied Linguistics, and the University of Melbourne.  The primary sponsor is the National Science Foundation.

Comments (2)

Sinitic and Tibetic

In a discussion we were having about the Tibetan evidential particle yin, Nathan Hill sent me an article by Nicholas Tournadre entitled "Arguments against the Concept of 'Conjunct' / 'Disjunct' in Tibetan" from Chomolangma, Demawend und Kasbek, Festschrift für Roland Bielmeier (2008), 281-308.  As I started reading through the article with the hope of finding how yin functions as a sort of equational verb or copula, I was caught up short by some preliminary remarks about the classification of Tibetan that Tournadre makes at the beginning of his paper.

Based on his 20 years of field work throughout the Tibetan language area and on the existing literature, Tournadre estimates that there are 220 "Tibetan dialects" derived from Old Tibetan and currently distributed across five countries:  China, India, Bhutan, Nepal, and Pakistan.  In a forthcoming work, Tournadre states that these "dialects" may be classed within 25 "dialect groups," i.e., groups that do not permit mutual intelligibility.  According to Tournadre, the notion of "dialect group" is equivalent to the notion of "language," but does not entail standardization.  Consequently, says Tournadre, if the concept of standardization is set aside, it would be more appropriate to speak of 25 languages derived from Old Tibetan rather than 25 "dialect groups."

Read the rest of this entry »

Comments (32)

In the NYT

In the January 25 New York Times, two items that caught my eye:

First, a front-page piece on the Tohono O'odham Nation of southern Arizona: "In Drug War, Tribe Feels Invaded by Both Sides" (by Erik Eckholm). The tribe is pressed by drug smugglers and by federal agents, a combination that has made their lives difficult indeed.

Linguists will recognize the group as the people formerly known as the Papago (a name given them by unfriendly outsiders), whose (Uto-Aztecan) language is familiar to linguists through the work of the late Ken Hale and his student Ofelia Zepeda. Reading about the trials of the Tohono O'odham is like hearing distressing news about an old friend.

Read the rest of this entry »

Comments off


What language is this?

[Audio clip]

Hint: it's one that you know.

Read the rest of this entry »

Comments (21)

How many spoken languages? How many computer languages?

Jeff Shaumeyer wrote recently on my Facebook wall to report that

In another facebook conversation a friend said "I read that there have now been more programming languages than spoken languages of all time." Is this even remotely possible?

Mike Geis immediately fixed on one problem with the claim, the problem of counting languages, whether you're counting human languages (spoken or signed) or computer languages. While we were contemplating these well-known issues (sources that attempt to put a number on human languages give a range, things like "5,000 to 10,000," and the number of languages listed in the Ethnologue goes up with each edition; the 15th edition has 6,912 entries, but a new edition will be out soon, and it's bound to have more), Jeff posted that

the friend discovered that he dramatically misremembered the result he was paraphrasing.

(whew!) but returned to the original claim, saying,

Just by orders of magnitude I found it incredible that more computer languages/dialects could have been created in the last hundred years than the total of spoken languages/dialects that had ever been.

Here's the second problem: "of all time" in Jeff's first message, "that had ever been" in the second.

Read the rest of this entry »

Comments off