Language Log

A curse from the novel coronavirus epicenter

April 21, 2020 @ 3:23 pm · Filed by Victor Mair under Swear words, Topolects

The whole world is now thoroughly familiar with the name "Wuhan", whereas four months ago, only a small number of people outside of China would have heard of it. Since, two days ago, I posted about Dutch curses, many of which just so happen to be linked to diseases, I am prompted to dust off an old post that is about a colorful curse from Wuhan, which, by the way, is famous among all Chinese cities for the proclivity of its inhabitants to indulge in sharp-tongued imprecations at the slightest provocation. I myself have been witness to their talent in this art, at which the women are especially adept.

Read the rest of this entry »


"Crisis = danger + opportunity" in America and in PRC official media

April 21, 2020 @ 6:40 am · Filed by Victor Mair under Errors, Etymology, Language and politics, Memes, Tropes

From Gillian Hochmuth:

Thank you for your great explanation of the reasons behind the famous Kennedy "crisis" misquote. When I was in high school, I had a friend who was Chinese and spoke Mandarin fluently, who explained it to my US History class after the teacher quoted Kennedy. That was over 20 years ago and I remembered that his quote was wrong, but could not remember the explanation I was given well enough to explain it to someone else.

Read the rest of this entry »


Chinese transcriptions of Indic terms in Buddhist translations of the 2nd c. AD

April 20, 2020 @ 7:43 am · Filed by Victor Mair under Borrowing, Historical linguistics, Language and computers, Phonetics and phonology, Reconstructions

A fuller and more specific version of the title of this post would be "Chinese transcriptions of Indic terms in the translations of An Shigao (Chinese: 安世高; pinyin: Ān Shìgāo; Wade–Giles: An Shih-kao, Korean: An Sego, Japanese: An Seikō, Vietnamese: An Thế Cao) (fl. 148-180 CE) and Lokakṣema (लोकक्षेम, Chinese: 支婁迦讖; pinyin: Zhī Lóujiāchèn) (fl. 147-189)".

With the collaboration of Jan Nattier, Nathan Hill was able to digitize some data from Han Buddhist transcriptions back in 2017 and has now published them as a dataset on Zenodo:

Hill, Nathan, Nattier, Jan, Granger, Kelsey, & Kollmeier, Florian. (2020). Chinese transcriptions of Indic terms in the translations of Ān Shìgāo 安世高 and Lokakṣema 支婁迦讖 [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3757095

Read the rest of this entry »


Dutch curses

April 19, 2020 @ 7:49 am · Filed by Victor Mair under Language and medicine, Swear words

An article in The Economist has two titles in different editions, both datelined March 26, 2020 Amsterdam:

Typhus off!
"Why Dutch swear words are so poxy
English insults often refer to sex; Dutch ones, to disease"
Swearing
"Dutch disease
A country where sicknesses are curses"

The content is the same:

Read the rest of this entry »


New word of the week

April 19, 2020 @ 6:00 am · Filed by Mark Liberman under Linguistics in the comics

From today's SMBC:

Read the rest of this entry »


No music on Twitter?

April 18, 2020 @ 12:09 pm · Filed by Mark Liberman under WTF

David Brooks is working hard to maintain his reputation for always being wrong about things that are easy to check:

If you lived your life on Twitter you would never know music existed.

— David Brooks (@nytdavidbrooks) April 18, 2020

Read the rest of this entry »


Lexical display rates in novels

April 18, 2020 @ 11:46 am · Filed by Mark Liberman under Computational linguistics

In some on-going research on linguistic features relating to clinical diagnosis and tracking, we've been looking at "lexical diversity". It's easy to measure the rate of vocabulary display: you can just use a type-token graph, which shows the count of distinct words ("types") against the count of total words ("tokens"). It's less obvious how to turn such a curve into a single number that can be compared across sources. For a survey of some alternative measures, see e.g. Scott Jarvis, "Short texts, best-fitting curves and new measures of lexical diversity", Language Testing 2002; and for the measure that we've settled on, see Michael Covington and Joe McFall, "Cutting the Gordian knot: The moving-average type–token ratio (MATTR)", Journal of Quantitative Linguistics 2010. More on that later.

For now, I want to make a point that depends only on type-token graphs. Over time, I've accumulated a small private digital corpus of more than 100 English-language fiction titles, from Tristram Shandy forward to 2019. It's clear that different authors have different characteristic rates of vocabulary display, and for today's post, I want to present the authors in my collection with the highest and lowest characteristic rates.
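For readers who want to try this at home, here is a minimal Python sketch of the two quantities mentioned above: the type-token curve and a moving-average type-token ratio. It is not the code behind the analysis in this post. The function names, the whitespace tokenization, the window size, and the fallback for texts shorter than the window are all assumptions made for illustration.

```python
from collections import Counter


def type_token_curve(tokens):
    """Return (token_count, type_count) pairs, one per token read so far."""
    seen = set()
    curve = []
    for n_tokens, tok in enumerate(tokens, start=1):
        seen.add(tok)
        curve.append((n_tokens, len(seen)))
    return curve


def mattr(tokens, window=500):
    """Mean type-token ratio over every sliding window of `window` tokens.

    Assumption: for a text shorter than the window, fall back to the
    plain type-token ratio of the whole text.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not tokens:
        return 0.0
    if len(tokens) < window:
        return len(set(tokens)) / len(tokens)

    # Count the word types in the first window, then slide one token at a time.
    counts = Counter(tokens[:window])
    total = len(counts) / window
    n_windows = 1
    for i in range(window, len(tokens)):
        leaving = tokens[i - window]
        counts[leaving] -= 1
        if counts[leaving] == 0:
            del counts[leaving]  # this type no longer occurs in the window
        counts[tokens[i]] += 1
        total += len(counts) / window
        n_windows += 1
    return total / n_windows


# Hypothetical toy input; real use would need a proper tokenizer.
sample = "the cat sat on the mat the end".lower().split()
curve = type_token_curve(sample)
score = mattr(sample, window=4)
```

Plotting the second element of each pair in `curve` against the first gives the type-token graph. The window size matters a great deal for the single-number summary, so comparisons across sources only make sense when the same window is used throughout.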

Read the rest of this entry »


The impact of COVID-19 on Russian

April 18, 2020 @ 7:05 am · Filed by Victor Mair under Language and medicine, Words words words

Yevgeny Basovskaya, a specialist on public speech at Moscow’s State University of the Humanities, says that the disease has had a "radical" influence on the way Russians speak their language. This begins with the word coronavirus, which has an "a" in the middle. This is "in complete violation of Russian orthographic rules".

Paul Goble, "Coronavirus has Radically Affected the Language Russians Speak, Basovskaya Says", Window on Eurasia — New Series (4/17/20)

It should by rights be an “o” but it isn’t and so feels alien for that reason alone


Read the rest of this entry »


Chinese: what do you hear?

April 17, 2020 @ 1:23 pm · Filed by Victor Mair under Intelligibility, Language and the movies, Tones, Topolects

[This is a guest post by Jonathan Smith]

Here's an audio passage from a film I've been watching:

If you know Chinese, test yourself to see how much of it you understand.

Read the rest of this entry »


"Racist dog whistling"

April 15, 2020 @ 3:44 pm · Filed by Victor Mair under Idioms, Language and the media

News brief on the (Australian) ABC website:

Read the rest of this entry »


Cumulative punctuation

April 15, 2020 @ 8:47 am · Filed by Mark Liberman under Linguistics in the comics

A recent SMBC:

Read the rest of this entry »


"Robust Contact-Rich Manipulation by Controlled Compliance"

April 15, 2020 @ 7:15 am · Filed by Mark Liberman under Ambiguity

Every day, I get several talk announcements from the various mailing lists that I subscribe to, which represent a rich array of disciplinary sources: linguistics, computer science, anthropology, sociology, communications, math, literary studies, marketing, and so on. Usually I can figure out from the title what the presentation is going to be about — but sometimes my first guess is wrong in an interesting way.

Read the rest of this entry »


The historical phonology of "Han", the main Chinese ethnonym

April 14, 2020 @ 7:17 am · Filed by Victor Mair under Etymology, Historical linguistics, Names, Phonetics and phonology

[VHM: This is a guest post by Chris Button. It will be primarily of interest to specialists in the phonological history of Sinitic. Since there are quite a few such scholars on Language Log, I expect that it will occasion the usual lively debate that follows posts on such subjects. It will also undoubtedly be of interest to historical phonologists in general, as well as to a broad spectrum of Sinologists and their colleagues focusing on other Asian cultures and languages.]

I've been thinking about the etymological associations of Hàn 漢. It's often reconstructed with an aspirated coronal nasal as *hn-, in spite of the Middle Chinese x- then being somewhat unexpected (Baxter and Sagart put it down to dialects), largely on the basis of the *n- in 難. But its etymological association with 艱 and its velar *k- make this problematic. A regular source of MC x- would be *hŋ- which then at least would be a velar onset to parallel *k-. The *n- in 難 could perhaps be put down to some sort of assimilation of *ŋ- with the *-n coda (one might compare 般 *pán < *pám where there is dissimilation of the coda unlike in its phonetic 凡 *bàm). At the very least, 漢 most likely went back to something like *hŋáns and then *xáns with a velar onset and the -s eventually becoming qu-sheng. An alternative option is rhinoglottophilia whereby a *ʔ became *n- as attested in cases like 憂 *ʔə̀w and 獶(夒) *nə́w as I mentioned here.

Read the rest of this entry »

