Archive for October, 2021

rime-cantonese, a Cantonese lexicon for building keyboards and more

The following is a guest post by Mingfei Lau. A short intro about the author:

My name is Mingfei Lau, and I am a member of The Linguistic Society of Hong Kong Jyutping Workgroup. I am a language engineer at Amazon, and in my spare time I work on various Cantonese resource development projects.


Today, Pinyin is undoubtedly the most popular way to type Mandarin. But what about Cantonese? This wasn’t easy until rime-cantonese, the normalized Cantonese Jyutping[1] lexicon, appeared. Lo and behold, you can now type Cantonese in Jyutping just like typing Mandarin in Pinyin.

Read the rest of this entry »

Comments (4)

Another early Sinitic disyllabic morpheme: "(unopened) lotus blossom"

I take great pleasure in finding morphemes in early Sinitic that are disyllabic, i.e., neither syllable of which means anything by itself, but acquires meaning only in combination with another morpheme to which it is customarily linked.  I have found hundreds of ancient terms composed of such morphemes and have written about many of them on Language Log ("grape", "coral", "lion", "reindeer", "macaque", "earthworm", "spider", "phoenix", "sinuous, winding", "awkward", "knot", "pimple", "balloon lute", "harp", and so on and so forth).

There are two main reasons why I pay particular attention to such disyllabic morphemes:

1. Their numerousness certifies that early Sinitic was not exclusively monosyllabic (a widespread misconception), if we go by its Sinographic form in the latter part of the first millennium BC.

2. Many of these disyllabic morphemes have cognates (i.e., originate) in non-Sinitic languages (e.g., Iranian, Tocharian), which shows that Sinitic language (and culture) did not develop in isolation, but evolved in close association with other languages and cultures.

Read the rest of this entry »

Comments (7)

Massive long-term data storage

News release in EurekAlert, Optica (10/28/21):

"High-speed laser writing method could pack 500 terabytes of data into CD-sized glass disc:  Advances make high-density, 5D optical storage practical for long-term data archiving"

Caption: Researchers developed a new fast and energy-efficient laser-writing method for producing nanostructures in silica glass. They used the method to record 6 GB data in a one-inch silica glass sample. The four squares pictured each measure just 8.8 X 8.8 mm. They also used the laser-writing method to write the university logo and mark on the glass.

Credit: Yuhao Lei and Peter G. Kazansky, University of Southampton

Read the rest of this entry »

Comments (9)

Mixed Mandarin-Taiwanese-Japanese orthography

Comments (4)

Difficult languages and easy languages, part 3

There may well be a dogma out there stating that all languages are equally complex, but I don't believe it, especially not if it has to be "drummed" into our minds.  I have learned many languages.  Some of them are exceedingly hard (because of their complexity) and some of them are relatively easy (because they are comparatively simple).  I have often said that Mandarin is the easiest language I ever learned to speak, but the hardest to read and write in characters (though very easy in Romanization).  And remember these posts:

"Difficult languages and easy languages" (3/4/17)

"Difficult languages and easy languages, part 2" (5/28/19)

Read the rest of this entry »

Comments (33)

"Let's go Brandon!"

ICYMI: Heather Schwedel, "The Story Behind “Let’s Go Brandon,” the Secretly Vulgar Chant Suddenly Beloved by Republicans", Slate 10/22/2021:

On Thursday, Rep. Bill Posey, a Republican from Florida, ended a speech on the House floor with a curious exclamation: “Let’s go, Brandon!”

Let’s go who now?

Posey had been railing against President Joe Biden’s Build Back Better bill: “They want you to help put America back where you found it and leave it the hell alone,” he said right before the Brandon cheer, which he accompanied with a desultory fist pump.

The expression coming from a sitting member of Congress caused a bit of a stir online. Why? Who’s this Brandon character and what does he have to do with building back, or not building back, America? The simple answer is that he’s a race car driver—but it’s a long story, and who Brandon is actually matters less than what the phrase “Let’s go, Brandon!” means. It’s a euphemism—and its direct translation is “Fuck Joe Biden.”

Read the rest of this entry »

Comments (17)

Sino-Japanese aesthetics and a new mode of translation

[This is a guest post by Ashley Liu]

The following is a new way to translate classical Chinese poetry into Japanese. Recently, some Chinese shows about premodern China have become popular in Japan. The Chinese songs in the shows–written in classical Chinese poetry style–are translated into Japanese and sung by Japanese singers. I am fascinated by how the translation works. As you can see below, the Japanese version has waka aesthetics but keeps the 7-syllable format of Chinese poetry. The Japanese version seems to reduce the original meaning by a lot, but if you read it carefully, the way it captures the core meaning is ingenious, e.g., 風中憶當初 (remembering the past in the wind) = 時渡る風 (wind that crosses through time / brings back time).

Read the rest of this entry »

Comments (5)

Ask Language Log: "England's death bowling superhero"?

RW writes "I'm English and have some understanding of cricket, but this one has got me beaten!"

He's referring to a Halloween-y headline: Barney Ronay, "Tymal Mills answers bat signal to be England's death bowling superhero", The Guardian 10/27/2021.

The syntax of that headline is fairly straightforward. And presumably RW can decode the literary reference involved in answering a bat signal, despite the referential overlap between willow-wood blades and mammals of the order Chiroptera. So the puzzle is, what's "death bowling"?

Read the rest of this entry »

Comments (20)

The implications of Chinese for AI development, part 2

With this post, we are already acquainted with Inspur's Yuan 1.0, "one of the most advanced deep learning language models that can generate coherent Chinese texts."  Now, with the present article, we will delve more deeply into the potentials and pitfalls of Inspur's deep learning language model:

"Inspur unveils GPT-3 equivalent for Chinese language", by Wei Sheng, TechNode (1026/21)

The model is trained with 245.7 billion parameters—the number of weights in an artificial neural network, according to the company. This is more than the Elon Musk-backed GPT-3 language model for English, which has 175 billion parameters. Inspur said the Yuan model was trained with 5 terabytes of datasets.

Read the rest of this entry »

Comments (4)

Robotic anaerobic Rodak erotic rotisserie

In yesterday's "Lively Blind Men" post, Ben Zimmer was appropriately amused by Zoom's speech-to-text mis-recognition of Lila Gleitman's name. But as everyone now has opportunities to learn, speech-to-text systems continue to make strange (and often amusing) mistakes in transcribing words and phrases that they haven't been trained to recognize. There are plenty of examples in pretty much any automatic transcription, and the 10/26 edition of the "Spectacular Vernacular" podcast, which Ben co-hosts with Nicole Holliday, doesn't disappoint.

Read the rest of this entry »

Comments (6)

"Linguistician"?

Helen Barrett, "‘Ça plane pour moi’ was a burst of Belgian punk with a dark twin", Financial Times 6/1/2020 [emphasis added]:

Meanwhile, the perennially lucrative “Ça plane pour moi” may not be all that it seems. Bertrand mimed it in TV studios, but whose is the bratty voice on the record?

It is a question that has been the subject of several court cases. Bertrand initially insisted it was him, then changed his story, telling a newspaper in 2010 that he did not sing on the track, despite being credited. During a court case that same year over royalties, a Belgian judge commissioned a linguistician to examine the original. Expert evidence suggested the true vocalist was of northern French origin. Deprijck, who has claimed to be the real vocalist, is from northern France.

Read the rest of this entry »

Comments (22)

Lively Blind Men

Last weekend, there was a memorial service at Penn for Lila Gleitman, who passed away in August. The hundreds of people physically present were joined by a large crowd on Zoom, where the automatic closed captioning was turned on. And so the audience got to see a large sample of speech-to-text versions of Lila's name, of which this was my favorite:


Read the rest of this entry »

Comments (2)

The implications of Chinese for AI development

New article in EnterpriseAI (October 21, 2021):

"Language Model Training Gets Another Player: Inspur AI Research Unveils Yuan 1.0",  by Todd R. Weiss

From Pranav Mulgund:

This article introduces an interesting new advance in an artificial intelligence (AI) model for Chinese. As you probably know, Chinese has been long held as one of the hardest languages for AI to crack. Baidu and Google have both been trying for a long time, but have had a lot of difficulty given the complexity of the language. But the company Inspur just came out with a model called Yuan 1.0 that shows significant advances from previous companies' AIs.

Read the rest of this entry »

Comments (5)