The implications of Chinese for AI development, part 2

From an earlier post, we are already acquainted with Inspur's Yuan 1.0, "one of the most advanced deep learning language models that can generate coherent Chinese texts."  Now, with the present article, we will delve more deeply into the potential and pitfalls of Inspur's deep learning language model:

"Inspur unveils GPT-3 equivalent for Chinese language", by Wei Sheng, TechNode (10/26/21)

The model is trained with 245.7 billion parameters—the number of weights in an artificial neural network, according to the company. This is more than the Elon Musk-backed GPT-3 language model for English, which has 175 billion parameters. Inspur said the Yuan model was trained with 5 terabytes of datasets.



Robotic anaerobic Rodak erotic rotisserie

In yesterday's "Lively Blind Men" post, Ben Zimmer was appropriately amused by Zoom's speech-to-text mis-recognition of Lila Gleitman's name. But as everyone now has opportunities to learn, speech-to-text systems continue to make strange (and often amusing) mistakes in transcribing words and phrases that they haven't been trained to recognize. There are plenty of examples in pretty much any automatic transcription, and the 10/26 edition of the "Spectacular Vernacular" podcast, which Ben co-hosts with Nicole Holliday, doesn't disappoint.



"Linguistician"?

Helen Barrett, "‘Ça plane pour moi’ was a burst of Belgian punk with a dark twin", Financial Times 6/1/2020 [emphasis added]:

Meanwhile, the perennially lucrative “Ça plane pour moi” may not be all that it seems. Bertrand mimed it in TV studios, but whose is the bratty voice on the record?

It is a question that has been the subject of several court cases. Bertrand initially insisted it was him, then changed his story, telling a newspaper in 2010 that he did not sing on the track, despite being credited. During a court case that same year over royalties, a Belgian judge commissioned a linguistician to examine the original. Expert evidence suggested the true vocalist was of northern French origin. Deprijck, who has claimed to be the real vocalist, is from northern France.



Lively Blind Men

Last weekend, there was a memorial service at Penn for Lila Gleitman, who passed away in August. The hundreds of people physically present were joined by a large crowd on Zoom, where the automatic closed captioning was turned on. And so the audience got to see a large sample of speech-to-text versions of Lila's name, of which this was my favorite:




The implications of Chinese for AI development

New article in EnterpriseAI (October 21, 2021):

"Language Model Training Gets Another Player: Inspur AI Research Unveils Yuan 1.0", by Todd R. Weiss

From Pranav Mulgund:

This article introduces an interesting new advance in an artificial intelligence (AI) model for Chinese. As you probably know, Chinese has been long held as one of the hardest languages for AI to crack. Baidu and Google have both been trying for a long time, but have had a lot of difficulty given the complexity of the language. But the company Inspur just came out with a model called Yuan 1.0 that shows significant advances from previous companies' AIs.



Writing Mandarin phrases with Roman letter acronyms

Since the vast majority of inputting in the PRC is done via Hanyu Pinyin, netizens are thoroughly familiar with the alphabet and use it regularly as part of the Chinese writing system.

One common use of the alphabet in the PRC is to designate frequently encountered Mandarin phrases by acronym.  In "The Chinese Internet Slang You Need to Know in 2021", CLI (10/19/21), Anias Stambolis-D'Agostino introduces six popular online acronyms:

1. yyds (永远的神)

永远的神 (yǒngyuǎn de shén; yyds) means “eternal God” and describes an outstanding person or thing. It's similar to the saying GOAT (Greatest of All Time) in English. The phrase is often used by fans to praise their idols or simply to describe something they are fond of.

For example:

    • 桂林米粉太好吃了,桂林米粉就是yyds!
    • Guìlín mǐfěn tài hǎochī le, Guìlín mǐfěn jiùshì yyds.
    • Guilin rice noodles are delicious, they’re just yyds!

Here's another example:

    • 李小龙的中国功夫太厉害了,他就是yyds!
    • Lǐ Xiǎolóng de Zhōngguó gōngfū tài lìhài le, tā jiùshì yyds.
    • Bruce Lee’s kung fu skills are so good, he’s such a yyds!



Confucius didn't mean that

We often encounter fake "Oriental wisdom" that purports to come from the ancient sages.  So much of it clogs the internet that it is very hard to keep track of what is genuine and what is false.  And then there's the (in)famous pseudo-linguistics of the "Crisis = danger + opportunity" trope which has captured the occidental imagination.

Another type of distortion and misinformation concerning Chinese thought consists of actual quotations of an ancient sage's words that are misused and misinterpreted to imply something other than what was originally intended.

In "Things Confucius Never Said", The World of Chinese (10/9/21), Sun Jiahui has assembled a group of five such abused quotations attributed to Confucius.  Since she has done such a superb job of presenting them, I will make only minor adaptations in giving them here.



Your Pinky Heart

Phenomenally viral song by the Malaysian hip-hop artist, Namewee, "It might Break Your Pinky Heart. Namewee 黃明志 Ft.Kimberley Chen 陳芳語【Fragile 玻璃心】@鬼才做音樂 2021 Ghosician" — premiered on 10/15/21, and it already has nearly 9,000,000 views:



Dubbing and subtitles

From an anonymous correspondent:

G and I have always enjoyed foreign films, but only if they're subtitled. We shy away from films that are dubbed into English. The dubbing clearly adds another layer of clumsy artifice that stops me from entering into the film.

The Italians, and I believe most Europeans, prefer dubbing when they're watching foreign films. Their voice actors are a highly-paid group. A few years ago, when the Italian dubbers went on strike, no new foreign (i.e., American, British, French, etc.) films were released for months, maybe years.



Speak not: dying languages

In Asian Review of Books (10/20/21), Peter Gordon reviews James Griffiths' Speak Not: Empire, Identity and the Politics of Language (Bloomsbury, October 2021).  Although the book touches upon many other languages, its main focus is on Welsh, Hawaiian, and Cantonese.

That Speak Not is more politics than linguistics is telegraphed by the title. For Griffiths, language is the single most important aspect of group identity, both as marker and glue: that what makes the Welsh Welsh or Hawaiians Hawaiian is primarily the language, rather than lineage, culture, belief systems or lifestyles. While some might debate this, governments have all too often taken aim at minority languages with precisely this rationale in the name of national unity.



Judo: martial arts neologism or ancient philosophical term?

Judo, the sport / martial art ("as a physical, mental, and moral pedagogy" [source]), was created only in 1882 by Jigoro Kano 嘉納治五郎 (1860-1938).  What I find amazing is that the term jūdō / MSM róudào 柔道 ("soft / flexible / gentle / supple / mild / yielding way") comes right out of the Yìjīng 易經 (Book / Classic of Change[s]).  Of course, traditional Japanese scholars have always been learned in the Chinese classics, so it shouldn't be too surprising that they would draw on the classics for terminology and ideas that had great meaning for them.  But I'm curious whether Jigoro Kano explicitly referred to the Yìjīng in any of his writings about jūdō 柔道.



All-purpose word for "glamorous woman"

The PRC authorities have always policed human behavior and thought, but during the last half year or so, and particularly toward young people, they have for whatever reason been coming on like gangbusters even more than usual.

First they went after the phenomenon of tǎngpíng 躺平 ("lying flat"), i.e., those who chose to opt out of the cutthroat rat race.  Then they chastised niángpào 娘炮 ("effeminate men"), i.e., girlie boys and men.  The social minders even drew a bead on jīngfēn 精芬, a label for socially averse millennials who identified themselves as "spiritually Finnish".  These were serious efforts to squelch such unwanted tendencies in the population.  Now they have taken quite a different turn and are aiming at an altogether different target:  beautiful women.  Strange to say, they are coupling this campaign against female pulchritude with a crusade against Buddhism.

Well, it's not that strange after all, since communism has never been fond of religion, and Buddhism has often been persecuted by Chinese regimes, almost from the time of its arrival in the Middle Kingdom nearly two millennia ago.  Even the combination of feminine beauty and Buddhism reveals a certain psychological fixation on the baldness and celibacy of nuns in traditional Chinese society.



Oont ze knakkers
