Cossack and Kazakh
At dinner the other night, someone asked whether Cossack and Kazakh descend etymologically from the same source. The consensus around the table was "probably yes", but no one really knew anything. A bit of internet research supports that conclusion, though no doubt readers will be able to add depth and nuance.
“Overcoming the one-inch-tall barrier of subtitles”: The Oscars and multilingualism
Below is a guest post by Tihomir Rangelov.
The Korean film Parasite’s landslide success at the Oscars this year has been called "a cultural breakthrough". Was it a linguistic breakthrough as well?
English syllable detection
In "Syllables" (2/24/2020), I showed that a very simple algorithm finds syllables surprisingly accurately, at least in good-quality recordings like a soon-to-be-published corpus of Mandarin Chinese. Commenters asked about languages like Berber and Salish, which are very far from the simple onset+nucleus pattern typical of languages like Chinese, and even about English, which has more complex syllable onsets and codas as well as many patterns where listeners and speakers disagree (or are uncertain) about the syllable count.
I got a few examples of Berber and Salish, courtesy of Rachid Ridouane and Sally Thomason, and may report on them shortly. But it's easy to run the same program on a well-studied and easily available English corpus, namely TIMIT, which contains 6300 sentences, 10 from each of 630 speakers. This is small by modern standards, but plenty large enough for test purposes. So for this morning's Breakfast Experiment™, I tested it.
"Andy's chest"
Notice the button on Andy Warhol's jacket:
Source: The Andy Warhol Catalogue Raisonné, vol. 4 (Paintings and Sculptures Late 1974-1976).
Winnie the Flu
Tweet from Joshua Wong 黃之鋒, Secretary-General of Demosistō:
Here is Winnie The Flu that we call as #WTF
Credit to Yeahman Tse via Legend Bricks LEGO Forum pic.twitter.com/q04K7QfAku
— Joshua Wong 黃之鋒 😷 (@joshuawongcf) February 24, 2020
Syllables
From a physical point of view, syllables reflect the fact that speaking involves oscillatory opening and closing of the vocal tract at a frequency of about 5 Hz, with associated modulation of acoustic amplitude. From an abstract cognitive point of view, each language organizes phonological features into a sort of grammar of syllabic structures, with categories like onsets, nuclei and codas. And it's striking how directly and simply the physical oscillation is related to the units of the abstract syllabic grammar — there's no similarly direct and simple physical interpretation of phonological features and segments.
This direct and simple relationship has a psychological counterpart. Syllables seem to play a central role in child language acquisition, with words following a gradual development from very simple syllable patterns, through closer and closer approximations to adult phonological and phonetic norms. And as Lila Gleitman and Paul Rozin observed in 1973 ("Teaching reading by use of a syllabary", Reading Research Quarterly), "It is suggested on the basis of research in speech perception that syllables are more natural units than phonemes, because they are easily pronounceable in isolation and easy to recognize and to blend."
In 1975, Paul Mermelstein published an algorithm for "Automatic segmentation of speech into syllabic units", based on "assessment of the significance of a loudness minimum to be a potential syllabic boundary from the difference between the convex hull of the loudness function and the loudness function itself." Over the years, I've found that even simpler methods, based on selecting peaks in a smoothed amplitude contour, also work quite well (see e.g. Margaret Fleck and Mark Liberman, "Test of an automatic syllable peak detector", JASA 1982; and slides on Dinka tone alignment from EFL 2015).
In this post, I'll present a simple language-independent syllable detector, and show that it works pretty well. It's not a perfect algorithm or even an especially good one. The point is rather that "syllables" are close enough to being amplitude peaks that the results of a simple-minded, language-independent algorithm are surprisingly good, so that maybe self-supervised adaptation of a more sophisticated algorithm could lead in interesting directions.
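To make the idea of "selecting peaks in a smoothed amplitude contour" concrete, here is a minimal sketch in Python. It is not the program used for the post. The function names, the 10 ms frame size, the 5-frame smoothing window, the relative threshold, and the 100 ms minimum gap between peaks are all illustrative assumptions.

```python
import numpy as np


def amplitude_envelope(signal, sample_rate, frame_ms=10.0):
    """RMS amplitude in consecutive non-overlapping frames (frame size is an assumption)."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000.0))
    n_frames = len(signal) // frame_len
    frames = np.asarray(signal[: n_frames * frame_len], dtype=float)
    frames = frames.reshape(n_frames, frame_len)
    return np.sqrt(np.mean(frames ** 2, axis=1))


def smooth(contour, width=5):
    """Smooth with a normalized Hann-shaped window of `width` frames."""
    window = np.hanning(width + 2)[1:-1]  # drop the zero-valued endpoints
    window = window / window.sum()
    return np.convolve(contour, window, mode="same")


def find_syllable_peaks(contour, frame_ms=10.0, rel_threshold=0.1, min_gap_ms=100.0):
    """Return peak times (seconds) of local maxima in a smoothed contour.

    A frame counts as a candidate if it is a local maximum and at least
    `rel_threshold` times the global maximum. Candidates closer together
    than `min_gap_ms` are thinned by keeping the taller one.
    """
    contour = np.asarray(contour, dtype=float)
    if contour.size < 3:
        return []
    floor = rel_threshold * contour.max()
    interior = np.arange(1, len(contour) - 1)
    is_peak = (
        (contour[interior] > contour[interior - 1])
        & (contour[interior] >= contour[interior + 1])
        & (contour[interior] >= floor)
    )
    candidates = interior[is_peak]
    min_gap = min_gap_ms / frame_ms
    kept = []
    # Greedy selection: tallest candidates first, skipping any too close to a kept peak.
    for idx in sorted(candidates, key=lambda i: contour[i], reverse=True):
        if all(abs(idx - k) >= min_gap for k in kept):
            kept.append(int(idx))
    # Report the center time of each kept frame.
    return sorted((k + 0.5) * frame_ms / 1000.0 for k in kept)


def detect_syllables(signal, sample_rate):
    """Envelope -> smoothing -> peak picking, with the default (assumed) parameters."""
    return find_syllable_peaks(smooth(amplitude_envelope(signal, sample_rate)))


# Synthetic check: a 200 Hz tone with three amplitude bumps standing in for syllables.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
centers = [0.205, 0.505, 0.805]
bumps = sum(np.exp(-((t - c) ** 2) / (2 * 0.04 ** 2)) for c in centers)
peaks = detect_syllables(bumps * np.sin(2 * np.pi * 200 * t), sample_rate)
print(peaks)
```

On real speech the threshold and minimum gap would need tuning, and a detector this crude will miss or split some syllables, which is consistent with the point that the method is simple-minded but works surprisingly well.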
Facebook Guang Guang Guang Guang translate loop
From Jeff DeMarco:
I hit the translation button for this Facebook post and this is what I got!
Chinese coronavirus linguistic war
From a Taiwanese colleague:
In the struggle against Wǔhàn fèiyán 武漢肺炎 ("Wuhan pneumonia"), Taiwan has to fight the war on three fronts: (1) trying to stop the virus at its borders; (2) trying to join the WHO for world-wide collaboration and disease information; and (3) fighting against the Communist Chinese dictatorial linguistic policies. The linguistic policy on disease terminology is really weird; it smacks of George Orwell's 1984.
He cites this article in Chinese and this Facebook page (also in Chinese). Here's another article in Chinese from Taiwan that sticks to "Wuhan pneumonia" despite the pressure from WHO and the PRC government to adopt a name that is not transparent with regard to the origin of the disease.
Preventive Care for Local Languages
February 21st is International Mother Language Day, proclaimed by the General Conference of UNESCO in 1999 and celebrated every year since, aimed at promoting linguistic and cultural diversity and multilingualism. In honor of the day, the following is a guest post by Alissa Stern, the founder of BASAbali, an initiative of “linguists, anthropologists, students, and laypeople, from within and outside of Bali, who are collaborating to keep Balinese strong and sustainable.” BASAbali won a 2019 UNESCO Award for Literacy and a 2018 International Linguapax Award.
We’re told “Don’t wait” to treat our bodies, secure our homes, or maintain our cars. We should do the same for local languages.
Despite all the years of language revitalization, we are still losing about one language every two to three weeks. In this century alone, the number of languages on the planet will be halved. A little preventive care would help.
"Crisis = danger + opportunity" redux
From IAS: Institute for Advanced Study; Report for the Academic Year 2018-2019, p. 8: