Regional varieties of Cantonese

We have regional varieties of English:  Australian, American (with many subvarieties), Indian (South Asian), and so forth.  Cantonese is spread all around the world, especially in Southeast Asia, so it is not surprising that it has also developed its own regional variants.  In this post, we will concentrate on a comparison of Hong Kong and Malaysian Cantonese.

"Lost in communication:  Just because we speak Cantonese doesn’t mean we can understand each other", by Mandy Li, The Hong Konger (16 October 2024)

Mandy Li remembers the first time she worked with a Malaysian colleague:

In Malaysia, a sizeable portion of the population have Cantonese heritage so can speak the language. They also enjoy watching Cantonese dramas. So, when my colleague learned I was from Hong Kong, she naturally switched to Cantonese when speaking to me. I was astonished to find that I could not understand everything she said.


Psychotic Whisper

Whisper is a widely used speech-to-text system from OpenAI, and it turns out that generative AI's hallucination problem afflicts Whisper to a surprisingly serious extent, as documented by Allison Koenecke, Anna Seo Gyeong Choi, Katelyn X. Mei, Hilke Schellmann, and Mona Sloane, "Careless Whisper: Speech-to-Text Hallucination Harms", in The 2024 ACM Conference on Fairness, Accountability, and Transparency, 2024:

Abstract: Speech-to-text services aim to transcribe input audio as accurately as possible. They increasingly play a role in everyday life, for example in personal voice assistants or in customer-company interactions. We evaluate Open AI’s Whisper, a state-of-the-art automated speech recognition service outperforming industry competitors, as of 2023. While many of Whisper’s transcriptions were highly accurate, we find that roughly 1% of audio transcriptions contained entire hallucinated phrases or sentences which did not exist in any form in the underlying audio. We thematically analyze the Whisper-hallucinated content, finding that 38% of hallucinations include explicit harms such as perpetuating violence, making up inaccurate associations, or implying false authority. We then study why hallucinations occur by observing the disparities in hallucination rates between speakers with aphasia (who have a lowered ability to express themselves using speech and voice) and a control group. We find that hallucinations disproportionately occur for individuals who speak with longer shares of non-vocal durations—a common symptom of aphasia. We call on industry practitioners to ameliorate these language-model-based hallucinations in Whisper, and to raise awareness of potential biases amplified by hallucinations in downstream applications of speech-to-text models.


Among the new phrases…

Today's Tank McNamara:

According to the NFL, a "hip drop tackle" "occurs when a defender wraps up a ball carrier and rotates or swivels his hips, unweighting himself and dropping onto ball carrier’s legs during the tackle". And I would have more or less guessed that meaning, before getting the authoritative definition.

A Sandwich Helix, on the other hand…


Zipf's demon

George Kingsley Zipf is famous for his work on the power-law distribution of word frequencies, which has come to be known as Zipf's Law. And he's also known for the related "Law of Abbreviation", and the hypothesized balance between effort and efficacy.
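
As a rough illustration of the rank-frequency relationship behind Zipf's Law, here is a minimal Python sketch. It is not taken from Zipf's own work: the tokenizer is a simple regular expression, the function names `rank_frequency` and `zipf_prediction` are hypothetical, and the idealized form assumed is that a word's frequency is proportional to 1 / rank^s, with s close to 1.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    # Assumption: lowercase runs of letters and apostrophes count as words.
    words = re.findall(r"[a-z']+", text.lower())
    ranked = Counter(words).most_common()
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ranked, start=1)]


def zipf_prediction(top_count, rank, exponent=1.0):
    """Count expected at a given rank under an idealized power law."""
    return top_count / rank ** exponent


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    top_count = rank_frequency(sample)[0][2]
    for rank, word, count in rank_frequency(sample):
        predicted = zipf_prediction(top_count, rank)
        print(rank, word, count, round(predicted, 2))
```

A sample this small will not follow the power law closely; the comparison between observed and predicted counts only becomes meaningful on a large corpus.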

In his 1945 paper "The repetition of words, time-perspective, and semantic balance", Zipf looks at a different distribution, which is much less famous:

In the present study we shall attempt to show in preliminary outline how the rate of repetition of words in the stream of speech may be useful not only in indicating what we shall presently define as "time-perspective" but also in elucidating what we shall presently refer to as "semantic balance" – two terms of potential significance in the understanding of personality variants.

"Personality variants?" Wait for it…


Store sign in Taiwanese

Sign for a store that just opened in Mark Swofford's neighborhood in Banqiao, New Taipei City:


Trespassed update, part 2 (suicided)

In the first part of this post, we came across the notion of "bèi zìshā 被自殺" ("be suicided").  Since, for many people, this idea (of somebody being "suicided") is hard to comprehend, I asked several graduate students from the PRC if they could explain how it and the related expressions "bèi tiàolóu 被跳楼" ("was jumped off a building"), "bèi shīzōng 被失蹤" ("be disappeared"), and so forth work.  One of them responded thus:

For these expressions, yes, one can say so, but it's not grammatically correct in the "orthodox" language of Mandarin. These expressions are used in a satirical way to accuse the government of héxié 和谐 ("harmonization") of the (ugly) truth being reported. "Tā bèi zìshāle 他被自殺了" ("he has been suicided") means that, although the official / public report claims that the person died of suicide, the truth is that the "suicide" was faked; someone may have murdered him. So he has to appear as if he committed suicide to cover up the ugly deeds by the government. Ditto for "tā bèi tiàolóule 他被跳樓了" ("he was jumped off a building"): his death has no choice but to appear as "owing to tiàolóu 跳楼" ("jumping off a building"), but we all know that this is not what really happened.


A:ñi 'ant wodalt


Birdtalk

As is The New Yorker's wont, this article is long, but it is particularly fascinating, so it is hard to resist quoting many of its more breathtaking revelations:

"How Scientists Started to Decode Birdsong:  Language is said to make us human. What if birds talk, too?"  By Rivka Galchen, The New Yorker (October 14, 2024)

Of course, we've been through the business of animal communication countless times on Language Log, but where this article differs from previous discussions is that it concentrates on content and consciousness rather than vocables and sounds.

On a drizzly day in Grünau im Almtal, Austria, a gaggle of greylag geese shared a peaceful moment on a grassy field near a stream. One goose, named Edes, was preening quietly; others were resting with their beaks pointed tailward, nestled into their feathers. Then a camouflaged speaker that scientists had placed nearby started to play. First came a recorded honk from an unpartnered male goose named Joshua. Edes went on with his preening. Next came a honk that was lower in pitch than the first, with a slight bray. Edes looked up. As the other geese remained tucked in their warm positions, incurious, Edes scanned the field. He had just heard a recorded “distance call” from his life partner, a female goose whom scientists had named Bon Jovi.


Compound intensifier of the week

This is apparently from X in February of 2023, though it can now be found elsewhere:

So is "ass" an intensifier in "super mario level ass geological formation", or has it just been bleached into a formative for turning a phrase into a modifier?


"Knell in the coffin"

From Will Lockett, "We Are Watching The Death of Tesla", Medium 10/18/2024:

This is why the fact that Musk didn’t detail any safety data at ‘We, Robot’ was a knell in the coffin.

Google's AI Overview explains, ignoring the difference in spelling:

The standard metaphorical phrase is "nail in the coffin", of course.


A Sino-Iranian tale of the donkey's Eurasian trail, part 2

The first part of this virtuoso study of the Afro-Eurasian archeolinguistics of the donkey and its concomitant terms in diverse languages across vast expanses of land from East and North Africa to the heartland of East Asia was described in "A Sino-Iranian tale of the donkey's Eurasian trail" (5/10/24).  This post summarizes the second part of the study, which appears here:

Samira Müller, Milad Abedi, Wolfgang Behr, and Patrick Wertmann, "Following the Donkey’s Trail (Part II): a Linguistic and Archaeological Study on the Introduction of Domestic Donkeys to China", International Journal of Eurasian Linguistics, 6 (2) (October 16, 2024), 294-358.

The first two paragraphs of the Abstract were reproduced in the Language Log post cited in the first paragraph above, so there is no need to repeat them here.  Here is the third paragraph of the Abstract, which appears at the head of the just published Part II of the article:


Bizarre English-Japanese language confusion

"Australian man claiming language mix-up jailed over Tokyo break-in", By Himari Semans, The Japan Times (10/18/24)

The behavior of the defendant was so peculiar that, even if he was not intending to rob or injure the old man into whose house he broke, he deserves the 240 days' detention plus 490 days, for a total of two years' jail time, to which he was sentenced today.

The report repeatedly mentioned that the defendant smelled "gasoline".  I wonder if what he was really trying to say was that he smelled "gas".

"gasu ガス" ("gas")

"gasorin ガソリン" ("gasoline")


Frazz on lexical drift

For the past week or so, Jef Mallett's Frazz has been exploring etymology and semantic drift.

The current sequence starts on 10/10 (or maybe earlier):
