What's (still) wrong with text-to-speech?

Text-to-speech technology has improved enormously over the decades, but there's still some headroom, as a friend recently underlined for me. He observes that when The Economist magazine first publishes a piece online, it appears with AI-read audio, and then later with a human-read version:

The rhythm/prosody/pitch (I'm not exactly sure which – all three?) is the same in nearly every sentence and even clause. This high-then-falling pattern is fine in one sentence, but repeated 50 times in a row is awful.

Later, those pieces that make it into the print edition get their own, human-read version. So voilà, you have a perfect before-and-after.



The Chinese Computer: Competition or Cooperation?

book review by
David Moser
Beijing Capital Normal University

Thomas Mullaney’s The Chinese Computer is a fascinating account of the decades-long effort by linguists, computer scientists and engineers to incorporate Chinese characters into the digital age. Drawing on a vast body of historical and scientific sources, the book offers the reader a lively account of the formidable technical challenges involved in creating practical and intuitive input methods for one of the world’s most complex writing systems. The reader will come away with an increased awareness of the contributions that Chinese computing brought to modern computer science.

Chinese scholars and sinologists working in the 1980s and 90s will recall the early generations of Chinese word processors—slow, unreliable, and crash-prone—when every incremental gain in speed or compatibility felt like a small miracle. Thanks to the ingenuity and innovation of computer input developers, today anyone on the planet can create Chinese texts using an impressive ecosystem of powerful and user-friendly tools.



Planes, patches, pilots, and propaganda

Air Force billboard in Shijiazhuang, Hebei Province, China:


Courtesy of The Great Translation Movement (TGTM) — here.



Hanging a trans flag from El Capitan

François Lang says:

This WSJ headline garden-pathed me; I got the correct parse only on the third try!

Federal Worker Fired After Hanging Trans Flag at Yosemite Sues Government

Former Park Service employee claims free speech violations after organizing climbers for display at ‘El Cap’

By Allison Pohle, WSJ (2/23/26)



Crazy characters

Taken outside a hotel in Shenzhen:



Unifying Arabic topolects through AI

Meet Habibi – the Chinese AI uniting 20 Arabic dialects in a Middle East first
Lead author says there are many differences between Arabic dialects and Modern Standard Arabic, which is used in official circumstances
Zhao Ziwen, SCMP, 28 Feb 2026

The paper that presents this new model is called “Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis”. It was published last month on arXiv, an open-access repository that is not peer-reviewed. I will be interested to hear what Language Log readers think of its prospects.



Tariffs

With all the recent news about tariffs, I wondered where the word came from. So I consulted the OED:

< Italian tariffa ‘arithmetike or casting of accounts’ (Florio), ‘a book of rates for duties’ (Baretti), = Spanish tarifa, Portuguese tarifa, < Arabic taʿrīf notification, explanation, definition, article, < ʿarafa in 1st conj. to notify, make known. So French tarif.



Washington State Spanish

"Callers to Washington state hotline press 2 for Spanish and get accented AI English instead", AP News 2/27/2026:

For months, callers to the Washington state Department of Licensing who have requested automated service in Spanish have instead heard an AI voice speaking English in a strong Spanish accent.

A recording:



Spacing in Korean

The role of a Scotsman, John Ross (1842-1915), in creating it. Although he was a Christian missionary who spent over half his life in China, he was apparently a gigachad.

The following video is densely packed with solid information and moves rapidly, so you have to pay close attention to follow it.



Rampant plagiarism in the Chinese literary world

"It cannot read the human heart" by Yan Ge (b. 1984), London Review of Books Blog (2/20/26)



Saving Sámi

"How toddlers in Finland are saving an endangered Sámi language"
by Erika Benke, BBC (5 days ago)

Special nurseries are helping the Sámi people in Finland to bring their almost-lost language back from the brink of extinction.

When I stayed in the Arctic Circle to finish writing The True History of Tea with Erling Hoh, I was amazed by the symbiotic relationship the Sámi there had with their vast herds of reindeer. And, yes, they do ride them, which someone was asking about here recently.



The full name of Bangkok

@kattoksthai

Replying to @Mamba Did you know that Bangkok has the longest city name in the world? I dare you to say it too! #bangkok #thailand #thai




"Written Cantonese must have word segmentation"

That's the title of an essay that appeared in my e-mail today from an outfit called Cantonese Script Reform 粵字改革. Here's what they say:

Written Cantonese must have spaces, like Korean. The calligraphic issue must give way. For the space itself is a grammatical marker that marks the beginning and the end of a word. This tool of demarcation will allow poet and playwright to invent new words by putting words together within the confinements delineated by the spaces between words. Written Cantonese needs all the tools imaginable for it to revitalise and resurrect its lost vocabulary. A Hebrew-esque recycling off ancient words for purposes anew is the way to go. But we can’t do that if we can’t tell if this is a new word because we can’t tell if these characters familiar so and so sequenced are merely a fanciful poetic playful arrangement or other mark of the invention of a new word, where a familiar noun is turned into a verb or verb is turned into an adjective or an adjective is now henceforth interpreted as a noun in this particular context.
