Archive for July, 2024

Deutsche Zungenbrecher

"Some German tongue-twisters", posted on 21/07/2024 by StephenJones.blog   

Whereas the mind-boggling “tapeworm words” in my post on Some German mouthfuls are of a practical nature, the realm of fantasy opens up whole new linguistic vistas. In a stimulating article, Deborah Cole introduces the work of the Berlin-based cabaret performer, playwright, and pianist Bodo Wartke.

Read the rest of this entry »

Comments (3)

Reading Old Turkic runiform inscriptions with the aid of 3D simulation

"Augmenting parametric data synthesis with 3D simulation for OCR on Old Turkic runiform inscriptions: A case study of the Kül Tegin inscription", Mehmet Oğuz Derin and Erdem Uçar, Journal of Old Turkic Studies (7/21/24)

Abstract

Optical character recognition for historical scripts like Old Turkic runiform script poses significant challenges due to the need for abundant annotated data and varying writing styles, materials, and degradations. The paper proposes a novel data synthesis pipeline that augments parametric generation with 3D rendering to build realistic and diverse training data for Old Turkic runiform script grapheme classification. Our approach synthesizes distance field variations of graphemes, applies parametric randomization, and renders them in simulated 3D scenes with varying textures, lighting, and environments. We train a Vision Transformer model on the synthesized data and evaluate its performance on the Kül Tegin inscription photographs. Experimental results demonstrate the effectiveness of our approach, with the model achieving high accuracy without seeing any real-world data during training. We finally discuss avenues for future research. Our work provides a promising direction to overcome data scarcity in Old Turkic runiform script.

Read the rest of this entry »

Comments (1)

Government dampers on AI in the PRC, part 2

"China deploys censors to create socialist AI:  Large language models are being tested by officials to ensure their systems ‘embody core socialist values’", by Ryan McMorrow and Tina Hu in Beijing, Financial Times (July 17 2024)

Chinese government officials are testing artificial intelligence companies’ large language models to ensure their systems “embody core socialist values”, in the latest expansion of the country’s censorship regime.

The Cyberspace Administration of China (CAC), a powerful internet overseer, has forced large tech companies and AI start-ups including ByteDance, Alibaba, Moonshot and 01.AI to take part in a mandatory government review of their AI models, according to multiple people involved in the process.

The effort involves batch-testing an LLM’s responses to a litany of questions, according to those with knowledge of the process, with many of them related to China’s political sensitivities and its President Xi Jinping.

Read the rest of this entry »

Comments (9)

New horizons in word sense analysis

Today's xkcd:

Mouseover title: "IMO the thymus is one of the coolest organs and we should really use it in metaphors more."

Read the rest of this entry »

Comments (17)

Topolect: a Four-Body Problem

From Jeff DeMarco:

The fanfic fourth book in the sāntǐ 三体 ("three-body [problem]") series, translated by Ken Liu, has the following sentence:

Women dressed in flowing silk dresses oared elegant barges over the placid waterways, singing folk ditties in the gentle, refined accents of the Wu topolect …

Read the rest of this entry »

Comments (23)

Little Italian girl talking with her hands

Comments (9)

China VPN redux

Chapter 1

A professor in China who is collaborating with a famous American professor of Chinese literature wanted to read one of my Language Log (LL) posts because he had heard that it's being widely discussed around the world.  However, because of China's rigid censorship rules, he couldn't open the LL post.

The Chinese professor asked the American professor to help him gain access to my post.

The American professor asked me to help the Chinese professor.

I suggested that the Chinese professor use a VPN.  Without a VPN, people in China are not able to access LL, Wikipedia, Wiktionary, Google, X, etc., etc.  In other words, without a VPN, they are cut off from most of the information on the internet that is outside the Great Firewall, i.e., most of the cutting-edge, valuable information in the world.

The Catch-22 is that it is a crime to use a VPN in China.

Can you imagine having to live in a benighted place like the PRC?

Read the rest of this entry »

Comments (5)

No "good morning" and "good afternoon" in Romance Languages?

From François Lang:

I hope this isn't a well-known question. I searched LL for
"good morning" romance
and found nothing. So here goes.
 
(1) One can say "good evening" idiomatically in Romance languages, but not "good morning" or "good afternoon".
(2) However, all three are idiomatic in Germanic languages. 
 
I'm wondering if LL readers concur, and, if so, have any explanations of these two points.

Read the rest of this entry »

Comments (61)

Government dampers on AI in the PRC

"China Puts Power of State Behind AI—and Risks Strangling It:  Government support helps China’s generative AI companies gain ground on U.S. competitors, but political controls threaten to weigh them down", by Lia Lin, WSJ (7/16/24)

Most generative AI models in China need to obtain the approval of the Cyberspace Administration of China before being released to the public. The internet regulator requires companies to prepare between 20,000 and 70,000 questions designed to test whether the models produce safe answers, according to people familiar with the matter. Companies must also submit a data set of 5,000 to 10,000 questions that the model will decline to answer, roughly half of which relate to political ideology and criticism of the Communist Party.

Generative AI operators have to halt services to users who ask improper questions three consecutive times or five times total in a single day.
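
To make the arithmetic of that last rule concrete, here is a minimal sketch of how "three consecutive times or five times total in a single day" might be tracked for one user. Everything in it is an assumption for illustration: the names `DailyQuestionTracker` and `should_halt` are hypothetical, the article does not say how a question is judged "improper" (the code simply takes a True/False flag per question), and the daily reset is not modeled.

```python
from dataclasses import dataclass

# Thresholds taken from the quoted rule; everything else is illustrative.
MAX_CONSECUTIVE = 3  # improper questions in a row
MAX_DAILY_TOTAL = 5  # improper questions in one day


@dataclass
class DailyQuestionTracker:
    """Hypothetical per-user, per-day counter (not any real operator's code)."""

    consecutive: int = 0
    total: int = 0
    halted: bool = False

    def record(self, improper: bool) -> bool:
        """Record one question and return True if service should be halted."""
        if self.halted:
            return True
        if improper:
            self.consecutive += 1
            self.total += 1
        else:
            # A proper question breaks the consecutive run but not the daily total.
            self.consecutive = 0
        if self.consecutive >= MAX_CONSECUTIVE or self.total >= MAX_DAILY_TOTAL:
            self.halted = True
        return self.halted


def should_halt(flags):
    """Replay one day's questions (True = improper) and report the outcome."""
    tracker = DailyQuestionTracker()
    for improper in flags:
        tracker.record(improper)
    return tracker.halted
```

Under these assumptions, a user who alternates improper and acceptable questions never trips the consecutive limit, but is still cut off at the fifth improper question of the day.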

Read the rest of this entry »

Comments (6)

The true identity of the first Chinese translator of Lady Chatterley's Lover

There has long been a suspicion that the first Chinese translator of Lady Chatterley's Lover (1928/1932), Ráo Shùyī 饒述一, about whom next to nothing is known, was actually the scholar and theoretician of aesthetics, Zhū Guāngqián 朱光潛 (1897-1986).

To give a little bit of background about the nature of the two translations of the novel, here is the abstract of a recent scholarly article comparing them:

This article discusses how sex-related content is rendered in two Chinese translations of D. H. Lawrence's Lady Chatterley's Lover: Rao Shuyi (1936) and Zhao Susu (2004). It is found that Rao's translation features explicitness, flexibility and Europeanization, while Zhao's translation features conservativeness and domestication. And the observed features in the two translations regarding sex-related content are explained from perspectives of social and historical background, translation purpose and intended readership, and patronage.

Index Terms: Lady Chatterley's Lover, translation, sexuality

Zhu, Kun. "The Translation of Sex-related Content in Lady Chatterley's Lover in China." Theory and Practice in Language Studies, vol. 10, no. 8, Aug. 2020, pp. 933+. Gale Literature Resource Center.

Read the rest of this entry »

Comments (1)

"Fisherman Croc's desert song"?

Shannon McDonagh, "'What the Hell Is This?': Crocodile-Like Fossil Rewrites Triassic History", Newsweek 7/11/2024:

The groundbreaking discovery of the Benggwigwishingasuchus eremicarminis reveals the presence of waterside crocodile-like creatures around the globe during the Middle Triassic.

Broadly known as pseudosuchian archosaurs—four-legged, carnivorous beings with an armadillo-like coating—these creatures are now known to have existed coastally between 247.2 million and 237 million years ago.

Read the rest of this entry »

Comments (7)

Japanese expressions for some paranormal phenomena

"Telepathy (以心伝心) and Other Coincidences (奇遇)", by jakeadelstein, Japan Subculture Research Center: A guide to the Japanese underworld, Japanese pop-culture, yakuza and everything dark under the sun (Jul 10, 2024)

A generous helping of creepiness from Japan.  Here goes:

I was writing to a former intern at Japan Subculture Research Center, Fresca, and asked her to send me her thesis to read—just as she mailed me. I think I was two seconds ahead of her. It was a remarkable coincidence or maybe telepathy. Which got me interested in the many words for the complementary subjects in Japanese. So for your entertainment—here you are.

Read the rest of this entry »

Comments off

IRL reverse dictionary

… or maybe I should say "associative memory"? Or whatever we should call the emerging modes of interaction with Meta Ray-Bans? Anyhow, here's a recently re-published Girls With Slingshots comic (original in 2008):

Read the rest of this entry »

Comments (16)