Alan Turing's revenge?

Ilia Shumailov et al., "The Curse of Recursion: Training on Generated Data Makes Models Forget", 5/31/2023:

What will happen to GPT-{n} once LLMs contribute much of the language found online? We find that use of model-generated content in training causes irreversible defects in the resulting models, where tails of the original content distribution disappear. We refer to this effect as Model Collapse and show that it can occur in Variational Autoencoders, Gaussian Mixture Models and LLMs.
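The tail-loss effect described in the abstract can be sketched with a toy simulation. This is not the paper's experimental setup, just a minimal illustration under assumed settings: a single Gaussian is refit, generation after generation, to a small sample drawn from the previous generation's fit. The function name `refit_generations` and the parameters (10 samples, 200 generations, seed 0) are illustrative choices, not anything taken from the paper.

```python
import random
import statistics

def refit_generations(n_samples=10, generations=200, seed=0):
    """Repeatedly fit a Gaussian to samples drawn from the previous fit.

    Returns the fitted standard deviation after each generation,
    starting with the original distribution's value.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0: the "real" data distribution
    history = [sigma]
    for _ in range(generations):
        # "Train" only on data generated by the previous model.
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)  # maximum-likelihood estimate
        history.append(sigma)
    return history

history = refit_generations()
print(f"initial std: {history[0]:.3f}, final std: {history[-1]:.3g}")
```

With small samples, each refit tends to underestimate the spread a little, and the errors compound, so the fitted standard deviation drifts toward zero and the tails of the original distribution are no longer represented.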



It's impossible to detect LLM-created text

Last year, I expressed considerable skepticism about the prospects for accurate detection of text generated by Large Language Models ("Detecting LLM-created essays?", 12/20/2022). Since then, many new systems claiming to detect LLM outputs have emerged, notably Turnitin's "AI writing detector".

In a recent post on AI Weirdness ("Don't use AI detectors for anything important", 6/30/2023), Janelle Shane presents multiple examples of multiple kinds of failure, and explains why things are not likely to change.



My garden path of the day

"Alligator Kills 69-Year-Old Woman in South Carolina", NYT 7/4/2023:

A 69-year-old woman was attacked and killed by an alligator on Tuesday as she was walking her dog in her neighborhood in Hilton Head Island, S.C., the authorities said.

The Beaufort County Sheriff’s Office said it was the second fatal alligator attack in the county in less than a year. […]

Jay Butfiloski, the furbearer and alligator program coordinator with the state’s Natural Resources Department, could not be reached on Tuesday.



"Communism" in Korean

As I have demonstrated here, communism is still very much a thing in North Korea, and apparently increasingly so under the leadership of Kim Jong Un.

Now, the word for "communism" in the Korean of South Korea is gongsanjuui 공산주의 (共産主義), which simply adopts the Chinese gòngchǎn zhǔyì 共産主義. Since that usage goes against the regime's general principle of replacing words from Chinese characters with native morphemes, it caused me to wonder what the word for "communism" must be in the Korean of North Korea, inasmuch as gongsanjuui 공산주의 (共産主義) is a wholly Sino-Korean term.



Transitive "blink"

Reader Scott Mauldin asks:

I am curious about a unique usage I read in SCOTUS Justice Ketanji Jackson's dissent to the recent cases on affirmative action. She says “This contention blinks both history and reality in ways too numerous to count.” To me, the usage of "blink" as a transitive verb to mean [I assume] something like "ignore" was completely novel. To see what to me is a nonstandard usage show up in a Supreme Court dissent was strange. Is this common usage in some communities, and if so would you or your readers happen to have information on that usage?



In North Korea, it's a dire crime to speak like a South Korean, part 2

This is a language war that has been going on for years, and there will never be an end to it, so long as there is a communist North Korea and a democratic South Korea. It is as deadly as a shooting war, because people die for using the language of the enemy. I'm not talking about the content of their speech, but rather its very nature.

North Koreans face execution for using South Korean idioms

The Times (6/30/23)

How does this work out in practice?

North Koreans who use the “obsequious” accent and expressions of South Korea face execution under a harsh new law aimed at eliminating South Korea's growing influence on the language used by its communist neighbour.



Xi Jinping's faux classicism

This new article in The Economist (6/29/23) has a familiar ring to it:

To understand Xi Jinping, it helps to be steeped in the classics

China’s leader has invented a phrase—and an image

Take four Chinese characters, all of them in everyday use. Put them in a certain order and, lo, they become a phrase that looks like classical Chinese—the kind of language used by the literati of yore. The idea they convey could be expressed just as succinctly in colloquial Chinese, but the classical style has gravitas. And it is a phrase loved by Xi Jinping, China’s leader, so all must follow suit.

More than any of his predecessors, Mr Xi likes to spice up his speeches with quotations from classical literature, especially poetry and philosophy. It fits one of his stated missions: instilling “cultural self-confidence” (alongside confidence in the political system). And it helps to buff up his image. In Chinese history, rulers were expected to be erudite. Two volumes have been published providing explanations of Mr Xi’s classical aphorisms.



Antakshari recitation in India

This is part of a long series of Language Log posts in which we pondered the phenomenal memorization skills of persons of Indian heritage (see "Selected readings" below).

So that you know what's happening in the following astonishing video, let me begin by giving a basic definition, etymology, and explication of this intricate word game:

Antakshari, also known as Antyakshari (अंताक्षरी transl. The game of the ending letter) is a spoken parlor game played in India. Each contestant sings the first verse of a song (often Classical Hindustani or Bollywood songs) that begins with the consonant of Hindi alphabet on which the previous contestant's song ended.

The word is derived from two Sanskrit words: antya (अन्त्य) meaning end + akshara (अक्षर) meaning letter of the alphabet. When these words are combined and an '-i' suffixed, the term means "The game of the ending letter". Due to schwa syncope in Hindi and other Indo-Aryan languages, Antyakshari is pronounced antakshri. A dialectical variation of the word is इन्ताक्षरी or intakshri.



The spiny terminological conundrum of ekhidna and ekhinos

[This is a guest post by Stewart Nicol]

Greek particles

I am a zoologist and comparative physiologist who has worked extensively on the monotremes, the platypus and the echidna. I have been putting together some notes on the naming of these animals. After originally being placed in the genus Myrmecophaga with the other, totally unrelated, anteaters, the echidna was given the specific name Myrmecophaga aculeata (prickly anteater) by George Shaw in 1792. It was named Echidna histrix by Georges Cuvier, misspelling Hystrix (Greek for porcupine). In 1811 Johann Illiger published an overhaul of the Linnaean system and replaced Cuvier’s genus name Echidna with Tachyglossus (fast tongue), making the full binomial Tachyglossus aculeatus. The genus name Echidna would have had priority, but it had previously been applied to a genus of moray eels, so the echidna became Tachyglossus aculeatus, though it remained popularly known as the echidna.

Cuvier doesn’t say why he used the name echidna, but the general assumption is that it alludes to a monster in Greek mythology, ἔχιδνα or ekhidna, half woman (mammal) and half snake (reptile), because the echidna was believed to combine characteristics of reptiles and mammals. Unfortunately, the word ekhidna is very similar to ekhinos (ἐχῖνος), the Ancient Greek word for hedgehog, which appears in the names echinoderm and echinacea because they have spines, giving rise to the misapprehension that the name echidna means spiny.



The AI threat: keep calm and carry on

Three weekends ago, I delivered a keynote here:

New Directions in Chinese Language Education in the 21st Century

The Eighth International Conference on Teaching Chinese as a Second Language

Swarthmore College, June 9-10, 2023

———–

Abbreviations:

    AI — Artificial Intelligence

    DT — Digital Technology

    IT — Information Technology

    DH — Digital Humanities

    AGI — Artificial General Intelligence, where machines supposedly can accomplish any intellectual task that a human can (to me that's a pipe dream)

(given for present and future reference and use)

Title: "Aspects of AI and digital technologies in Chinese language teaching"

Abstract

In recent decades, language processing hardware and software have progressed at an astonishing rate, one that is geometric rather than arithmetic. The opportunities these advances offer and the challenges they pose require our thoughtful attention and careful response, lest the machines get out of control and affect our students in detrimental ways. DeepL, ChatGPT, and other constantly evolving technologies possess enormous power to manipulate language, power that we can utilize for the enhancement of Chinese language pedagogy. On the other hand, we must monitor and adapt this potential in such a manner that it fits our purposes and meets the needs of our students.



Cooperative creation with Generative AI

A couple of weeks ago, John Hansen tried "an experiment to see if I could successfully combine random and seemingly unconnected topics into one poem", and reported the results on Medium. This experiment was quickly reproduced by Adrian CDTPPW, Block Wife, and Robert G. Longpré.



Flash sale

Ben Zimmer spotted this interesting street sign in the New York Times photo essay, "DMs from New York City" (June 26, 2023).



Today I learned a new word

The new-to-me word: assembloid.

It occurred in the second of the 20 (!) bullet points that the blurb for a new publication, Brain Organoid & Systems Neuroscience Journal, lists under the heading

Specific areas of interest include, but are not limited to:

  • Brain organogenesis and Neuronal cultures
  • Methods for generating brain assembloids
