Language Log

Latent trees

September 19, 2025 @ 7:06 am · Filed by Mark Liberman under Computational linguistics

There's been some buzz recently about how syntactic structures are implicit in Large Language Models — most recently, the Liu et al. paper noted yesterday by Victor, and an accepted ms by Futrell and Mahowald at Behavioral and Brain Sciences, "How Linguistics Learned to Stop Worrying and Love the Language Models". Futrell and Mahowald recognize something that Liu et al. mostly ignore, namely that constituent structure is obviously implicit in statistical patterns of sequential data, at least if the sequences were generated by a constituency-sensitive process — and that algorithms taking advantage of that fact have been Out There for 70 years or more.

Read the rest of this entry »

LLMs and tree-structuring

September 18, 2025 @ 5:59 pm · Filed by Victor Mair under Announcements, Artificial intelligence, Cognitive science, Computational linguistics

"Active Use of Latent Tree-Structured Sentence Representation in Humans and Large Language Models." Liu, Wei et al. Nature Human Behaviour (September 10, 2025).

Abstract

Understanding how sentences are represented in the human brain, as well as in large language models (LLMs), poses a substantial challenge for cognitive science. Here we develop a one-shot learning task to investigate whether humans and LLMs encode tree-structured constituents within sentences. Participants (total N = 372, native Chinese or English speakers, and bilingual in Chinese and English) and LLMs (for example, ChatGPT) were asked to infer which words should be deleted from a sentence. Both groups tend to delete constituents, instead of non-constituent word strings, following rules specific to Chinese and English, respectively. The results cannot be explained by models that rely only on word properties and word positions. Crucially, based on word strings deleted by either humans or LLMs, the underlying constituency tree structure can be successfully reconstructed. Altogether, these results demonstrate that latent tree-structured sentence representations emerge in both humans and LLMs.

Read the rest of this entry »

Fun with Q&A homonyms

September 18, 2025 @ 9:52 am · Filed by Mark Liberman under Humor

The most famous example, of course, is the 1945 "Who's on first?" dialogue:

Read the rest of this entry »

"China" vs. "My / Our Country"

September 17, 2025 @ 5:34 pm · Filed by Victor Mair under Names, Politics of language, Translation

Mark Metcalf wrote:

Currently working my way through an excellent book on Jūnshì lúnlǐ wénhuà 军事伦理文化 (The culture of military ethics) and started noticing that the author ping-pongs between Zhōngguó 中国 and wǒguó 我国 when discussing various aspects of the PRC's history and alleged achievements. Are you aware of any general guidance regarding how the decision is made to use one term or the other? Topical? Political? Tone? I'll keep digging and let you know if anything jumps out at me.

BTW, one UVA colleague described how he had to teach first-year PRC students that "my country" was not an acceptable synonym for China when writing literature essays.

Read the rest of this entry »

Paper made from 100% recycled stone?

September 17, 2025 @ 6:20 am · Filed by Mark Liberman under Language and the law

I'm spending a couple of days at BioTechX2025, which is in Philadelphia this year. And one of the exhibitors is giving away Karst Stone Paper Notebooks, which have a wrapper telling us that

Read the rest of this entry »

Fowler's three-colored flag?

September 16, 2025 @ 4:05 pm · Filed by Mark Liberman under Language and politics

Liam Julian, "Putting Fowler back in Fowler's" (Hoover Institution, 2009), presents a perspective that used to be more common than it is today, I think: linguistic prescriptivism as (a particular kind of) cultural conservatism, in explicit association with right-wing politics. Julian wrote:

Burchfield, in his preface to Fowler’s third edition, called the first edition “this extraordinary book, the Bible of prescriptivists.” But in the early 20th century, when Fowler was writing the extraordinary book, the trend was away from prescriptivism and toward a descriptive, academic linguistics that, like Burchfield himself, observed rather than decreed. Burchfield stressed the extent of “the isolation of Fowler from the mainstream of the linguistic scholarship of his day” and highlighted “his heavy dependence” on English school textbooks and the classics of ancient Greece and Rome, the Renaissance, and post-Renaissance English literature. For Fowler, Burchfield wrote, these influences composed “a three-colored flag” that “was to be saluted and revered, and, as far as possible, everything it represented was to be preserved intact.”

Read the rest of this entry »

Revenge on English

September 16, 2025 @ 3:50 pm · Filed by Victor Mair under Humor, Pronunciation

A post shared by Caitanya Tan | Actress • Host • Storyteller (@caitofalltraits)

A skit by Singaporean voice actress Caitanya Tan.

Read the rest of this entry »

Spanakopita: a spinach footnote

September 16, 2025 @ 6:31 am · Filed by Victor Mair under Language and biology, Language and food

We have had so many posts dedicated to Popeye's favorite vegetable (see "Selected readings" below), but we haven't yet done justice to one of my favorite spinach dishes: spanakopita.

Spanakopita (/ˌspænəˈkɒpɪtə, ˌspɑː-, –ˈkoʊ-/; Greek: σπανακόπιτα, from σπανάκι spanáki 'spinach', and πίτα píta 'pie') is a Greek savory spinach pie. It often also contains cheese, typically feta, and may then be called spanakotiropita (Greek: σπανακοτυρόπιτα "spinach-cheese pie"), especially in northern Greece. In southern Greece, the term spanakopita is also common for the versions with cheese.

("Savory spinach pie")

Read the rest of this entry »

Tropical Storm Gabrielle Spaghetti

September 15, 2025 @ 12:51 pm · Filed by Mark Liberman under Humor

From J.M.:

Much amusement online this morning about a tropical storm that is named Gabrielle Spaghetti and is apparently doing some modeling work.

Read the rest of this entry »

More of GPT-5's absurd image labelling

September 15, 2025 @ 6:02 am · Filed by Mark Liberman under Artificial intelligence

GPT-5 is impressively good at some things (see "No X is better than Y", 8/14/2025, or "GPT-5 can parse headlines!", 9/7/2025), but shockingly bad at others. And I'm not talking about "hallucinations", which is a term used for plausible but false facts or references — such mistakes remain a problem, but not every answer is a hallucination. Adding labels to images that it creates, on the other hand, remains reliably and absurdly bad.