August 30, 2025 @ 6:04 pm
· Filed by Victor Mair under Artificial intelligence, Cognitive science, Redundancy, Syntax
"Subword Symmetry in Natural Languages." Pelloni, Olga et al. Royal Society Open Science 12 (August 21, 2025).
Abstract
Symmetric patterns are found in the orderly arrangements of natural structures, from proteins to the symmetry in animals’ bodies. Symmetric structures are more stable and easier to describe and compress, which is why they may have been preferred as building blocks in natural selection. The idea that natural languages undergo an evolutionary process akin to the evolution of species has been pervasive in the study of language. This process might result in symmetric patterns as in other natural structures, but the notion of symmetry is rarely associated with the study of natural language. In this study, we look for symmetric patterns in text data, considering the length of subword units under a range of possible subword analyses. We study the length of subword units in 32 languages and discover that the splits of long words tend to be symmetric regardless of the segmentation method and that some automatic methods give symmetric splits at all word lengths. These results include natural language in the set of phenomena that can be described in terms of symmetry, opening a new research avenue for the empirical study of text data as a structure comparable to various other structures in the natural world.
Read the rest of this entry »
August 30, 2025 @ 7:23 am
· Filed by Mark Liberman under Artificial intelligence
Back in the 1940s, Stanislaw Ulam and John von Neumann came up with the idea of "Cellular automata", which started with models of crystal growth and self-replicating systems, and continued over the decades with explorations in many areas, popularized in the 1970s by Conway's Game of Life. One strand of these explorations became known as Agent-based Models, applied to problems in ecology, sociology, and economics. One especially influential result was Robert Axelrod's work in the mid-1980s on the Evolution of Cooperation. For a broader survey, see De Marchi and Page, "Agent-based models", Annual Review of Political Science, 2014.
Read the rest of this entry »
August 30, 2025 @ 12:34 am
· Filed by Victor Mair under Language and politics, Signs

A road sign at the Gwadar Free Zone, operated by China Overseas Ports Holding Company, in Gwadar, Balochistan, Pakistan. This port is a crucial part of the China-Pakistan Economic Corridor. (Photograph dated July 4, 2018)
Read the rest of this entry »
August 29, 2025 @ 7:54 am
· Filed by Victor Mair under Grammar, Morphology
If you're in museum administration, you will certainly know the meaning of "restitution". But what do you do with a headline like this?
"Ethiopian Heritage Authority Intensifies Push to Restitute Looted Artifacts." ENA English.
Ted McClure asks:
Back-formation from "restitution"? Or verb origin of "restitution"? I would have thought the verb form was "restore".
Read the rest of this entry »
August 28, 2025 @ 8:53 am
· Filed by Mark Liberman under Language and culture
Email from J.P.:
I don't know if it's my imagination, but I hear — "spends his/her/their time on" — SO much lately, and seemingly increasingly, it's used in a derogatory or critical way, as if to say that to spend the time in this/whatever way is stupid.
It is annoying me greatly, so I turn to Language Log, wondering if it is actually highly on the rise or if I am selectively attending.
Read the rest of this entry »
August 27, 2025 @ 1:53 pm
· Filed by Victor Mair under Borrowing, Etymology, Language and science
This word caught my attention on the news this morning. It was said to be a gigantic dust/sandstorm that was passing through the central Arizona area. As soon as I heard the sound of the word, with its probable triliteral Semitic root, and learned that it was some sort of sandstorm, I thought that it was most likely Arabic. And indeed it is.
Read the rest of this entry »
August 27, 2025 @ 8:14 am
· Filed by Mark Liberman under Prosody
In "Reading Instruction in the mid 19th century" (8/16/2025), I underlined the old-fashioned focus on "elocution", in which readers were trained "to convey to the hearer, fully and clearly, the ideas and feelings of the writer". Much of today's reading instruction turns that into "oral reading fluency", measured as words correct per minute ("wcpm"). This can result in high-scoring readers like those described in this passage from the Introductory Remarks in the 1844 edition of McGuffey's Rhetorical Guide, which warns against the consequences of failure to teach "elocution" from the very start:
Read the rest of this entry »
August 27, 2025 @ 5:48 am
· Filed by Mark Liberman under Language and culture
Adam Aleksic, "The insidious creep of Trump's speaking style", NYT 8/17/2025:
“Many such cases.” “Many people are saying this.”
You may recognize these phrases as “Trumpisms” — linguistic coinages of President Trump — but they’ve also become ingrained in our collective vocabulary. Since they became popular as memes during his first presidential campaign, we have begun using them, first sardonically, and then out of habit.
If you search for “many such cases” on X, you’ll see new posts of the phrase seemingly every minute, primarily applied to nonpolitical contexts like work anxiety or the real estate market. Google Trends shows both expressions increasing in usage since the mid-2010s.
Read the rest of this entry »
August 26, 2025 @ 7:18 pm
· Filed by Victor Mair under Grammar, Morphology
[This is a guest post by Mok Ling]
Someone asked me why shìhé 適合 ("to suit") and héshì 合適 ("suitable") aren't exactly reversible. [VHM added the romanizations and parenthetical definitions for those who do not know sinographs. Ibid. below.] A quick search online got me this explanation:
"They [適合 and 合適] mean more or less the same thing, but the former is a verb, while the latter is an adjective." (Chinese Grammar Wiki)
I could not figure out why this is the difference they find. Both the Wiktionary and Baidu entries for 適合 give 合適 as a synonym and vice versa.
Giles' Chinese-English Dictionary has neither word, but does have héshì 合式 ("suitable") under both 合 (3947) and 式 (9948). The spelling with 式 is also considered a variant form by DeFrancis.
Read the rest of this entry »
August 26, 2025 @ 12:27 pm
· Filed by Victor Mair under Intelligibility, Language and technology
Thanks to the productive, enlightening discussion we had in the first part of this post, I could not help but think of "speed" as a category of modern life. That led me to remember a book buried in my dungeon (downstairs study) that I had read about a quarter of a century ago. It wasn't anything like William S. Burroughs' Speed. It was more on the order of a history of science work.
So I descended the stairs to my basement library. It wasn't long before I found it:
Faster: The Acceleration of Just About Everything by James Gleick
- Topic: This popular book explores the modern, tech-driven obsession with speed and how it affects nearly every aspect of life, from our work habits and communication to our personal time.
- Summary: Gleick discusses the "hurry sickness" of modern life and the paradox that even with time-saving devices, we feel more rushed than ever.
Read the rest of this entry »
August 25, 2025 @ 4:37 am
· Filed by Victor Mair under Grammar, Translation
"Arabic Translations of the English Adjective 'Necessary': A Corpus-Driven Lexical Study." Alhedayani, Rukayah et al. Humanities and Social Sciences Communications 12, no. 1 (August 18, 2025): 1345.
Abstract
Modal adjectives of non-epistemic necessity are very common in language corpora. However, such adjectives are expected to behave differently in context, and thus differences between them should be highlighted in dictionaries. Nevertheless, there are a few studies that have examined modal adjectives with respect to their associated constructions and meanings in English. More importantly, studies on equivalent Arabic modal adjectives are scarce. Hence, the present study is quantitative and corpus-driven utilizing monolingual (i.e., the arTenTen18 and the enTenTen18) and parallel (i.e., Open Parallel Corpus or OPUS for short) corpora. Further, it is based on construction grammar and frame semantics to explore Arabic and English words of necessity.
Read the rest of this entry »
August 24, 2025 @ 8:41 am
· Filed by Mark Liberman under Artificial intelligence
In "Chain of thought hallucination?" (8/8/2025), I illustrated some of the weird text representations that GPT-5 creates when its response is an image rather than a text string. I now have its recommendation for avoiding such problems — which sometimes works, so you can try it…
Read the rest of this entry »
August 23, 2025 @ 1:43 pm
· Filed by Victor Mair under Etymology, Language and business, Signs, Usage
There's a big fuss and furor over the logo change at Cracker Barrel:
Read the rest of this entry »