July 15, 2025 @ 2:57 pm
· Filed by Mark Liberman under Words words words
In today's email there was a message from AAAI 2026 that included a "Call for the Special Track on AI Alignment":
AAAI-26 is pleased to announce a special track focused on AI Alignment. This track recognizes that as we begin to build more and more capable AI systems, it becomes crucial to ensure that the goals and actions of such systems are aligned with human values. To accomplish this, we need to understand the risks of these systems and research methods to mitigate these risks. The track covers many different aspects of AI Alignment, including but not limited to the following topics:
Read the rest of this entry »
July 15, 2025 @ 5:00 am
· Filed by Victor Mair under Borrowing, Etymology, Language and food
I want to thank Jonathan Silk (comment here) for pushing Popeye to further heights and deeper depths in our understanding of his favorite vegetable. We're not "finiched" with spinach yet.
Now it's getting very interesting and confusing (Armenian is creeping in):
palak
English
Etymology
From Hindi पालक (pālak), from Sanskrit पालक्या (pālakyā).
Noun
palak (uncountable)
- (India, cooking) Spinach or similar greens (including Amaranthus species and Chenopodium album).
Read the rest of this entry »
July 14, 2025 @ 4:20 pm
· Filed by Victor Mair under Dialects, Metaphors, Second language, Semantics, Sociolinguistics
This research investigates the semantic change and conceptual metaphor of the Thai word prèet (/เปรต/), which originates from the Pali-Sanskrit term meaning “departed.” The primary objective is to explore how the term’s meaning has shifted in contemporary Thai society, where it is now used pejoratively to criticize behaviors such as excessive greed, gluttony, immorality, and social deviance. Data for this study are drawn from both historical texts, particularly the Traibhumi Phra Ruang (a prominent Thai Buddhist text from the 14th-century Sukhothai period), and modern Thai linguistic usage. The analysis employs conceptual metaphor theory, focusing on metaphors like SOCIAL DEVIANCE IS MONSTROSITY, MORAL FAILURE IS DEGRADATION, GREED IS HUNGER, and SPIRITUAL LIMINALITY IS MONSTROSITY, to understand how these shifts reflect changing cultural and societal values. Additionally, Impoliteness Theory is applied to examine how prèet functions as a linguistic tool for social critique. Findings show that the semantic evolution of prèet reveals an intricate relationship between language, culture, and metaphor, as it transitions from a religious concept to a vehicle for social commentary. The implications of this study highlight the dynamic nature of language in reflecting societal shifts.
Read the rest of this entry »
July 14, 2025 @ 9:58 am
· Filed by Victor Mair under Animal behavior, Animal communication, Bibliography, Emojis and emoticons, Gesture, Words words words
Bibliographical cornucopia for linguists, part 1
Since we have such an abundance of interesting articles for this fortnight, I will divide the collection into two parts, and provide each entry with an abstract or paragraph-length quotation.
A fundamental question in word learning is how, given only evidence about what objects a word has previously referred to, children are able to generalize to the correct class. How does a learner end up knowing that “poodle” only picks out a specific subset of dogs rather than the broader class and vice versa? Numerous phenomena have been identified in guiding learner behavior such as the “suspicious coincidence effect” (SCE)—that an increase in the sample size of training objects facilitates more narrow (subordinate) word meanings. While SCE seems to support a class of models based in statistical inference, such rational behavior is, in fact, consistent with a range of algorithmic processes. Notably, the broadness of semantic generalizations is further affected by the temporal manner in which objects are presented—either simultaneously or sequentially. First, I evaluate the experimental evidence on the factors influencing generalization in word learning. A reanalysis of existing data demonstrates that both the number of training objects and their presentation-timing independently affect learning. This independent effect has been obscured by prior literature’s focus on possible interactions between the two. Second, I present a computational model for learning that accounts for both sets of phenomena in a unified way. The Naïve Generalization Model (NGM) offers an explanation of word learning phenomena grounded in category formation. Under the NGM, learning is local and incremental, without the need to perform a global optimization over pre-specified hypotheses. This computational model is tested against human behavior on seven different experimental conditions for word learning, varying over presentation-timing, number, and hierarchical relation between training items. 
Looking both at qualitative parameter-independent behavior and quantitative parameter-tuned output, these results support the NGM and suggest that rational learning behavior may arise from local, mechanistic processes rather than global statistical inference.
Read the rest of this entry »
July 14, 2025 @ 5:20 am
· Filed by Mark Liberman under Linguistics in the comics
Today's SMBC:

Read the rest of this entry »
July 13, 2025 @ 7:09 am
· Filed by Mark Liberman under Artificial intelligence
Joel Becker et al., "Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity", METR 7/10/2025:
Despite widespread adoption, the impact of AI tools on software development in the wild remains understudied. We conduct a randomized controlled trial (RCT) to understand how AI tools at the February–June 2025 frontier affect the productivity of experienced open-source developers. 16 developers with moderate AI experience complete 246 tasks in mature projects on which they have an average of 5 years of prior experience. Each task is randomly assigned to allow or disallow usage of early-2025 AI tools. When AI tools are allowed, developers primarily use Cursor Pro, a popular code editor, and Claude 3.5/3.7 Sonnet. Before starting tasks, developers forecast that allowing AI will reduce completion time by 24%. After completing the study, developers estimate that allowing AI reduced completion time by 20%. Surprisingly, we find that allowing AI actually increases completion time by 19%—AI tooling slowed developers down. This slowdown also contradicts predictions from experts in economics (39% shorter) and ML (38% shorter). To understand this result, we collect and evaluate evidence for 20 properties of our setting that a priori could contribute to the observed slowdown effect—for example, the size and quality standards of projects, or prior developer experience with AI tooling. Although the influence of experimental artifacts cannot be entirely ruled out, the robustness of the slowdown effect across our analyses suggests it is unlikely to primarily be a function of our experimental design.
Read the rest of this entry »
July 13, 2025 @ 5:49 am
· Filed by Victor Mair under Humor, Multilingualism, Pronunciation, Punctuation, Translation
A learned friend recently sent me a draft composition on medieval Chinese history in which he referred to "*" as an "asterix". This reminded me that ten years ago I wrote a post, "The many pronunciations of '*'" (12/17/15), on this subject and we had a lengthy, vigorous discussion about it.
Given that lately we've been talking a lot about Celts, Galatians, and so on, I think it is appropriate to write another post on Asterix the Gaul, that famous French comic book character, and how he got his name. Also inspired / prompted by Chris Button's latest comment.
I often hear "*" pronounced "asterix" or "asterick", and so on (e.g., "astrisk" [two syllables], esp. in rapid speech). It's hard even for me to pronounce "*" or type the symbol those ways, so ingrained is the pronunciation "as-ter-isk".
Read the rest of this entry »
July 12, 2025 @ 11:40 am
· Filed by Mark Liberman under Prosody
In "AI win of the week" I explored the inter-personal dimensions of Rousseau's 1754 contention that "there is neither rhythm nor melody in French music, because the language is not capable of them". In the comments, AntC objected that "But, but. Rousseau wrote an opera, in French, to his own Libretto. audio + full score available on Youtube".
For now, I have only two comments on this. First, trolls are often happy to abandon consistency in the service of pwning their audience. And second, the 1754 edition of Rousseau's screed, published two years after the debut of his opera, goes into considerable detail about how he painfully transferred the musicality of Italian prosody to the composition and performance of a work with French lyrics.
But rather than diving further into Rousseau's argument about the relative musicality of different languages' prosody, the point of today's post is to note its resonance with another mid-18th century prosodic dispute, namely Joshua Steele's refutation of James Burnett's claim that English prosody gives its syllables "nothing better than the music of a drum, in which we perceive no difference except that of louder or softer, according as the instrument is more or less forcibly struck".
Read the rest of this entry »
July 11, 2025 @ 5:05 pm
· Filed by Victor Mair under Alphabets, Language and religion, Language teaching and learning
[This is a guest post by Chau Wu]
The Presbyterian Church in Taiwan (TPC) was first planted by British missionaries in Tainan and later expanded to all southern parts of Taiwan, constituting the present Southern Synod of TPC. The most important pioneer among them was the Scottish missionary Rev. Thomas Barclay, who worked in Taiwan-Fu (the present Tainan). He was born in Glasgow, and matriculated at the University of Glasgow. While there, he studied under Sir William Thomson, later Lord Kelvin [according to Wikipedia]. The celebrated Lord Kelvin reminds me of the absolute zero degree in physical chemistry and the electric cable equation as the underpinning of the Transatlantic cable as well as the conduction of electric impulses along nerve fibers.
Read the rest of this entry »
July 11, 2025 @ 4:23 pm
· Filed by Victor Mair under Language and art, Language and history, Language and religion
As announced in the title of the first post on this subject, my aim is to understand where the Galatians originated and how / why they migrated to where they were when Apostle Paul wrote his epistle to them. Since I was apparently insufficiently clear about both of those purposes in part 1, in this follow-up post I will provide additional scholarly material. Inasmuch as the identification of the Gauls / Celts and the languages they spoke will be important for several posts about them that I will write in the coming weeks, today's post will necessarily be long and detailed.
Here I will quote from Ben Witherington III, Grace in Galatia: A Commentary on St. Paul’s Letter to the Galatians (Grand Rapids, MI: Wm. B. Eerdmans Publishing Co., 1998), pp. 1-7.
N.B.: Illustration for art historians below.
Read the rest of this entry »
July 11, 2025 @ 3:38 pm
· Filed by Mark Liberman under Artificial intelligence
In "Beautiful music and logical warts", I quoted (part of) the trollish conclusion of Rousseau's Lettre sur la Musique Française:
Je crois avoir fait voir qu’il n’y a ni mesure ni mélodie dans la musique française, parce que la langue n’en est pas susceptible ; que le chant français n’est qu’un aboiement continuel, insupportable à toute oreille non prévenue; que l’harmonie en est brute, sans expression, et sentant uniquement son remplissage d'écolier ; que les airs français ne sont point des airs ; que le récitatif français n’est point du récitatif. D’où je conclus que les Français n’ont point de musique et n’en peuvent avoir, ou que, si jamais ils en ont une, ce sera tant pis pour eux.
I believe I have shown that there is neither rhythm nor melody in French music, because the language is not capable of them; that French song is only a continual barking, unbearable to any unbiased ear; that the harmony is crude, without expression, and full of childish padding; that French airs are not airs; that French recitative is not recitative. From which I conclude that the French have no music and never will have any, or that, if ever they have some, it will be a disappointment for them.
Read the rest of this entry »
July 11, 2025 @ 7:44 am
· Filed by Mark Liberman under Etymology
In "Rococo" (7/6/2025), I quoted from Charles Carr's 1965 paper "TWO WORDS IN ART HISTORY II. ROCOCO" his evidence that the word rococo began as a way of denigrating certain kinds of out-of-fashion ugliness. Jonathan Smith noted in the comments that "baroque itself was first a(n) (disparaging) epithet", and I quoted the OED's endorsement of that idea, though without going into the whole "an irregular pearl is like a wart" background.
But in a parallel 1965 article, "TWO WORDS IN ART HISTORY I. BAROQUE", Charles Carr lays out three etymological theories about baroque, after sparing us "fantastic etymologies to be found in certain eighteenth-century dictionaries".
Read the rest of this entry »
July 10, 2025 @ 6:34 pm
· Filed by Victor Mair under Borrowing, Etymology, Language and biology, Language and food
[This is a guest post from Christopher Atwood]
Building on observations of Andras Rona-Tas (Tibeto-Mongolica, pp. 213-14), one can observe a basic division in Mongolian words for cultivated plants. They divide into two types: 1) words for grains and grain cultivation; and 2) words for fruits and vegetables.
Words in the first category (tariya "grain," buudai "wheat," arbai "barley," shish "sorghum," am "millet," budaa "grain," anjisu "plow," teerem "mill," etc.) are consistent throughout the Mongolic family, and have great time depth; most of them are not obviously loan words from any other language (some have Turkic cognates, but at a considerable time depth).
Read the rest of this entry »