Language Log

Bibliographical cornucopia for linguists, part 1

July 14, 2025 @ 9:58 am · Filed by Victor Mair under Animal behavior, Animal communication, Bibliography, Emojis and emoticons, Gesture, Words words words

Since we have such an abundance of interesting articles for this fortnight, I will divide the collection into two parts and provide each entry with an abstract or paragraph-length quotation.

"Word Learning as Category Formation." Caplan, Spencer. PLOS ONE 20, no. 7 (July 3, 2025): e0327615. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0327615.

A fundamental question in word learning is how, given only evidence about what objects a word has previously referred to, children are able to generalize to the correct class. How does a learner end up knowing that “poodle” only picks out a specific subset of dogs rather than the broader class and vice versa? Numerous phenomena have been identified in guiding learner behavior such as the “suspicious coincidence effect” (SCE)—that an increase in the sample size of training objects facilitates more narrow (subordinate) word meanings. While SCE seems to support a class of models based in statistical inference, such rational behavior is, in fact, consistent with a range of algorithmic processes. Notably, the broadness of semantic generalizations is further affected by the temporal manner in which objects are presented—either simultaneously or sequentially. First, I evaluate the experimental evidence on the factors influencing generalization in word learning. A reanalysis of existing data demonstrates that both the number of training objects and their presentation-timing independently affect learning. This independent effect has been obscured by prior literature’s focus on possible interactions between the two. Second, I present a computational model for learning that accounts for both sets of phenomena in a unified way. The Naïve Generalization Model (NGM) offers an explanation of word learning phenomena grounded in category formation. Under the NGM, learning is local and incremental, without the need to perform a global optimization over pre-specified hypotheses. This computational model is tested against human behavior on seven different experimental conditions for word learning, varying over presentation-timing, number, and hierarchical relation between training items. 
Looking both at qualitative parameter-independent behavior and quantitative parameter-tuned output, these results support the NGM and suggest that rational learning behavior may arise from local, mechanistic processes rather than global statistical inference.

Recursive summarization

July 14, 2025 @ 5:20 am · Filed by Mark Liberman under Linguistics in the comics

Today's SMBC:

The effect of AI tools on coding

July 13, 2025 @ 7:09 am · Filed by Mark Liberman under Artificial intelligence

Joel Becker et al., "Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity", METR 7/10/2025:

Despite widespread adoption, the impact of AI tools on software development in the wild remains understudied. We conduct a randomized controlled trial (RCT) to understand how AI tools at the February–June 2025 frontier affect the productivity of experienced open-source developers. 16 developers with moderate AI experience complete 246 tasks in mature projects on which they have an average of 5 years of prior experience. Each task is randomly assigned to allow or disallow usage of early-2025 AI tools. When AI tools are allowed, developers primarily use Cursor Pro, a popular code editor, and Claude 3.5/3.7 Sonnet. Before starting tasks, developers forecast that allowing AI will reduce completion time by 24%. After completing the study, developers estimate that allowing AI reduced completion time by 20%. Surprisingly, we find that allowing AI actually increases completion time by 19%—AI tooling slowed developers down. This slowdown also contradicts predictions from experts in economics (39% shorter) and ML (38% shorter). To understand this result, we collect and evaluate evidence for 20 properties of our setting that a priori could contribute to the observed slowdown effect—for example, the size and quality standards of projects, or prior developer experience with AI tooling. Although the influence of experimental artifacts cannot be entirely ruled out, the robustness of the slowdown effect across our analyses suggests it is unlikely to primarily be a function of our experimental design.
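To make the gap between expectation and measurement concrete, here is a minimal sketch that applies the percentages quoted in the abstract to a single task. The 60-minute baseline is a hypothetical figure chosen for illustration, and treating each percentage as a simple scaling of one task's duration is a simplifying assumption, not the study's statistical method.

```python
# Hypothetical baseline: a task that takes 60 minutes without AI tools.
# (Assumption for illustration; the abstract gives no per-task durations.)
baseline_minutes = 60.0

# Percent changes in completion time quoted in the abstract
# (negative = faster with AI, positive = slower with AI).
changes = {
    "developer forecast": -24,
    "developer post-hoc estimate": -20,
    "economics experts' prediction": -39,
    "ML experts' prediction": -38,
    "measured effect": +19,
}


def scaled_time(baseline, percent_change):
    """Scale a baseline duration by a percent change."""
    return baseline * (1 + percent_change / 100)


times = {label: scaled_time(baseline_minutes, pct) for label, pct in changes.items()}

for label, minutes in times.items():
    print(f"{label}: {minutes:.1f} min")

# Difference between what developers believed afterwards and what was measured.
perception_gap = times["measured effect"] - times["developer post-hoc estimate"]
print(f"perception gap: {perception_gap:.1f} min")
```

Under these assumptions, a developer who believed the task took about 48 minutes with AI would in fact have spent about 71 minutes on it.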

Asterisk the Gaul

July 13, 2025 @ 5:49 am · Filed by Victor Mair under Humor, Multilingualism, Pronunciation, Punctuation, Translation

A learned friend recently sent me a draft composition on medieval Chinese history in which he referred to "*" as an "asterix". This reminded me that ten years ago I wrote a post, "The many pronunciations of '*'" (12/17/15), on this subject and we had a lengthy, vigorous discussion about it.

Given that lately we've been talking a lot about Celts, Galatians, and so on, I think it is appropriate to write another post on Asterix the Gaul, that famous French comic book character, and how he got his name. Also inspired / prompted by Chris Button's latest comment.

I often hear "*" pronounced "asterix" or "asterick", and so on (e.g., "astrisk" [two syllables], esp. in rapid speech). It's hard even for me to pronounce "*" or type the symbol those ways, so ingrained is the pronunciation "as-ter-isk".

Steele v. Monboddo

July 12, 2025 @ 11:40 am · Filed by Mark Liberman under Prosody

In "AI win of the week" I explored the inter-personal dimensions of Rousseau's 1754 contention that "there is neither rhythm nor melody in French music, because the language is not capable of them". In the comments, AntC objected that "But, but. Rousseau wrote an opera, in French, to his own Libretto. audio + full score available on Youtube".

For now, I have only two comments on this. First, trolls are often happy to abandon consistency in the service of pwning their audience. And second, the 1754 edition of Rousseau's screed, published two years after the debut of his opera, goes into considerable detail about how he painfully transferred the musicality of Italian prosody to the composition and performance of a work with French lyrics.

But rather than diving further into Rousseau's argument about the relative musicality of different languages' prosody, the point of today's post is to note its resonance with another mid-18th century prosodic dispute, namely Joshua Steele's refutation of James Burnett's claim that English prosody gives its syllables "nothing better than the music of a drum, in which we perceive no difference except that of louder or softer, according as the instrument is more or less forcibly struck".

Tâigael, part 2

July 11, 2025 @ 5:05 pm · Filed by Victor Mair under Alphabets, Language and religion, Language teaching and learning

[This is a guest post by Chau Wu]

The Presbyterian Church in Taiwan (TPC) was first planted by British missionaries in Tainan and later expanded to all southern parts of Taiwan, constituting the present Southern Synod of TPC. The most important pioneer among them was the Scottish missionary Rev. Thomas Barclay, who worked in Taiwan-Fu (the present Tainan). He was born in Glasgow and matriculated at the University of Glasgow. While there, he studied under Sir William Thomson, later Lord Kelvin [according to Wikipedia]. The celebrated Lord Kelvin reminds me of absolute zero in physical chemistry and of the electric cable equation, which underpins both the Transatlantic cable and the conduction of electric impulses along nerve fibers.

Who were the Galatians? How did they get where they were?, part 2

July 11, 2025 @ 4:23 pm · Filed by Victor Mair under Language and art, Language and history, Language and religion

As announced in the title of the first post on this subject, my aim is to understand where the Galatians originated and how / why they migrated to where they were when the Apostle Paul wrote his epistle to them. Since I was apparently insufficiently clear about both of those purposes in part 1, in this follow-up post I will provide additional scholarly material. Inasmuch as the identification of the Gauls / Celts and the languages they spoke will be important for several posts about them that I will write in the coming weeks, today's post will necessarily be long and detailed.

Here I will quote from Ben Witherington III, Grace in Galatia: A Commentary on St. Paul’s Letter to the Galatians (Grand Rapids, MI: Wm. B. Eerdmans Publishing Co., 1998), pp. 1-7.

N.B.: Illustration for art historians below.

AI win of the day

July 11, 2025 @ 3:38 pm · Filed by Mark Liberman under Artificial intelligence

In "Beautiful music and logical warts", I quoted (part of) the trollish conclusion of Rousseau's Lettre sur la Musique Française:

Je crois avoir fait voir qu’il n’y a ni mesure ni mélodie dans la musique française, parce que la langue n’en est pas susceptible ; que le chant français n’est qu’un aboiement continuel, insupportable à toute oreille non prévenue; que l’harmonie en est brute, sans expression, et sentant uniquement son remplissage d'écolier ; que les airs français ne sont point des airs ; que le récitatif français n’est point du récitatif. D’où je conclus que les Français n’ont point de musique et n’en peuvent avoir, ou que, si jamais ils en ont une, ce sera tant pis pour eux.

I believe I have shown that there is neither rhythm nor melody in French music, because the language is not capable of them; that French song is only a continual barking, unbearable to any unbiased ear; that the harmony is crude, without expression, and full of childish padding; that French airs are not airs; that French recitative is not recitative. From which I conclude that the French have no music and never will have any, or that, if ever they have some, it will be a disappointment for them.

Beautiful music and logical warts

July 11, 2025 @ 7:44 am · Filed by Mark Liberman under Etymology

In "Rococo" (7/6/2025), I quoted from Charles Carr's 1965 paper "TWO WORDS IN ART HISTORY II. ROCOCO" his evidence that the word rococo began as a way of denigrating certain kinds of out-of-fashion ugliness. Jonathan Smith noted in the comments that "baroque itself was first a(n) (disparaging) epithet", and I quoted the OED's endorsement of that idea, though without going into the whole "an irregular pearl is like a wart" background.

But in a parallel 1965 article, "TWO WORDS IN ART HISTORY I. BAROQUE", Charles Carr lays out three etymological theories about baroque, after sparing us "fantastic etymologies to be found in certain eighteenth-century dictionaries".

Spinach: Mongolian rhapsody

July 10, 2025 @ 6:34 pm · Filed by Victor Mair under Borrowing, Etymology, Language and biology, Language and food

[This is a guest post from Christopher Atwood]

Building on observations of Andras Rona-Tas (Tibeto-Mongolica, pp. 213-14), one can observe a basic division in Mongolian words for cultivated plants. They divide into two types: 1) words for grains and grain cultivation; and 2) words for fruits and vegetables.

Words in the first category (tariya "grain," buudai "wheat," arbai "barley," shish "sorghum," am "millet," budaa "grain," anjisu "plow," teerem "mill," etc.) are consistent throughout the Mongolic family and have great time depth. Most of them are not obviously loan words from any other language (some have Turkic cognates, but at a considerable time depth).

Xanadu meme

July 10, 2025 @ 1:00 pm · Filed by Victor Mair under Language and literature, Memes

[This is a guest post by Bill Benzon]

I thought you’d be interested in a study showing the distribution of “Xanadu” across the web. I first looked into this back in 2010. I’ve now updated that work using ChatGPT o3 (one of the so-called “reasoning” models). It designed the study and executed it.

This report ran all night. And it’s the kind of thing that would have been impossible prior to the internet. Here’s the abstract:

Defining "skedaddle"

July 10, 2025 @ 8:46 am · Filed by Mark Liberman under Lexicon and lexicography

In the Fox News recording of Donald Trump's 7/8/2025 cabinet meeting, at around 17:33, there's a Walt Whitman-esque description of various historical U.S. raids on Iran, culminating in an interesting example of how to define a word by repeating it with emphatic voice quality.

Spinach: Indian interlude

July 10, 2025 @ 5:06 am · Filed by Victor Mair under Etymology, Language and biology, Language and food

[This is a guest post by Gábor Parti]

It seems that paalak goes back to Sanskrit: Monier-Williams gives paalakyaa as "Beta bengalensis" (1st column, middle of the page), but I have found that the botanical identifications in MW are often dubious. MW also indicates his source as Car(aka), which looks like it refers to the Ayurvedic text Caraka Samhita.

Beta bengalensis Roxb. is now identified with the common beet, Beta vulgaris L., which grows in India and all of temperate Europe. It is in the same family as spinach (Amaranthaceae), and beet leaves are also edible.

Wikipedia says that "the ancestor of all current beet cultivars is the sea beet", which then supplies this introduction: "The sea beet, Beta vulgaris subsp. maritima (L.) Arcangeli. is an Old World perennial plant with edible leaves, leading to the common name wild spinach." So far so good.
