Archive for Uncategorized

BoJo bamboozled

From Philip Taylor:

The British media were flooded yesterday with reports that former Prime Minister Boris Johnson had been “bamboozled” by scientific evidence presented during the Covid-19 pandemic.  My understanding of "bamboozle" has always been that deception must be involved, and this is borne out by the OED, but there was clearly no deception in this case (other than, perhaps, self-deception, in that BoJo may well have convinced himself that he did understand the scientific evidence, when he clearly did not), so why did Sir Patrick Vallance, then Chief Scientific Advisor to HMG, record in his diary that “the Prime Minister was at times ‘bamboozled’”?

Read the rest of this entry »

Comments (9)

The history of "artificial intelligence"

The Google Books ngram plot for "artificial intelligence" offers a graph of AI's culturomics:

According to the OED, the first use of the term artificial intelligence was in a 13-page grant application by John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon, "A proposal for the Dartmouth summer research project on artificial intelligence", written in the summer of 1955:

We propose that a 2 month, 10 man study of artificial intelligence be carried out during the summer of 1956 at Dartmouth College in Hanover, New Hampshire. The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves. We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer.

The proposal uses the phrase repeatedly without quotation marks, capitalization, or any other indication of its status as a neologism, suggesting that it was in common conversational usage before that (apparently) first publication, and/or that the authors thought its compositional meaning was obvious.

There's no question that the concept had been under discussion for a decade or so at that point, with analogous ideas to be found hundreds of years earlier. And there are older uses of the phrase "artificial intelligence", in interestingly divergent contexts, also going back hundreds of years.

Read the rest of this entry »

Comments (17)

The changing accents of British English

King’s English and Cockney replaced by three new accents, study finds

Britons depart from overtly class-based post-war speech epitomised by either clipped vowels or working-class dialects

By Charles Hymas, The Telegraph, Home Affairs Editor 

I vaguely recall an earlier study from about ten years ago that came to similar conclusions (including the emergence of a "multicultural" accent).  It's not surprising that differences would gradually diminish, especially under the influence of enhanced, pervasive mass communications and increased population mobility.

What we see, though, is that, as the older, established accents wither away, new ones arise among various shifting cultural, ethnic, and social regroupings.

Remember the Valley Girl accent, which people used to talk about a lot ten or twenty years ago?  Where is it now?

Read the rest of this entry »

Comments (35)

Black Hand: Language Log foretells the future

From Brian Miller:

I believe it was your comment here on a 2019 use of a phrase in China politics or press

“Thus my second surmise was that, by 'black hand', the CCP / PRC mean 'stealthy manipulator who remains totally out of view'.  But how does it get that meaning in Chinese?”

Read the rest of this entry »

Comments (8)

Viral vibe

"Chinese Song Streamed Billions of Times for ‘Satirical’ Vibe"

Yomiuri Shimbun (August 29, 2023)

Here's the song, with the lyrics in characters, pinyin romanization, and a poor English translation:

Read the rest of this entry »

Comments (7)

Tortured phrases, LLMs, and Goodhart's Law

A few years ago, I began to notice that the scientific and technical papers relentlessly spammed at me, by and similar outfits, were becoming increasingly surrealistic. And I soon learned that the source for such articles was systems for "article spinning" by "rogeting" — automatic random substitution of (usually inappropriate) synonyms. Those techniques were originally developed many years ago for spamdexing, i.e. generating "link farms" of fake pages, in order to fool search engine ranking systems by evading simple forms of content similarity detection.

And the same techniques also fool simple systems for plagiarism detection — though the incoherent results are not useful for student papers, at least in cases where instructors actually read the submissions. But the same time period saw the parallel growth of predatory publishing (and analogous developments among generally reputable publishers), and the use of mindless quantitative publication metrics to evaluate researchers, faculty and institutions. The result: an exponential explosion of "tortured phrases" in the scientific, technical, and scholarly literature: "talk affirmation" for "speech recognition", "straight expectation" for "linear prediction", "huge information" for "big data", "gullible Bayes" for "naive Bayes",  "irregular woodland" for "random forest", "savvy home" for "smart home", and so on.
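To make the mechanism concrete, here is a toy sketch of word-by-word synonym substitution. The tiny thesaurus is an assumption, reverse-engineered from the phrase pairs listed above; real spinning tools presumably draw on much larger synonym lists and choose among alternatives at random, whereas this version is deterministic, and the names `TOY_THESAURUS` and `spin` are purely illustrative.

```python
import re

# Hypothetical mini-thesaurus, derived only from the tortured phrases quoted above.
TOY_THESAURUS = {
    "speech": "talk",
    "recognition": "affirmation",
    "linear": "straight",
    "prediction": "expectation",
    "big": "huge",
    "data": "information",
    "naive": "gullible",
    "random": "irregular",
    "forest": "woodland",
    "smart": "savvy",
}


def spin(text):
    """Replace every word found in the toy thesaurus, ignoring context."""

    def swap(match):
        word = match.group(0)
        replacement = TOY_THESAURUS.get(word.lower())
        if replacement is None:
            return word  # no synonym known: leave the word alone
        # Preserve an initial capital so sentence starts still look normal.
        return replacement.capitalize() if word[0].isupper() else replacement

    return re.sub(r"[A-Za-z]+", swap, text)


if __name__ == "__main__":
    print(spin("Speech recognition with naive Bayes and a random forest"))
```

Because each word is swapped without regard to the technical term it belongs to, fixed expressions like "naive Bayes" come out mangled, while the result shares few exact word sequences with the source text, which is what defeats simple similarity checks.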

Read the rest of this entry »

Comments (3)

Apostrophes in Hanyu Pinyin

The most famous instance of the use of an apostrophe in Hanyu Pinyin romanization is in the place name "Xi'an", the capital of Shaanxi Province (the doubled "a" is another story).

Xī'ān 西安 — two characters signifying "Western Peace"

If you don't use an apostrophe to separate the syllables, you end up with the monosyllable "xian", which — depending upon the tone and the character it is meant to represent — could mean dozens of different things.
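The rule behind this can be sketched in a few lines. The sketch assumes the usual convention that an apostrophe is written before any non-initial syllable beginning with a, o, or e; the function names are illustrative, and the input is assumed to be already split into syllables.

```python
import unicodedata


def starts_with_a_o_e(syllable):
    """True if the syllable begins with a, o, or e, with or without a tone mark."""
    # NFD decomposition turns "ā" into "a" plus a combining macron,
    # so the first code point is the bare vowel.
    base = unicodedata.normalize("NFD", syllable)[0].lower()
    return base in ("a", "o", "e")


def join_pinyin(syllables):
    """Join pinyin syllables into one word, inserting apostrophes where needed."""
    word = syllables[0]
    for syllable in syllables[1:]:
        if starts_with_a_o_e(syllable):
            word += "'"  # mark the syllable boundary
        word += syllable
    return word


if __name__ == "__main__":
    print(join_pinyin(["Xī", "ān"]))
    print(join_pinyin(["Běi", "jīng"]))
```

With this convention, "Xī" + "ān" is kept distinct from the monosyllable "xian", while a name like "Běijīng" needs no apostrophe because its second syllable begins with a consonant.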

Mark Swofford has carried out an interesting investigation on "Mandarin words with more than one apostrophe", Pinyin News (6/11/23).

Read the rest of this entry »

Comments (3)

Annals of inventive pinyin: rua

This exercise video shows a woman repeating the syllable "rua" to describe a move that she makes:

Read the rest of this entry »

Comments (7)

"Failure to Launch"?

Along with half a million other people, I logged onto Twitter at the designated hour to hear Elon Musk help Ron DeSantis announce his run for U.S. President. After about half an hour of noises, silences, and puzzling graphics, I gave up — too early to catch the restart on a different account.

This event was generally covered as an embarrassing failure, with Twitter tags like #DeSaster and #FailureToLaunch. A few hours later, I checked again, and was able to find the Twitter Spaces recording of the rebooted event — which I found less entertaining than the initial parade of glitches, alas. But I also found this:

Read the rest of this entry »

Comments (5)

Iowa town names

I'm in Ames, home of Iowa State University.  The next town down the road is Nevada.  What?  Yes, but it's /nəˈveɪdə/ nə-VAY-də, not /nɪˈvædə/ nih-VAD-ə (Spanish: [neˈβaða]), and the locals I've met know the difference.  The same thing holds for Madrid, which is on the other side of Ames; it is /ˈmædrɪd/ MAD-rid, not /məˈdrɪd/ mə-DRID (Spanish: [maˈðɾið]).

From what they told me, Iowans do the same thing with many other exonyms.

Read the rest of this entry »

Comments (42)

Japanese book formats

Two days ago, a Penn freshman from China gifted me with a small format edition of the Guǐgǔzi 鬼谷子 (Master of Ghost Valley), a text that has long intrigued me.

Guiguzi (鬼谷子) is a collection of ancient Chinese texts compiled between the late Warring States period and the end of the Han Dynasty. The work, between 6,000–7,000 Chinese characters, discusses techniques of rhetoric. Although originally associated with the School of Diplomacy, the Guiguzi was later integrated into the Daoist canon.


Not only was I pleased by the content of the book, I was also charmed by its appearance.  Over the long decades of my career as a Sinologist, I have purchased thousands of Chinese books, but I had never seen one quite like this.  It has fine printing on good quality paper with a classy cover.  Its dimensions are small, 6 7/8ths inches (174.625 mm) by 4 1/4 inches (107.95 mm).  Published in 2015 (reissued 2019) (ISBN 978-7-101-10697-8) by the famous Chinese publishing house Zhōnghuá Shūjú 中华书局 (Chung Hwa Book Co.), it is part of a relatively new series called Zhōnghuá jīngdiǎn zhǐzhǎng wénkù 中华经典指掌文库 (Chung Hwa Classics Series for the Palm).  All the several dozen volumes in this series are premodern classics.

Read the rest of this entry »

Comments (5)

Syllable rhythm in English and Mandarin

I've always been skeptical of the distinction between "stress-timed" and "syllable-timed" languages, at least as a claim about the phonetic facts of speech timing as opposed to the psychological dimensions of speech production and perception. Syllable durations in all languages vary widely, due to differences in the intrinsic durations of different vowels and consonants, the effects of phrasal position and emphasis, and many other factors. As a result, inter-stress intervals in languages like English or German are not actually "isochronous", and neither are inter-syllable intervals in languages like French or Spanish. And it's not even true that speakers generally make such intervals closer to isochronous than the relevant timing factors would otherwise predict.

But in "Speech rhythms and brain rhythms", 12/2/2013, I showed a plot of the average syllable-scale power spectrum in the 6300 American-English sentences in the TIMIT dataset, which indicated a key periodicity at 2.4 Hz. I noted that "2.4 Hz corresponds to a period of 417 msec, which is too long for syllables in this material. In fact, the TIMIT dataset as a whole has 80363 syllables in 16918.1 seconds, for an average of 210.5 msec per syllable, so that 417 msec is within 1% of the average duration of two syllables. […] One hypothesis might be that this somehow reflects the organization of English speech rhythm into 'feet' or 'stress groups', typically consisting of a stressed syllable followed by one or more unstressed syllables."
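The arithmetic in that quoted passage is easy to re-check. The sketch below uses only the three figures given above (the 2.4 Hz peak, 80363 syllables, 16918.1 seconds); the function and variable names are illustrative.

```python
def rhythm_check(peak_hz, n_syllables, total_seconds):
    """Compare the period of a spectral peak with twice the mean syllable duration."""
    period_ms = 1000.0 / peak_hz
    mean_syllable_ms = 1000.0 * total_seconds / n_syllables
    two_syllables_ms = 2.0 * mean_syllable_ms
    # Relative difference between the peak's period and two average syllables.
    relative_gap = abs(period_ms - two_syllables_ms) / two_syllables_ms
    return period_ms, mean_syllable_ms, relative_gap


if __name__ == "__main__":
    period_ms, mean_syllable_ms, relative_gap = rhythm_check(2.4, 80363, 16918.1)
    print(f"period of the spectral peak: {period_ms:.1f} ms")
    print(f"mean syllable duration:      {mean_syllable_ms:.1f} ms")
    print(f"gap relative to 2 syllables: {relative_gap:.2%}")
```

The period of the peak and the duration of two average syllables agree to about 1%, which is the basis for the suggestion that the peak reflects two-syllable units such as feet rather than individual syllables.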

I added that "Unfortunately there aren't any datasets comparable to TIMIT in other languages; but I'll see what I can come up with as a more-or-less parallel test in languages that are said to be 'syllable timed' rather than 'stress timed'." Almost ten years later, I've never delivered on that promise, though it would have been easy to do so. So for today's Breakfast Experiment™ I'll show the same analysis for the 6300 sentences in the recently-published Global TIMIT Mandarin Chinese dataset.

Read the rest of this entry »

Comments (23)

Kůlp Månifesto

Recently, a package from Canada arrived at the Penn Linguistics Department — though it was addressed to

Dept. Di Linggwistika
U. Di Pensilvania
Fiiladelfia, Pa
19,104, U.S.Å

It contained multiple copies, on variously-colored paper, of an odd 11-page document.

Read the rest of this entry »

Comments (39)