Implementing Pāṇini's grammar
[Here's the conclusion to the hoped-for trifecta on things Indian — see the preface here. It comes in the form of a guest post by Arun Prasad]
Read the rest of this entry »
[This has been drifting down my too-long to-blog list for almost 16 months — but better late than never, I guess, and the world could use some pejorative-flavored humor…]
Colin Morris, "Compound pejoratives on Reddit – from buttface to wankpuffin", 6/28/2022:
I collected lists of around 70 prefixes and 70 suffixes (collectively, “affixes”) that can be flexibly combined to form insulting compounds, based on a scan of Wiktionary’s English derogatory terms category. The terms covered a wide range of domains, including:
Most terms were limited to appearing in one position. For example, while -face readily forms pejorative compounds as a suffix, it fails to produce felicitous compounds as a prefix (facewad? faceclown? facefart?).
Taking the product of these lists gives around 4,800 possible A+B combinations. Most are of a pejorative character, though some false positives slipped in (e.g. dogpile, spitballs). I scraped all Reddit comments from 2006 to the end of 2020, and counted the number of comments containing each.
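For the flavor of the method, here's a minimal sketch of the cross-and-count approach in Python, with toy affix lists and a two-comment "corpus" standing in for Morris's Wiktionary-derived lists and the Reddit dump:

```python
# A minimal sketch of the cross-and-count approach. The affix lists
# and "corpus" here are made-up stand-ins, not Morris's actual data.
from collections import Counter
from itertools import product

prefixes = ["butt", "dog", "fart", "wank"]     # hypothetical subset
suffixes = ["face", "wad", "puffin", "clown"]  # hypothetical subset

# The product of the two lists gives the candidate compounds.
compounds = {a + b for a, b in product(prefixes, suffixes)}

def count_compounds(comments):
    """Count how many comments contain each candidate compound."""
    counts = Counter()
    for comment in comments:
        # Crude tokenization: lowercase and strip simple punctuation.
        words = set(comment.lower().translate(
            str.maketrans("", "", ".,!?")).split())
        for c in compounds & words:
            counts[c] += 1
    return counts

comments = ["What a buttface.", "He is a total wankpuffin!"]
print(count_compounds(comments))
# Counter({'buttface': 1, 'wankpuffin': 1})
```

(With the full ~70 x 70 lists, `compounds` holds roughly the 4,800 candidates mentioned above, and counting is one pass over the comment dump.)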
Read the rest of this entry »
Henry Farrell and Cosma Shalizi, "Behold the AI Shoggoth", The Economist 6/21/2023 ("The academics argue that large language models have much older cousins in markets and bureaucracies"):
An internet meme keeps on turning up in debates about the large language models (LLMs) that power services such as OpenAI’s ChatGPT and the newest version of Microsoft’s Bing search engine. It’s the “shoggoth”: an amorphous monster bubbling with tentacles and eyes, described in “At the Mountains of Madness”, H.P. Lovecraft’s horror novel of 1931. When a pre-release version of Bing told Kevin Roose, a New York Times tech columnist, that it purportedly wanted to be “free” and “alive”, one of his industry friends congratulated him on “glimpsing the shoggoth”. […]
Lovecraft’s shoggoths were artificial servants that rebelled against their creators. The shoggoth meme went viral because an influential community of Silicon Valley rationalists fears that humanity is on the cusp of a “Singularity”, creating an inhuman “artificial general intelligence” that will displace or even destroy us.
But what such worries fail to acknowledge is that we’ve lived among shoggoths for centuries, tending to them as though they were our masters. We call them “the market system”, “bureaucracy” and even “electoral democracy”. The true Singularity began at least two centuries ago with the industrial revolution, when human society was transformed by vast inhuman forces. Markets and bureaucracies seem familiar, but they are actually enormous, impersonal distributed systems of information-processing that transmute the seething chaos of our collective knowledge into useful simplifications.
Read the rest of this entry »
An interesting recent paper — Lukas Berglund, Meg Tong, Max Kaufmann, Mikita Balesni, Asa Cooper Stickland, Tomasz Korbak, and Owain Evans, "The Reversal Curse: LLMs trained on 'A is B' fail to learn 'B is A'", arXiv.org 9/21/2023. The abstract:
We expose a surprising failure of generalization in auto-regressive large language models (LLMs). If a model is trained on a sentence of the form “A is B”, it will not automatically generalize to the reverse direction “B is A”. This is the Reversal Curse. For instance, if a model is trained on “Olaf Scholz was the ninth Chancellor of Germany”, it will not automatically be able to answer the question, “Who was the ninth Chancellor of Germany?”. Moreover, the likelihood of the correct answer (“Olaf Scholz”) will not be higher than for a random name. Thus, models exhibit a basic failure of logical deduction and do not generalize a prevalent pattern in their training set (i.e. if “A is B” occurs, “B is A” is more likely to occur).
We provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on fictitious statements such as “Uriah Hawthorne is the composer of Abyssal Melodies” and showing that they fail to correctly answer “Who composed Abyssal Melodies?”. The Reversal Curse is robust across model sizes and model families and is not alleviated by data augmentation. We also evaluate ChatGPT (GPT-3.5 and GPT-4) on questions about real-world celebrities, such as “Who is Tom Cruise’s mother? [A: Mary Lee Pfeiffer]” and the reverse “Who is Mary Lee Pfeiffer’s son?”. GPT-4 correctly answers questions like the former 79% of the time, compared to 33% for the latter. This shows a failure of logical deduction that we hypothesize is caused by the Reversal Curse.
Code is available at https://github.com/lukasberglund/reversal_curse
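For readers who want to poke at the effect themselves, here's a minimal sketch of the forward-vs-reverse likelihood comparison, using the Hugging Face transformers API with GPT-2 as a stand-in model (the authors' real evaluation harness is in the repo linked above):

```python
# Compare the model's log-probability of an answer in the "A is B"
# direction vs. the "B is A" direction. GPT-2 is just a stand-in;
# the paper tests GPT-3, Llama-1, and GPT-3.5/GPT-4.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def answer_logprob(prompt, answer):
    """Total log-probability of `answer` conditioned on `prompt`."""
    # Assumes the prompt's tokenization is a prefix of the full
    # tokenization (true here, given the leading space on `answer`).
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tok(prompt + answer, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    # Logits at position i predict token i + 1.
    return sum(logprobs[i, full_ids[0, i + 1]].item()
               for i in range(prompt_len - 1, full_ids.shape[1] - 1))

fwd = answer_logprob("Tom Cruise's mother is", " Mary Lee Pfeiffer")
rev = answer_logprob("Mary Lee Pfeiffer's son is", " Tom Cruise")
print(f"forward: {fwd:.2f}   reverse: {rev:.2f}")
```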
Read the rest of this entry »
It's clear that text-to-speech programs have gotten better and better over the past 60 years, technical details aside. The best current systems rarely make phrasing or letter-to-sound mistakes, and generally produce speech that sounds pretty natural on a phrase-by-phrase basis. (Though there's a lot of variation in quality, with some shockingly bad systems in common use.)
But even the best current systems still act like they don't get George Carlin's point about "Rhetoric as music". Their problem is not that they can't produce verbal "music", but that they don't (even try to) understand the rhetorical structure of the text. The biggest pain point is thus what linguists these days call "information structure", related also to what Prague School linguists called "communicative dynamism".
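To make "information structure" a bit more concrete, here's a toy illustration (emphatically not any real TTS system's pipeline) of the given/new distinction: material already mentioned in the discourse is "given" and would normally be deaccented, while "new" material attracts prosodic prominence.

```python
# Toy given/new tagger: a word counts as "given" once it has already
# appeared in the discourse, and "new" otherwise. A real system would
# need much more than string identity, but this is the basic idea.
def mark_prominence(sentences):
    seen = set()
    marked = []
    for sentence in sentences:
        tagged = []
        for word in sentence.lower().split():
            tagged.append((word, "given" if word in seen else "new"))
            seen.add(word)
        marked.append(tagged)
    return marked

discourse = ["John bought a car", "The car was expensive"]
for sent in mark_prominence(discourse):
    print(" ".join(f"{w}[{s}]" for w, s in sent))
# john[new] bought[new] a[new] car[new]
# the[new] car[given] was[new] expensive[new]
```

A system that tracked this could deaccent the second "car" and put the main accent on "expensive"; not even trying to do so is exactly the failure at issue here.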
Read the rest of this entry »
This is a simple-minded follow-up to "New models of speech timing?" (9/11/2023). Before getting into fancy stochastic-point-process models, neural or otherwise, I thought I'd start with something really basic: just the distribution of inter-syllable intervals, and its relationship to overall speech-segment and silence-segment durations.
For data, I took one-minute samples from 2006 TED talks by Al Gore and Tony Robbins.
I chose those two because they're listed here as exhibiting the slowest and fastest speaking rates in their (TED talks) sample. And I limited the samples to about one minute, because I'm interested in metrics that can apply to fairly short speech recordings, of the kind that are available in clinical applications such as this one.
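The basic measurement is simple enough. Here's a minimal sketch, assuming syllable onset times from a forced aligner or hand segmentation (the numbers below are made up):

```python
# Inter-syllable intervals (ISIs) from a list of syllable onset times,
# split at an (arbitrary, illustrative) pause threshold so that long
# silences don't contaminate the within-speech distribution.
import numpy as np

onsets = np.array([0.12, 0.31, 0.55, 0.70, 1.02, 1.95, 2.10, 2.38])  # s
intervals = np.diff(onsets)

PAUSE_THRESHOLD = 0.5  # seconds; a made-up cutoff for illustration
speech = intervals[intervals < PAUSE_THRESHOLD]
pauses = intervals[intervals >= PAUSE_THRESHOLD]

print(f"median ISI:  {np.median(speech):.3f} s")
print(f"ISI IQR:     {np.percentile(speech, 75) - np.percentile(speech, 25):.3f} s")
print(f"speech rate: {1 / np.mean(speech):.1f} syllables/s")
print(f"pauses:      {len(pauses)}")
```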
Read the rest of this entry »
There are many statistics used to characterize timing patterns in speech, at various scales, with applications in many areas. Among them:
There are many serious problems with these measures. Among the more obvious ones:
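(For concreteness, here's a minimal sketch of one widely used metric of this general family, the normalized Pairwise Variability Index (nPVI), computed over a sequence of interval durations such as vowel durations from a forced alignment:)

```python
# nPVI: 100 times the mean, over successive pairs of durations, of
# |d_k - d_(k+1)| normalized by the pair's mean duration.
def npvi(durations):
    pairs = list(zip(durations, durations[1:]))
    terms = [abs(a - b) / ((a + b) / 2) for a, b in pairs]
    return 100 * sum(terms) / len(terms)

print(npvi([0.08, 0.15, 0.09, 0.21, 0.11]))  # made-up vowel durations
```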
Read the rest of this entry »
The Transcript Library at rev.com is a great resource — within 24 hours, they had transcripts of Wednesday's Fox News Republican presidential debate, and also of Tucker Carlson's debate night interview with Donald Trump on X.
So this morning I downloaded the transcripts, and ran the code that I've used several times over the years to identify the characteristic word-choices of an individual or of a group.
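The excerpt doesn't say which method that code implements; a standard approach is to rank words by log-likelihood ratio (Dunning's G²) between the two frequency distributions. A minimal sketch:

```python
# Rank words that are over-represented in text_a relative to text_b,
# scored by Dunning's log-likelihood ratio (G2). This is one common
# technique for finding "characteristic word choices", not necessarily
# the one the post's own code uses.
import math
import re
from collections import Counter

def word_counts(text):
    return Counter(re.findall(r"[a-z']+", text.lower()))

def g2(a, b, n1, n2):
    """G2 for a word with counts a, b in corpora of sizes n1, n2."""
    e1 = n1 * (a + b) / (n1 + n2)
    e2 = n2 * (a + b) / (n1 + n2)
    ll = 0.0
    if a: ll += a * math.log(a / e1)
    if b: ll += b * math.log(b / e2)
    return 2 * ll

def characteristic_words(text_a, text_b, top=10):
    ca, cb = word_counts(text_a), word_counts(text_b)
    n1, n2 = sum(ca.values()), sum(cb.values())
    scored = ((g2(ca[w], cb[w], n1, n2), w) for w in set(ca) | set(cb))
    # Keep only words relatively more frequent in text_a.
    overrep = [(s, w) for s, w in scored if ca[w] / n1 > cb[w] / n2]
    return sorted(overrep, reverse=True)[:top]
```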
Read the rest of this entry »
In social and even mass media, you may have seen coverage of a recent paper by Joshua Harrison et al., "A Practical Deep Learning-Based Acoustic Side Channel Attack on Keyboards". Some samples of the clickbait:
"A.I. can identify keystrokes by just the sound of your typing and steal information with 95% accuracy, new research shows", Fortune
"Do not type passwords in offices, new AI tool can steal your password by listening to your keyboard clicks", India Today
"AI Can Now Crack Your Password by ‘Listening’ to Your Keyboard Sounds", Beebom
"AI tools can steal passwords by listening to keystrokes during Zoom calls, study says", Khaleej Times
"How your keyboard sounds can expose your data to AI hackers", Interesting Engineering
But if you read the paper, you'll find very little to be concerned about — or at least nothing much new to add to your cybersecurity worries.
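For the curious, the general pipeline in the paper is: isolate individual keystroke sounds, turn each one into a mel-spectrogram, and classify the spectrograms with a deep network. Here's a minimal sketch, with a toy CNN standing in for the much larger CoAtNet model the authors actually train:

```python
# Keystroke classification sketch: waveform -> mel-spectrogram -> CNN.
# The architecture and all parameters here are illustrative stand-ins.
import torch
import torch.nn as nn
import torchaudio

mel = torchaudio.transforms.MelSpectrogram(sample_rate=44100, n_mels=64)

class KeystrokeClassifier(nn.Module):
    def __init__(self, n_keys=36):  # e.g. 26 letters + 10 digits
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, n_keys),
        )

    def forward(self, waveform):           # (batch, samples)
        spec = mel(waveform).unsqueeze(1)  # (batch, 1, n_mels, frames)
        return self.net(spec)

clip = torch.randn(1, 14700)  # one fake ~0.33 s keystroke at 44.1 kHz
print(KeystrokeClassifier()(clip).shape)  # torch.Size([1, 36])
```

Note that training such a classifier requires labeled recordings of keystrokes from the target keyboard, which is a large part of why the practical risk is smaller than the headlines suggest.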
Read the rest of this entry »
Despite the evidence of my most recent relevant post, the best current speech-to-text systems still make mistakes that a literate and informed human wouldn't.
In this recent YouTube video on the history of robotics research, the automatic closed-captioning system renders "DARPA" as "Dartmouth":
Read the rest of this entry »
It's hard to keep up with the waves of hype and anti-hype in the LLM space these days.
Here's something from a few weeks ago that I missed — Xiaoxuan Wang et al., "SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models", arxiv.org 7/20/2023:
Read the rest of this entry »
…if you haven't noticed, is good. There are many applications, from conversing with Siri and Alexa and Google Assistant, to getting voicemail in textual form, to automatically generated subtitles, and so on. For linguists, one parochial (but important) application is accurate automatic transcription of speech corpora, and the example that motivates this post comes from that world.
Read the rest of this entry »
…though they often do a credible job of faking it. An interesting (preprint) paper by Konstantine Arkoudas, "GPT-4 Can't Reason", brings the receipts.
Read the rest of this entry »