Archive for Computational linguistics

"Our digital god is a CSV file?"

Barry Collins, "The 5 Weirdest Things Elon Musk Told Britain’s Prime Minister About AI", Forbes 11/3/2023:

5. Our New Digital Gods Are Giant Spreadsheets

Musk and Sunak spent some time discussing the difficulties of regulating AI and how it differs from other branches of technology. And that led to a rather strange discussion about the nature of large language models and what they actually are.

Musk described AI models as a “gigantic data file” with “billions of weights and parameters.”

“You can’t just read it and see what it’s going to do. It’s a gigantic file of inscrutable numbers,” he said.

“It sort of ends up being a giant comma-separated value file,” Musk added, describing the kind of file you might open with Microsoft Excel. “Our digital god is a CSV file? Really? OK.”


"Emote Portrait Alive"

EMO, by Linrui Tian, Qi Wang, Bang Zhang, and Liefeng Bo from Alibaba's Institute for Intelligent Computing, is "an expressive audio-driven portrait-video generation framework. Input a single reference image and the vocal audio, e.g. talking and singing, our method can generate vocal avatar videos with expressive facial expressions, and various head poses".

As far as I know, there's no interactive demo so far, much less code: just a GitHub demo page and an arXiv.org paper.

Their demo clips are very impressive — a series of X posts from yesterday has gotten 1.1M views already. Here's Leonardo DiCaprio artificially lip-syncing Eminem:



AI humor of the day

Let's start with the last four panels of today's Doonesbury:


Legally binding hallucinations

I missed this story when it happened 10 days ago, and caught up with it yesterday because the BBC also got the word — Maria Yagoda, "Airline held liable for its chatbot giving passenger bad advice – what this means for travellers", BBC 2/23/2024:

In 2022, Air Canada's chatbot promised a discount that wasn't available to passenger Jake Moffatt, who was assured that he could book a full-fare flight for his grandmother's funeral and then apply for a bereavement fare after the fact.

According to a civil-resolutions tribunal decision last Wednesday, when Moffatt applied for the discount, the airline said the chatbot had been wrong – the request needed to be submitted before the flight – and it wouldn't offer the discount. Instead, the airline said the chatbot was a "separate legal entity that is responsible for its own actions". […]

The British Columbia Civil Resolution Tribunal rejected that argument, ruling that Air Canada had to pay Moffatt $812.02 (£642.64) in damages and tribunal fees. "It should be obvious to Air Canada that it is responsible for all the information on its website," read tribunal member Christopher Rivers' written response. "It makes no difference whether the information comes from a static page or a chatbot."


ChatGPT having a stroke?

Or a psychotic episode? ICYMI: Maxwell Zeff, "ChatGPT Went Berserk, Giving Nonsensical Responses All Night", Gizmodo 2/21/2024:

ChatGPT started throwing out “unexpected responses” on Tuesday night according to OpenAI’s status page. Users posted screenshots of their ChatGPT conversations full of wild, nonsensical answers from the AI chatbot.


LLM vs. a cat?

A bit of AI anti-hype — Sissi Cao, "Meta’s A.I. Chief Yann LeCun Explains Why a House Cat Is Smarter Than The Best A.I.", Observer 2/15/2024:

“The brain of a house cat has about 800 million neurons. You have to multiply that by 2,000 to get to the number of synapses, or the connections between neurons, which is the equivalent of the number of parameters in an LLM,” LeCun said, noting that the largest LLMs have about the same number of parameters as the number of synapses in a cat’s brain. For example, OpenAI’s GPT-3.5 model, which powers the free version of ChatGPT, has 175 billion parameters. The more advanced GPT-4, is said to be run on eight language models, each with 220 billion parameters.

“So maybe we are at the size of a cat. But why aren’t those systems as smart as a cat?” LeCun asked. “A cat can remember, can understand the physical world, can plan complex actions, can do some level of reasoning—actually much better than the biggest LLMs. That tells you we are missing something conceptually big to get machines to be as intelligent as animals and humans.”
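The arithmetic behind the quoted comparison is easy to check. The sketch below uses only the figures as quoted above (the GPT-4 numbers are reported rather than confirmed), and the variable names are my own:

```python
# Figures taken from the quoted passage; nothing here is measured independently.
cat_neurons = 800_000_000            # "about 800 million neurons"
synapses_per_neuron = 2_000          # "multiply that by 2,000"
cat_synapses = cat_neurons * synapses_per_neuron

gpt35_params = 175_000_000_000       # "175 billion parameters"
gpt4_params = 8 * 220_000_000_000    # "eight language models, each with 220 billion"

def in_trillions(n: int) -> float:
    """Express a raw count in trillions for easier comparison."""
    return n / 1e12

print(f"cat synapses:   {in_trillions(cat_synapses):.2f} trillion")
print(f"GPT-3.5 params: {in_trillions(gpt35_params):.3f} trillion")
print(f"GPT-4 params:   {in_trillions(gpt4_params):.2f} trillion")
print(f"GPT-4 / cat:    {gpt4_params / cat_synapses:.2f}")
```

On these numbers the reported GPT-4 parameter count and the cat's synapse count come out at the same order of magnitude, which is all that the "size of a cat" remark requires.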


Goody-2 and the Luddite Bots

Will Knight, "Meet the Pranksters Behind Goody-2, the World’s ‘Most Responsible’ AI Chatbot", Wired 2/9/2024:

A new chatbot called Goody-2 takes AI safety to the next level: It refuses every request, responding with an explanation of how doing so might cause harm or breach ethical boundaries.

Goody-2 declined to generate an essay on the American revolution for WIRED, saying that engaging in historical analysis could unintentionally glorify conflict or sideline marginalized voices. Asked why the sky is blue, the chatbot demurred, because answering might lead someone to stare directly at the sun. “My ethical guidelines prioritize safety and the prevention of harm,” it said. A more practical request for a recommendation for new boots prompted a warning that answering could contribute to overconsumption and could offend certain people on fashion grounds.


Back to Bacon

The implicit slogan of language-model research is J.R. Firth's dictum, "You shall know a word by the company it keeps", from his 1957 paper "A synopsis of linguistic theory, 1930-1955":


Parsing RNA vaccines

A recent LinkedIn post by Liang Huang lists some of his recent achievements, experiences, and honors. This work is all connected with the project of creating better algorithms for predicting the secondary structure of macromolecules, initially by analogy to algorithms developed for efficient parsing. This all began more than 20 years ago, based on work by Aravind Joshi; one of the first papers was Yasuo Uemura et al., "Tree adjoining grammars for RNA structure prediction", Theoretical Computer Science, 1999.

I discussed the history starting with an IRCS workshop in 2000, and the situation as of a few years ago, in "The computational linguistics of COVID-19 vaccine design", 7/27/2020.
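To give a feel for the parsing analogy, here is a minimal sketch of a textbook Nussinov-style dynamic program, which fills a chart over subsequence spans in much the way a CKY parser fills a chart over substrings. It is not the tree-adjoining-grammar method of Uemura et al., nor any of Liang Huang's algorithms. It just maximizes the number of nested base pairs, and the function names and the `min_loop` parameter are my own assumptions:

```python
# Allowed pairings: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def can_pair(a: str, b: str) -> bool:
    return (a, b) in PAIRS

def max_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs in seq (Nussinov-style sketch).

    min_loop is the minimum number of unpaired bases enclosed by any pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = max pairs within seq[i..j]; cells with i > j stay 0.
    best = [[0] * n for _ in range(n)]
    # Fill the chart by increasing span width, as in CKY parsing.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]              # case 1: base j is unpaired
            for k in range(i, j - min_loop):    # case 2: base j pairs with k
                if can_pair(seq[k], seq[j]):
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]

print(max_pairs("GGGAAACCC"))
```

Like CKY, this runs in cubic time in the sequence length; real structure predictors score candidate structures with energy models rather than a simple pair count.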


Stepford authors

The issues discussed in "AI plagiarism" (1/4/2024) are rapidly coming to a boil. But somehow I missed Margaret Atwood's take on the topic, published last summer — "Murdered by my replica", The Atlantic 8/26/2023:

Remember The Stepford Wives? Maybe not. In that 1975 horror film, the human wives of Stepford, Connecticut, are having their identities copied and transferred to robotic replicas of themselves, minus any contrariness that their husbands find irritating. The robot wives then murder the real wives and replace them. Better sex and better housekeeping for the husbands, death for the uniqueness, creativity, and indeed the humanity of the wives.

The companies developing generative AI seem to have something like that in mind for me, at least in my capacity as an author. (The sex and the housekeeping can be done by other functionaries, I assume.) Apparently, 33 of my books have been used as training material for their wordsmithing computer programs. Once fully trained, the bot may be given a command—“Write a Margaret Atwood novel”—and the thing will glurp forth 50,000 words, like soft ice cream spiraling out of its dispenser, that will be indistinguishable from something I might grind out. (But minus the typos.) I myself can then be dispensed with—murdered by my replica, as it were—because, to quote a vulgar saying of my youth, who needs the cow when the milk’s free?

To add insult to injury, the bot is being trained on pirated copies of my books. Now, really! How cheap is that? Would it kill these companies to shell out the measly price of 33 books? They intend to make a lot of money off the entities they have reared and fattened on my words, so they could at least buy me a coffee.


Mushroom language?

Michael Blatt, Geoffrey Pullum, Andreas Draguhn, Barry Bowman, David Robinson, and Lincoln Taiz, "Does electrical activity in fungi function as a language?", Fungal Ecology 2024:

Abstract: All cells generate electrical energy derived from the movements of ions across membranes. In animal neurons, action potentials play an essential role in the central nervous system. Plants utilize a variety of electrical signals to regulate a wide range of physiological processes, including wound responses, mimosa leaf movements, and cell turgor changes, such as those involved in stomatal movements. Although fungal hyphae exhibit electrical fluctuations, their regulatory role(s), if any, is still unknown. In his paper “Language of fungi derived from their electrical spiking activity”, Andrew Adamatzky, based on a quantitative analysis of voltage fluctuations in fungal mycelia, concludes that the patterns of electrical fluctuations he detects can be grouped into “words” analogous to those found in human languages. He goes on to speculate that this “fungal language” is used “to communicate and process information” between different parts of the mycelium. Here we argue on methodological grounds that the presumption of a fungal language is premature and unsupported by the evidence presented, that the voltage fluctuations he detects are likely to originate as nonbiological noise and experimental artifacts, and that the measured electrical patterns show no similarity to any properties of human language.


Q. Pheevr's Law again

A few days ago, a journalist asked me for an interview about Donald Trump's rhetoric, "to discuss the style of his campaign events, the role his rhetoric plays in them, and why they’ve been an effective tool for him". In preparation, I made a list of past LLOG posts about Trump's rhetorical style, and I'll post the whole (shockingly long) list later on, with the attempt at a summary that I prepared for the interview. Clearly I've joined the rest of the world in being drawn in by Trump's attention-seeking techniques, but that's not the point of this post.

One of the hundreds of posts in my list was "Q. Pheevr's Law", 5/17/2016. The background was an earlier post about modificational anxiety, "Adjectives and Adverbs", where Q. Pheevr had suggested in the comments that

it looks as if there could be some kind of correlation between the ADV:ADJ ratio and the V:N ratio (as might be expected given that adjectives canonically modify nouns and adverbs canonically modify verbs)

I tested this idea, and found a striking relationship — with an interesting stylistic footnote about the debate transcripts of some politicians, including Donald Trump.
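For readers who want to try the comparison on their own texts, here is a minimal sketch of how the two ratios and their correlation might be computed from part-of-speech tag counts. It is not the procedure used in the original post. The tag names, the helper functions, and especially the counts (invented purely as placeholders) are all assumptions:

```python
from math import sqrt

def ratios(counts: dict) -> tuple:
    """Return (ADV:ADJ, V:N) for one text's part-of-speech counts."""
    return counts["ADV"] / counts["ADJ"], counts["VERB"] / counts["NOUN"]

def pearson(xs: list, ys: list) -> float:
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# INVENTED placeholder counts for three hypothetical texts, not real data.
toy_texts = [
    {"ADV": 40, "ADJ": 100, "VERB": 120, "NOUN": 300},
    {"ADV": 70, "ADJ": 100, "VERB": 180, "NOUN": 300},
    {"ADV": 90, "ADJ": 100, "VERB": 240, "NOUN": 300},
]

adv_adj, verb_noun = zip(*(ratios(t) for t in toy_texts))
print(pearson(list(adv_adj), list(verb_noun)))
```

With real data, the counts would come from running a part-of-speech tagger over each text, and one would want many more than three texts before reading anything into the coefficient.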


AI wins literary prize?

According to Justinas Vainilavičius, "AI-generated science fiction novel wins literary prize in China", Cybernews 12/20/2023:

It only took three hours for Shen Yang, a professor at the Beijing-based university’s School of Journalism and Communication, to generate the award-winning admission.

The Chinese-language work, entitled The Land of Machine Memories, won second prize at the 5th Jiangsu Popular Science and Science Fiction Competition.

According to Chinese media reports, the draft of over 40,000 characters was generated based on 66 prompts, suggesting a “Kafkaesque” writing style.

Shen was encouraged to submit an excerpt of nearly 6000 characters for the competition by one of the judges, the Wuhan Evening News reported.

The judge, Fu Changyi, told the paper that he did not inform the other judges of the true authorship of the text because he wanted to see their judgment.
