Archive for Artificial intelligence

Searle's "Chinese room" and the enigma of understanding

In this comment to "'Neutrino Evidence Revisited (AI Debates)' | Is Mozart's K297b authentic?" (11/13/24), I questioned whether John Searle's "Chinese room" argument was intelligently designed and encouraged those who encounter it to reflect on what it did — and did not — demonstrate.

In the same comment, I also queried the meaning of "understand" and its synonyms ("comprehend", and so forth).

Both the "Chinese room" and "understanding" had been raised by skeptics of AI, so here I'm treating them together.

Read the rest of this entry »

Comments (21)

"Neutrino Evidence Revisited (AI Debates)" | Is Mozart's K297b authentic?

[This is a guest post by Conal Boyce]

Recently I watched a video posted by Alexander Unzicker, a no-nonsense physicist who often criticizes Big Science (along the same lines as Sabine Hossenfelder — my hero). But in this case (link below) I was surprised to see Unzicker play back a conversation between himself and ChatGPT on the subject of the original discovery of neutrinos — where the onslaught of background noise demands very strict screening procedures and care not to show "confirmation bias" (because one wants so badly to be the first to actually detect a neutrino, thirty years after Pauli predicted it). It is a LONG conversation, perfectly coherent and informative, one that I found very pleasant to listen to (he uses the audio option: female voice interleaved with his voice).
 
[VHM note: This conversation between Unzicker and GPT is absolutely astonishing.  Despite the dense technicality of the subject, GPT understands well what he is saying and replies accordingly and naturally.]

Read the rest of this entry »

Comments (51)

Nazca lines

For basic facts, see below.

Thanks to AI and our Japanese colleagues, the study of Peru's mysterious Nazca lines has made a quantum leap forward.

AI Revealed a New Trove of Massive Ancient Symbols
The 2,000-year-old geoglyphs offer clues to ancient Nazca people and their rituals
By Aylin Woodward, Science Shorts, WSJ (Nov. 6, 2024)

Anthropologists have spent decades documenting a mysterious collection of symbols etched into the Peruvian desert, depicting everything from human decapitation and domesticated animals to knife-wielding orcas.

In the past century or so, 430 of these geoglyphs have been found. Now, an analysis using artificial intelligence has nearly doubled the number in just six months.

Constructed primarily by ancient South American people known as the Nazca millennia ago, the geoglyphs, which can be as long as a football field, are concentrated on a roughly 150-square-mile area called the Nazca Pampa. The Nazca people created the geoglyphs in an area unsuitable for farming, removing the black stones that pepper the desert to reveal a layer of white sand beneath. The contrast between tones yielded the geoglyphs.

Much of their mystery lies in how challenging it is to spot them.

“These geoglyphs have been around for at least 2,000 years, during which time dust has accumulated on the white lines and areas, causing their colors to fade,” said Masato Sakai, a professor of anthropology at Yamagata University in Japan and lead author of a study published in the journal Proceedings of the National Academy of Sciences detailing the new discoveries.

The symbols fall into two categories. Larger figurative geoglyphs, known as the Nazca Lines, average about 300 feet in length, Sakai said, while smaller ones, akin to marble reliefs, average just 30 feet.

Read the rest of this entry »

Comments (8)

Psychotic Whisper

Whisper is a widely used speech-to-text system from OpenAI — and it turns out that generative AI's hallucination problem afflicts Whisper to a surprisingly serious extent, as documented by Allison Koenecke, Anna Seo Gyeong Choi, Katelyn X. Mei, Hilke Schellmann, and Mona Sloane, "Careless Whisper: Speech-to-Text Hallucination Harms", in The 2024 ACM Conference on Fairness, Accountability, and Transparency, 2024:

Abstract: Speech-to-text services aim to transcribe input audio as accurately as possible. They increasingly play a role in everyday life, for example in personal voice assistants or in customer-company interactions. We evaluate Open AI’s Whisper, a state-of-the-art automated speech recognition service outperforming industry competitors, as of 2023. While many of Whisper’s transcriptions were highly accurate, we find that roughly 1% of audio transcriptions contained entire hallucinated phrases or sentences which did not exist in any form in the underlying audio. We thematically analyze the Whisper-hallucinated content, finding that 38% of hallucinations include explicit harms such as perpetuating violence, making up inaccurate associations, or implying false authority. We then study why hallucinations occur by observing the disparities in hallucination rates between speakers with aphasia (who have a lowered ability to express themselves using speech and voice) and a control group. We find that hallucinations disproportionately occur for individuals who speak with longer shares of non-vocal durations—a common symptom of aphasia. We call on industry practitioners to ameliorate these language-model-based hallucinations in Whisper, and to raise awareness of potential biases amplified by hallucinations in downstream applications of speech-to-text models.
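
For readers who haven't tried it, here is a minimal sketch of what transcription with the open-source whisper package looks like; the audio file name is hypothetical, and the no_speech_prob check at the end is my own crude heuristic for spotting dubious segments, not the method the authors used.

    # A minimal sketch of transcription with the open-source "whisper" package
    # (pip install openai-whisper).  The audio file name is hypothetical, and the
    # no_speech_prob check is my own crude heuristic, not the paper's method.
    import whisper

    model = whisper.load_model("base")           # small general-purpose model
    result = model.transcribe("interview.wav")   # hypothetical audio file

    print(result["text"])                        # the full transcript

    # Flag segments that the model itself rates as probably non-speech but for
    # which it nevertheless emitted text; these are plausible hallucination sites.
    for seg in result["segments"]:
        if seg["no_speech_prob"] > 0.5 and seg["text"].strip():
            print(f'suspect {seg["start"]:.1f}-{seg["end"]:.1f}s: {seg["text"]}')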

Read the rest of this entry »

Comments (12)

AI Hyperauthorship

This paper's content is interesting — Mirzadeh, Iman, Keivan Alizadeh, Hooman Shahrokhi, Oncel Tuzel, Samy Bengio, and Mehrdad Farajtabar. "GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models." arXiv preprint arXiv:2410.05229 (2024). In short, the authors found that small changes in Grade-School Mathematics benchmark questions, like substituting different numerical values or adding irrelevant clauses, caused all the tested LLMs to do worse. You should read the whole thing for the details, to which I'll return another time.
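
To make the perturbations concrete, here is a toy sketch, not the authors' code, of the two kinds of changes described above: re-sampling the numbers in a templated grade-school problem and appending an irrelevant clause. The template and distractor sentence are invented for illustration.

    # A toy sketch (not the authors' code) of the two perturbations described
    # above: re-sampling the numeric values in a templated grade-school problem
    # and appending an irrelevant clause.  Template and distractor are invented.
    import random

    TEMPLATE = ("{name} picks {a} apples on Monday and {b} apples on Tuesday. "
                "How many apples does {name} have in total?")
    IRRELEVANT = " Five of Tuesday's apples are slightly smaller than average."

    def make_variant(add_noise: bool = False) -> tuple[str, int]:
        """Return a perturbed question and its correct answer."""
        a, b = random.randint(2, 20), random.randint(2, 20)
        q = TEMPLATE.format(name=random.choice(["Liam", "Sofia", "Mei"]), a=a, b=b)
        if add_noise:
            q += IRRELEVANT       # distractor that should not change the answer
        return q, a + b

    question, answer = make_variant(add_noise=True)
    print(question)
    print("expected answer:", answer)   # compare an LLM's reply against this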

Read the rest of this entry »

Comments (3)

AI Overview: bits and pieces (German "ich" and Starbuck WA)

Things are happening very fast with this "experimental" internet search tool.  I realize that it is a new technology, so naturally there are bugs and kinks that have to be worked out, and I don't want to be too harsh with it.  Moreover, at a certain level, it is already serving a yeomanly purpose.

For instance, I asked Google, "should ich be capitalized in the middle of a sentence".  AI Overview (henceforth AIO) promptly stepped in and provided the following straightforward response:

No, ich should not be capitalized in the middle of a sentence in German. In German, the first-person singular pronoun ich (I) is only capitalized when it is the first word in a sentence.

In English, the singular “I” is always capitalized when used as a first-person personal pronoun. This includes all contractions of “I”, such as “I'm” and “I'll”.

Other things to capitalize in the middle of a sentence include…. [details omitted here]
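
As a toy illustration of the contrast AIO draws, the sketch below capitalizes standalone English "i" (including contractions such as "i'm" and "i'll") while leaving German "ich" alone; the example sentences are invented, and real text would of course need more care.

    # A toy illustration of the rule quoted above: standalone English "i"
    # (including contractions like "i'm" and "i'll") gets capitalized, while
    # German "ich" is left alone mid-sentence.  Example sentences are invented.
    import re

    def capitalize_english_i(text: str) -> str:
        # \b matches before the apostrophe, so "i'm" and "i'll" are covered;
        # it does not match inside "ich", since "c" is also a word character.
        return re.sub(r"\bi\b", "I", text)

    print(capitalize_english_i("i think i'll write, and i'm sure of it."))
    # -> I think I'll write, and I'm sure of it.
    print(capitalize_english_i("Morgen gehe ich nach Hause."))
    # -> Morgen gehe ich nach Hause.  ("ich" untouched)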

Read the rest of this entry »

Comments (8)

AI Overview: Snake River and Walla Walla

[N.B.:  If you don't have time to read through this long and complicated post, cut to the "Closing note" at the bottom.]

Lately when I do Google searches, especially on obscure and challenging subjects, AI Overview leaps into the fray and takes precedence at the very top, displacing Wikipedia down below, and even Google's own responses, which have been increasingly frequent in recent months, are pushed over to the top right.

AI Overview, at first glance, seems convenient and useful, but when I start to dig deeper, I find that there are problems.  As an example, I will give the case of the name of the Snake River, and maybe mention a few other instances of AI Overview falling short while still being swiftly, though superficially, helpful.

Read the rest of this entry »

Comments (8)

AI triumphs… and also fails.

Google has created an experimental — and free — system called NotebookLM. Here's its current welcome page:


Read the rest of this entry »

Comments (9)

"The cosmic jam from whence it came"

Comments (28)

Political deepfakes

Daysia Tolentino, "Trump shares fake photo of Harris with Diddy in now-deleted Truth Social post", NBC News 9/20/2024:

Amid the recent news of Sean “Diddy” Combs’ arrest, former President Donald Trump reposted a doctored image falsely showing Vice President Kamala Harris with Combs with text questioning if she was involved in his alleged “freak offs.”

The image, which Trump reposted to his Truth Social profile, is an edited version of a 2001 photo of Harris with former talk show host Montel Williams, whom she briefly dated, and his daughter Ashley. The edit replaced Montel Williams’ face with a photo of Combs.

This is not the first time the Republican presidential nominee has posted a fake image in an effort to bolster his campaign. Trump has posted several AI-generated images, including some falsely depicting Taylor Swift and her fans endorsing him, and one of Harris speaking to a crowd of communists in Chicago during the Democratic National Convention.

Read the rest of this entry »

Comments (22)

Can Google AI count?

Apparently not. Given this recent tweet, in which Google AI Overview explains that "October 21 is not a Libra, as the Libra zodiac sign is from September 23 to October 22", I thought I'd try it for myself. The result had a different format but the same problem:
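
For what it's worth, the range check that the AI Overview answer botches is a one-liner; the dates below are the span AIO itself cites, and the year is arbitrary.

    # The range check that the quoted AI Overview answer gets wrong: with Libra
    # given as September 23 through October 22, October 21 falls inside the span.
    from datetime import date

    LIBRA_START, LIBRA_END = date(2024, 9, 23), date(2024, 10, 22)   # year is arbitrary
    print(LIBRA_START <= date(2024, 10, 21) <= LIBRA_END)            # True: October 21 is a Libra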

Read the rest of this entry »

Comments (4)

AI-based DeepL is different

So says DeepL CEO Jarek Kutylowski.

"DeepL translation targets Taiwan as next key Asian market:  CEO says AI-based model is aiming to refine nuances, politeness", Steven Borowiec, Nikkei staff writer (September 16, 2024)

DeepL Write is one thing, DeepL Translator is another.  We've examined both on Language Log and are aware that the former is already deeply entrenched as a tool for composition assistance, but are less familiar with the special features of the latter.

The article by Borowiec, based on his interview with CEO Jarek Kutylowski, begins with some not very enlightening remarks about the difference between simplified characters on the mainland and traditional characters on Taiwan, attesting to the truism that CEOs and CFOs often don't know as much about the nitty-gritty technicalities of the products they sell as do the scientists and specialists they hire to make them.

Read the rest of this entry »

Comments (2)

How to say "AI" in Mandarin

An eminent Chinese historian just sent these two sentences to me:

Yǒurén shuō AI zhǐ néng jìsuàn, ér rénlèi néng suànjì. Yīncǐ AI yīdìng bùshì rénlèi duìshǒu.

有人說AI只能計算,而人類能算計。因此AI一定不是人類對手。

"Some people say that AI can only calculate, while humans can compute.  Therefore, AI must not be a match for humans".

Google Translate, Baidu Fanyi, and Bing Translate all render both jìsuàn 計算 and suànjì 算計 as "calculate".  Only DeepL differentiates the two by translating the latter as "do math".
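
For anyone who wants to repeat the comparison programmatically, here is a minimal sketch for one of the services, using the third-party deep-translator package as a stand-in for the web interfaces consulted above; the package choice is my assumption, not a tool used in this post.

    # A minimal sketch for repeating the comparison with one service, using the
    # third-party deep-translator package (pip install deep-translator); the
    # package is my assumption, not a tool used in the post.
    from deep_translator import GoogleTranslator

    translator = GoogleTranslator(source="auto", target="en")
    for term in ["計算", "算計"]:
        print(term, "->", translator.translate(term))
    # The post reports that Google Translate renders both as "calculate",
    # erasing the jìsuàn / suànjì distinction.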

Read the rest of this entry »

Comments (6)