Archive for Elephant semifics

Call it what?

Gráinne Ní Aodha, "German students say English exam that asked them to explain Brexit was unfair", The Journal (Dublin) 5/4/2018:

German students have complained that an English exam that asked them to discuss Brexit, among other things, was too difficult and “unfair”.

Over 35,000 people have signed an online petition to voice their opposition to the challenging English paper, saying that the reading comprehensions and current affairs topics were unfair.

Christopher Schuetze, "Thousands of German Students Protest ‘Unfair’ English Exam", NYT 5/5/2018:

Complaining that your final school exams are too tough is a rite of passage — almost a tradition.

But German students in the southwestern state of Baden-Württemberg who hunkered down in April to take pivotal final secondary-school exams have gone a step further in their protests about the English-language portion of the test, which they said was absurd, with obscure and outdated references.

More coverage e.g. here.

Comments (21)

Colossal translation fail at the Boao Forum for Asia

China is currently hosting the Boao Forum for Asia in Hainan, the smallest and southernmost province of the PRC.  The BFA bills itself as the "Asian Davos", after the World Economic Forum held annually in Davos, Switzerland.  The BFA draws representatives from many countries, so naturally they have to provide translation services.  Unfortunately, the machine translation system they used this year failed miserably.  Here are screenshots of a couple of examples:

Comments (14)

AI triumph of the week

Posted to Twitter by Ariel Waldman, with the comment "tell me again how AI will take over the world":

Comments (6)

The architecture of speech

Or maybe it should be the sound pattern of architecture? Anyhow, Ariel Goldberg sends this interesting demonstration of the fact that Google Books still sometimes gets jiggy with its category choices:

Comments (13)

AI hallucinations

Tom Simonite, "AI has a hallucination problem that's proving tough to fix", Wired 3/9/2018:

Tech companies are rushing to infuse everything with artificial intelligence, driven by big leaps in the power of machine learning software. But the deep-neural-network software fueling the excitement has a troubling weakness: Making subtle changes to images, text, or audio can fool these systems into perceiving things that aren’t there.

Simonite's article is all about "adversarial attacks", where inputs are adjusted iteratively to hill-climb towards an impressively (or subversively) wrong result. But anyone who's been following the "Elephant semifics" topic on this blog knows that for Google's machine translation, at least, spectacular hallucinations can be triggered by shockingly simple inputs: random strings of vowels, the Vietnamese alphabet, repetitions of single hiragana characters, random Thai keyboard banging, etc.
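
To make the hill-climbing idea concrete, here is a toy sketch in R. It is not the method from Simonite's article or from any real attack: the linear scoring function `score_target`, the weights `w`, and the step size `eps` are all made up for illustration. The loop repeatedly proposes a tiny random change to the input and keeps it only if the (wrong) target score goes up.

```r
# Toy illustration only: a made-up linear "classifier" and a random hill-climb.
set.seed(1)                                # reproducible randomness
w <- rnorm(20)                             # hypothetical model weights
score_target <- function(x) sum(w * x)     # higher = more confident in the wrong label

hill_climb <- function(x, steps = 500, eps = 0.01) {
  best <- score_target(x)
  for (i in seq_len(steps)) {
    candidate <- x + rnorm(length(x), sd = eps)   # propose a subtle random change
    s <- score_target(candidate)
    if (s > best) {                        # keep only changes that raise the wrong score
      x <- candidate
      best <- s
    }
  }
  x
}

x0 <- rep(0, 20)                           # the original, unperturbed input
x_adv <- hill_climb(x0)
cat(sprintf("score before: %.3f, after: %.3f, largest change: %.3f\n",
            score_target(x0), score_target(x_adv), max(abs(x_adv - x0))))
```

Real attacks usually follow gradients rather than random proposals, but the accept-if-better loop is the same basic shape.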

Comments (12)

Alexa laughs

Now that speech technology is good enough that voice interaction with devices is becoming widespread and routine, success has created a new problem: How should a device tell when to attend to ambient sounds and try to interpret them as questions or commands?

One solution is to require a mouse click or a finger press to start things off — but this can degrade the whole "ever-attentive servant" experience. So increasingly such systems rely on a key phrase like "Hey Siri" or "OK Google" or "Alexa". But this solution brings up other problems, since users don't like the idea of their entire life's soundtrack streaming to Apple or Google or Amazon. And anyhow, streaming everything to the Mother Ship might strain battery life and network bandwidth for some devices. The answer: Create simple, low-power device-local programs that do nothing but monitor ambient audio for the relevant magic phrase.

Problem: these programs aren't yet very good. Result: lots of false positives. Mostly the false positives are relatively benign — see e.g. "Annals of helpful surveillance", 5/9/2017. But recently, many people have been creeped out by Alexa laughing at them, apparently for no reason:

Comments (21)

o ai aaa oa ueui

As ktschwarz pointed out in the comments on yesterday's post "Easy going crazy", Google Translate is disposed to recognize text consisting only of vowels and spaces as Hawaiian, and to hallucinate a coherent if sometimes chilling translation into English.

In order to exercise this option more fully, I wrote and tested a simple R script to generate random messages of this type:

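 # Length of the random message, in characters
 N <- 150
 # Alphabet to draw from: the five vowels plus the space character
 Letters <- c("a","e","i","o","u"," ")
 # Sample N characters with replacement, join them into one string, and print it
 cat(sprintf("%s\n",paste0(sample(Letters,N,replace=TRUE),collapse="")))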

So for example:


Comments (21)

Easy going crazy

Today Josh Tenenbaum gave a talk here in the Interdisciplinary Mind and Brain Seminar Series, under the title "On what you can’t learn from (merely) all the data in the world, and what else is needed". One of his themes was that current RNN systems lack common sense, and so in honor of that point, here's another episode in our ongoing Elephant Semifics series. This one is based on repetitions of 0x306C "HIRAGANA LETTER NU", which Google Translate correctly diagnoses as Japanese.
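
The post excerpt does not say how the repeated-character inputs were produced. One possible way, sketched in R, is below. The upper bound `max_reps` is a hypothetical value chosen for illustration, and the output is simply meant to be pasted into a translation box by hand.

```r
# Build test inputs by repeating U+306C (HIRAGANA LETTER NU) 1..max_reps times.
nu <- intToUtf8(0x306C)        # the single character to repeat
max_reps <- 10                 # hypothetical upper bound, chosen for illustration
for (n in seq_len(max_reps)) {
  cat(strrep(nu, n), "\n", sep = "")   # one line per repetition count
}
```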

Comments (18)

Adversarial attacks on modern speech-to-text

Generating adversarial STT examples.

In a recent post on this blog, Mark Liberman raised the lively topic of so-called "adversarial" attacks on modern machine learning systems. These attacks can do amusing and somewhat frightening things, such as forcing an object recognition algorithm to identify all images as toasters with remarkably high confidence. Having seen them applied to image recognition, he hypothesized that they could also be applied to modern speech recognition (STT, or speech-to-text) based on, e.g., deep learning. His hypothesis has indeed recently been confirmed.

Comments (7)

"Notes to the financial statements"

From Jenny Chu:

You might be amused by this latest edition of Google Translate's ability to transform meaningless character sequences into spoken-word poetry, discovered by my young son.

It is all of the Vietnamese characters, in order of their appearance on the character map, with no spaces. Moreover, if you add all of the other non-diacritic characters on the keyboard, you get "The following is a brief description of each of the available options."

Comments (8)

You need to know something

I'm happy to see that Google Translate is still turning (many types of) meaningless character sequences into spoken-word poetry. Repetitions of single hiragana characters are an especially reliable source — here's "You need to know something":


Comments (15)

Elephant semifics

Comments (11)

"balls have zero to me to me to me to me to me to me to me to me to"

Adrienne LaFrance, "What an AI's Non-Human Language Actually Looks Like", The Atlantic 6/20/2017:

Something unexpected happened recently at the Facebook Artificial Intelligence Research lab. Researchers who had been training bots to negotiate with one another realized that the bots, left to their own devices, started communicating in a non-human language.  […]

Comments (7)