Archive for Elephant semifics

Kabbalist NLP

Oscar Schwartz, "Natural Language Processing Dates Back to Kabbalist Mystics", IEEE Spectrum 10/28/2019 ("Long before NLP became a hot field in AI, people devised rules and machines to manipulate language"):

The story begins in medieval Spain. In the late 1200s, a Jewish mystic by the name of Abraham Abulafia sat down at a table in his small house in Barcelona, picked up a quill, dipped it in ink, and began combining the letters of the Hebrew alphabet in strange and seemingly random ways. Aleph with Bet, Bet with Gimmel, Gimmel with Aleph and Bet, and so on.

Abulafia called this practice "the science of the combination of letters." He wasn't actually combining letters at random; instead he was carefully following a secret set of rules that he had devised while studying an ancient Kabbalistic text called the Sefer Yetsirah. This book describes how God created "all that is formed and all that is spoken" by combining Hebrew letters according to sacred formulas. In one section, God exhausts all possible two-letter combinations of the 22 Hebrew letters.

By studying the Sefer Yetsirah, Abulafia gained the insight that linguistic symbols can be manipulated with formal rules in order to create new, interesting, insightful sentences. To this end, he spent months generating thousands of combinations of the 22 letters of the Hebrew alphabet and eventually emerged with a series of books that he claimed were endowed with prophetic wisdom.
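The enumeration Schwartz describes is small enough to check directly. A minimal sketch in Python (the traditional Sefer Yetsirah enumeration, often called the "231 gates", counts a pair and its reversal as one combination; ordered pairs double that):

```python
from itertools import combinations, permutations

# The 22 letters of the Hebrew alphabet
alphabet = list("אבגדהוזחטיכלמנסעפצקרשת")
assert len(alphabet) == 22

# Unordered two-letter pairings: the traditional "231 gates"
unordered = list(combinations(alphabet, 2))

# Ordered pairs of distinct letters (Aleph-Bet and Bet-Aleph counted separately)
ordered = list(permutations(alphabet, 2))

print(len(unordered))  # 231
print(len(ordered))    # 462
```

Either way, the space Abulafia was working through by hand is modest for two letters but explodes combinatorially for longer strings.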

Comments (6)

TO THE CONTRARYGE OF THE AND THENESS

Yiming Wang et al., "Espresso: A fast end-to-end neural speech recognition toolkit", ASRU 2019:

We present ESPRESSO, an open-source, modular, extensible end-to-end neural automatic speech recognition (ASR) toolkit based on the deep learning library PyTorch and the popular neural machine translation toolkit FAIRSEQ. ESPRESSO supports distributed training across GPUs and computing nodes, and features various decoding approaches commonly employed in ASR, including look-ahead word-based language model fusion, for which a fast, parallelized decoder is implemented. ESPRESSO achieves state-of-the-art ASR performance on the WSJ, LibriSpeech, and Switchboard data sets among other end-to-end systems without data augmentation, and is 4–11× faster for decoding than similar systems (e.g. ESPNET).
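The "language model fusion" mentioned in the abstract is, in its simplest ("shallow") form, just a weighted combination of the ASR model's score with an external language model's score during decoding. A toy sketch with made-up numbers (ESPRESSO's look-ahead decoder is considerably more involved than this):

```python
# Toy per-hypothesis scores: log P_asr(y | audio) from the end-to-end model,
# and log P_lm(y) from an external language model. All values are invented.
log_p_asr = {"the cat sat": -2.1, "the cat sad": -2.0, "a cat sat": -3.0}
log_p_lm  = {"the cat sat": -1.0, "the cat sad": -4.5, "a cat sat": -1.8}

LM_WEIGHT = 0.5  # interpolation weight, normally tuned on held-out data

def fused_score(hyp):
    # Shallow fusion: add the weighted LM log-probability to the ASR score
    return log_p_asr[hyp] + LM_WEIGHT * log_p_lm[hyp]

best = max(log_p_asr, key=fused_score)
print(best)  # "the cat sat" -- the LM overrules the ASR model's near-tie
```

Note that on ASR score alone the system would pick the implausible "the cat sad"; the external language model is what pulls the decision toward fluent text.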

Read the rest of this entry »

Comments (13)

The Iron Law of AI

Today's SMBC:

Mouseover title: "The other day I was really freaked out that a computer could generate faces of people who DON'T REALLY EXIST, only to later realize painters have been doing this for several millennia."

Read the rest of this entry »

Comments (8)

"Unparalleled accuracy" == "Freud as a scrub woman"

A couple of years ago, in connection with the JSALT2017 summer workshop, I tried several commercial speech-to-text APIs on some clinical recordings, with very poor results. Recently I thought I'd try again, to see how things have progressed. After all, there have been recent claims of "human parity" in various speech-to-text applications, and (for example) Google's Cloud Speech-to-Text tells us that it will "Apply the most advanced deep-learning neural network algorithms to audio for speech recognition with unparalleled accuracy", and that "Cloud Speech-to-Text accuracy improves over time as Google improves the internal speech recognition technology used by Google products."

So I picked one of the better-quality recordings of neuropsychological test sessions that we analyzed during that 2017 workshop, and tried a few segments. Executive summary: general human parity in automatic speech-to-text is still a ways off, at least for inputs like these.
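"Human parity" claims of this kind are typically framed in terms of word error rate (WER): the word-level edit distance between the system transcript and a human reference, divided by the reference length. For readers who want to score such comparisons themselves, a minimal self-contained implementation:

```python
def wer(ref, hyp):
    """Word error rate: word-level Levenshtein distance / reference length."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = edit distance between the first i ref words and first j hyp words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)

print(wer("the quick brown fox", "the quick brown fox"))    # 0.0
print(wer("the quick brown fox", "a quick brow fox jumps")) # 0.75
```

(Two substitutions plus one insertion against a four-word reference gives 3/4 = 0.75.) Published "parity" results are WER comparisons on benchmark test sets; clinical recordings like these are a very different animal.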

Read the rest of this entry »

Comments (8)

Deep learning stumbles again

At least I think that's what happened here. Gita Jackson, "Tumblr's New Algorithm Thinks Garfield Is Explicit Content", Kotaku 12/4/2018:

Yesterday, Tumblr announced that it will ban all adult content starting December 17th. As users logged into their accounts, they saw that some of their posts now have a red banner across them, marking them as flagged for explicit content. The problem is, a lot of these posts are hilariously far from being pornographic.

It's pretty clear that these flags are being done based on an algorithm, and the algorithm is finding false positives. Here's a list of things that got flagged: a fully clothed woman, a drawing of a dragon, fan-art of characters from the anime Haikyu!!, art from the children's book The Princess Who Saved Herself that the author of said book posted, a drawing of a bowl of fruit with mouths, a video of abstract blurs, Garfield.

Read the rest of this entry »

Comments (41)

Today's Google Translate poetry

Just checking to see that Google Translate is still into hallucinatory automatic writing.

Today's input is five random hiragana characters — あっぉぉを — repeated various numbers of times:

1X: Oh yeah
2X: I am afraid that
3X: We have an Omote
4X: We will hold an Om to Oh no
5X: We will send out a certain number of employees
6X: We will send out a certain number of employees to a certain number of employees
7X: We will hold a certain number of employees and one million yen
8X: We do not want to be an omen
9X: We will transfer a certain amount of money to a certain number of employees
13X: We did not wish to be a member of the company. Ah

Comments (9)

More Google Translate hallucinations on YouTube

1,237,159 views so far:

[Warning: Loud background music.]

Read the rest of this entry »

Comments (4)

Call it what?

Gráinne Ní Aodha, "German students say English exam that asked them to explain Brexit was unfair", The Journal (Dublin) 5/4/2018:

German students have complained that an English exam that asked them to discuss Brexit, among other things, was too difficult and "unfair".

Over 35,000 people have signed an online petition to voice their opposition to the challenging English paper, saying that the reading comprehensions and current affairs topics were unfair.

Christopher Schuetze, "Thousands of German Students Protest 'Unfair' English Exam", NYT 5/5/2018:

Complaining that your final school exams are too tough is a rite of passage — almost a tradition.

But German students in the southwestern state of Baden-Württemberg who hunkered down in April to take pivotal final secondary-school exams have gone a step further in their protests about the English-language portion of the test, which they said was absurd, with obscure and outdated references.

More coverage e.g. here.

Read the rest of this entry »

Comments (21)

Colossal translation fail at the Boao Forum for Asia

China is currently hosting the Boao Forum for Asia in Hainan, the smallest and southernmost province of the PRC.  The BFA bills itself as the "Asian Davos", after the World Economic Forum held annually in Davos, Switzerland.  The BFA draws representatives from many countries, so naturally they have to provide translation services.  Unfortunately, the machine translation system they used this year failed miserably.  Here are screenshots of a couple of examples:

Read the rest of this entry »

Comments (14)

AI triumph of the week

Posted to twitter by Ariel Waldman, with the comment "tell me again how AI will take over the world":

Read the rest of this entry »

Comments (6)

The architecture of speech

Or maybe it should be the sound pattern of architecture? Anyhow, Ariel Goldberg sends this interesting demonstration of the fact that Google Books still sometimes gets jiggy with its category choices:

Read the rest of this entry »

Comments (13)

AI hallucinations

Tom Simonite, "AI has a hallucination problem that's proving tough to fix", Wired 3/9/2018:

Tech companies are rushing to infuse everything with artificial intelligence, driven by big leaps in the power of machine learning software. But the deep-neural-network software fueling the excitement has a troubling weakness: Making subtle changes to images, text, or audio can fool these systems into perceiving things that aren't there.

Simonite's article is all about "adversarial attacks", where inputs are adjusted iteratively to hill-climb towards an impressively (or subversively) wrong result. But anyone who's been following the "Elephant semifics" topic on this blog knows that for Google's machine translation, at least, spectacular hallucinations can be triggered by shockingly simple inputs: random strings of vowels, the Vietnamese alphabet, repetitions of single hiragana characters, random Thai keyboard banging, etc.
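For readers unfamiliar with the mechanics of such attacks, here is a toy illustration of the hill-climbing idea: a greedy coordinate search that nudges an input, within a small perturbation budget, until an invented linear classifier flips its decision. Real adversarial attacks operate on deep networks and usually follow gradients, but the iterative search logic is the same:

```python
import math

# Invented "classifier": a fixed linear model with a sigmoid output.
# score > 0.5 means one class, score < 0.5 the other.
w = [0.9, -0.4, 0.7, 0.2]
b = 0.1

def score(x):
    return 1 / (1 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))

x = [1.0, 0.5, 1.0, 0.0]  # original input, confidently classified
eps = 0.8                  # per-coordinate perturbation budget

# Greedy hill-climb: repeatedly nudge each coordinate in whichever
# direction lowers the score, staying inside an eps-box around x.
adv = list(x)
for _ in range(50):
    for i in range(len(adv)):
        for delta in (-0.05, 0.05):
            cand = list(adv)
            cand[i] = max(x[i] - eps, min(x[i] + eps, cand[i] + delta))
            if score(cand) < score(adv):
                adv = cand

print(round(score(x), 3), round(score(adv), 3))  # the decision flips
```

The point of the contrast: adversarial attacks need this kind of deliberate search, while Google Translate's hallucinations apparently need nothing more than a bored user holding down a key.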

Read the rest of this entry »

Comments (12)

Alexa laughs

Now that speech technology is good enough that voice interaction with devices is becoming widespread and routine, success has created a new problem: How should a device tell when to attend to ambient sounds and try to interpret them as questions or commands?

One solution is to require a mouse click or a finger press to start things off — but this can degrade the whole "ever-attentive servant" experience. So increasingly such systems rely on a key phrase like "Hey Siri" or "OK Google" or "Alexa". But this solution brings up other problems, since users don't like the idea of their entire life's soundtrack streaming to Apple or Google or Amazon. And anyhow, streaming everything to the Mother Ship might strain battery life and network bandwidth for some devices. The answer: Create simple, low-power device-local programs that do nothing but monitor ambient audio for the relevant magic phrase.
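As a caricature of that device-local monitoring idea (real wake-word detectors are small neural networks over acoustic features; the template, frames, and threshold here are all invented for illustration):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

# Invented acoustic "template" for the wake phrase; a real detector
# would use a trained classifier, not a single reference vector.
TEMPLATE = [0.9, 0.1, 0.8, 0.2]
THRESHOLD = 0.95  # lowering this trades missed wake-ups for false positives

def detections(frames):
    """Indices of incoming feature frames that appear to match the wake phrase."""
    return [i for i, f in enumerate(frames) if cosine(f, TEMPLATE) >= THRESHOLD]

stream = [
    [0.1, 0.9, 0.1, 0.9],     # background noise
    [0.85, 0.15, 0.75, 0.2],  # someone says the wake phrase
    [0.5, 0.5, 0.5, 0.5],     # ambiguous audio
]
print(detections(stream))  # only the second frame fires
```

Loosening THRESHOLD to 0.8 makes the ambiguous third frame fire as well; tuning that trade-off on a low-power device is exactly where the false positives come from.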

Problem: these programs aren't yet very good. Result: lots of false positives. Mostly the false positives are relatively benign — see e.g. "Annals of helpful surveillance", 5/9/2017. But recently, many people have been creeped out by Alexa laughing at them, apparently for no reason:

Read the rest of this entry »

Comments (21)