Language Log

Adversarial attacks on modern speech-to-text

January 30, 2018 @ 8:56 am · Filed by Max Little under Computational linguistics, Elephant semifics

Generating adversarial STT examples.

In a recent post on this blog, Mark Liberman raised the lively topic of so-called "adversarial" attacks on modern machine learning systems. These attacks can do amusing and somewhat frightening things, such as forcing an object recognition algorithm to identify all images as toasters with remarkably high confidence. Seeing these applied to image recognition, he hypothesized that they could also be applied to modern speech recognition (STT, or speech-to-text) systems based on, e.g., deep learning. His hypothesis has indeed recently been confirmed.


"Voiceprint" springs eternal

January 29, 2018 @ 1:11 pm · Filed by Mark Liberman under Language and the law, Language and the media, Words words words

John R. Quain, "Alexa, What Happened to My Car?", NYT 1/25/2018 [emphasis added]:

And even though voice bots like Alexa and Google’s Assistant can be taught to recognize different voices — well enough to cater to each family member’s favored Pandora stations, for example — they do not offer any sort of biometric security, such as voice print analysis. As a result, Alexa’s voice-recognition capabilities are not discerning enough for security purposes, according to Amazon.

There are two things about this passage that caught my attention.

First, a minor point: the NYT here chooses to write "voice print" as two separate words. This is a change from their previous practice — already in May of 1962 (and many times since), the grey lady was writing "voiceprint" solid in stories like this one:

A researcher from Bell Telephone Laboratories described yesterday tests that he said, showed that "voiceprints" may prove to be almost as effective, for identification, as fingerprints.

And second, a more important point: here's a journalist who still thinks that "voice print analysis", however spelled, offers "biometric security".

[Warning: what follows is a long post about lexicographic, technological, journalistic, and literary history, guaranteeing that at least three quarters of the content will bore or mystify most readers.]


A productive-ass suffix

January 29, 2018 @ 11:05 am · Filed by Ben Zimmer under Humor, Morphology

Currently making the rounds is a video from Conan showing a standup appearance by the Finnish comedian Ismo Leikola. In his experience of learning English as a second language, he says, "I think the hardest word to truly master has been the word ass." He muses on the peculiar application of -ass as a slangy suffix in words like lazy-ass, long-ass, grown-ass, bad-ass, and dumb-ass.

Stan Carey discussed the video on the Strong Language blog ("A paradoxical-ass word"), and he links to Mark Liberman's 2014 roundup of scholarship on -ass (on Language Log and elsewhere), "Ignoble-ass citation practices."


Accentuate the negative

January 28, 2018 @ 5:41 pm · Filed by Geoffrey K. Pullum under Language teaching and learning, negation, Syntax

A curious case of a forced-choice sentence-completion question on a ninth-grade exam at a high school in Taiwan is briefly discussed on Lingua Franca today, for a very general non-linguist readership. It merits a slightly longer and more serious treatment, which I thought Language Log readers might appreciate. The exam question basically asks for a decision on the question of which one of these sentences is fully correct and which deserves to be called ungrammatical:

(a) Lydia knows few things, and so does Peter.
(b) Lydia knows few things, and neither does Peter.

Because continuation with neither does… is widely taken to be a test for negative polarity, this amounts to asking whether Lydia knows few things is a positive clause like Lydia knows everything or a negative one like Lydia doesn't know anything. And a friend of mine in Taiwan reports having asked a number of English speakers, with a truly surprising result. He finds a split between the two great English dialect groups, the North American dialects (AmE) and the British and Australasian dialects (BrE). The AmE speakers that he asked all said (a) was correct, while the BrE speakers all said that (b) was correct.


Forcing Mandarin on Hong Kong

January 27, 2018 @ 3:08 pm · Filed by Victor Mair under Language and education, Language and politics, Topolects

According to the Sino-British Joint Declaration signed by the Prime Ministers of the People's Republic of China (PRC) and the United Kingdom (UK) on December 19, 1984, the way of life in Hong Kong would remain unchanged for a period of 50 years from the time of its handover to the PRC in 1997. This would have left Hong Kong unchanged until 2047. I never for a moment thought that China would adhere to this agreement, and we see in countless ways how basic rights, laws, and socio-political institutions have been changing radically since the handover in 1997, only twenty years ago. One of the most noticeable aspects of these changes has to do with language.

Cantonese is rapidly being pushed aside in favor of Mandarin, and this is not what the people of Hong Kong would have wanted to happen. The threat to Cantonese is manifested in many ways, such as more and more schools being required to provide classroom instruction in Mandarin instead of Cantonese.


Global drop in GNP?

January 26, 2018 @ 7:57 pm · Filed by Mark Liberman under Peeving

Is it my imagination, or has there been a drop in GNP (Gross National Peeving) across the Anglophone world? I'm not seeing nearly the volume of "Angry linguistic mobs with torches" that I (think I) did a decade ago.

So the recently viral story about this sign on the door of the Continental bar makes me kind of nostalgic:

East Village bar the Continental expounds on their (tongue-in-cheek) ban on the word literally. Their stated goal now is to stop “Kardashianism.” cc: @edenbrower pic.twitter.com/iI0N41qCgt

— evgrieve (@evgrieve) January 24, 2018


Biscriptal juxtaposition in Chinese, part 4

January 26, 2018 @ 3:36 pm · Filed by Victor Mair under Diglossia and digraphia, Language on the internets, Writing systems

Screenshot from Nikita Kuzmin's WeChat:


What it is is what it is

January 26, 2018 @ 2:42 pm · Filed by Mark Liberman under Semantics

Jay Livingston sends a compendium of tautologies from The Wire:


Learning not to avoid

January 26, 2018 @ 8:12 am · Filed by Mark Liberman under Misnegation

Joanna Klein, "Swatting at Mosquitoes May Help You Avoid Bites, Even if you Miss", NYT 1/25/2018:

If you keep swatting at a mosquito, will it leave you alone?

Some scientists think so. But it depends.

Some blood meals are worth a mosquito risking its life. But if there’s a more attractive or accepting alternative to feed from, a mosquito may move on to that someone or something instead.

An interesting story. But this is Language Log, not Insect Learning Log, so let's focus on the prominently-displayed picture caption, which reads:

A new study suggests that mosquitoes might learn not to avoid people who swat at them, by recognizing their smell.


Poetic dynamism

January 25, 2018 @ 4:41 am · Filed by Mark Liberman under Language and literature

Well, the dynamic range of the amplitude of syllables in poetry readings, anyhow:

What IS that?


Language vigilantism

January 24, 2018 @ 11:26 am · Filed by Victor Mair under Errors, Lexicon and lexicography, Morphology, Words words words, Writing systems

In "The Eagle-Eyed Vigilantes Defending the Chinese Language: As new lingo springs up and grammatical errors persist, one magazine is battling to maintain linguistic standards", Yin Yijun (Sixth Tone [1/19/18]) describes an unusual PRC journal:

Shanghai-based Yaowen Jiaozi — whose name literally translates as “biting phrases and chewing characters” — was established in 1995 and operates under the slogan: “Bite every mistake that deserves to be bitten, and chew every article worth chewing.” The monthly magazine’s mission is to attack every grammatical error it encounters — and the staff take the job seriously. Over the past 20 years, the magazine has amassed a long list of mistakes, from a nearly unnoticeable Chinese character error on a chopstick wrapper, to a series of mistakes author and Nobel laureate Mo Yan made in his award-winning works.


Indispensable condiment

January 22, 2018 @ 6:35 pm · Filed by Victor Mair under Found in translation, Language and food

Valerie Hansen gave me the following package:


Putin in Russian, Mandarin, and English

January 21, 2018 @ 8:12 pm · Filed by Victor Mair under Names, Transcription

I'm at Yale University attending a workshop on Tangut. So you ask, "What is 'Tangut'?" Relevant Wikipedia articles:

Tangut people, an ancient ethnic group in Northwest China, not Tibetan people.
Tangut language, the extinct language spoken by the Tangut people, not Tibetan language.
Tangut script, the writing system used to write the Tangut language
Western Xia (1038–1227), also known as the Tangut Empire, a state founded by the Tangut people

Enough of Tangut for now. I will write a separate post on Tangut language and script later on. Meanwhile, since the majority of specialists on Tangut are Russian, and several Russians are participating in this workshop, I've heard them refer to the president of their country with a pronunciation that is rather different from how we say it in English, but more nearly resembles the way his surname is spoken in Mandarin.

