Archive for Computational linguistics

Adversarial attacks on modern speech-to-text

Generating adversarial STT examples.

In a recent post on this blog, Mark Liberman raised the lively topic of so-called "adversarial" attacks on modern machine learning systems. These attacks can do amusing and somewhat frightening things, such as forcing an object recognition algorithm to identify all images as toasters with remarkably high confidence. Seeing these attacks applied to image recognition, he hypothesized that they could also be mounted against modern speech recognition (STT, or speech-to-text) systems based on e.g. deep learning. His hypothesis has indeed recently been confirmed.
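
For readers curious about the mechanics: the standard recipe is to search for a tiny perturbation of the input that pushes the model's output toward a target the attacker has chosen. Here is a minimal sketch of the one-step "fast gradient sign" version of that idea, assuming a differentiable PyTorch classifier (the names model, x, and y_target are illustrative); the published speech-to-text attacks use more elaborate iterative optimization over raw audio, but the principle is the same:

    import torch
    import torch.nn.functional as F

    def fgsm_targeted(model, x, y_target, epsilon=0.01):
        """One step of the fast gradient sign method (FGSM), targeted
        variant: nudge the input x so that the model's loss on the
        attacker's chosen label y_target goes down."""
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y_target)
        loss.backward()
        # Step *against* the gradient of the target loss, clamped to a
        # small epsilon so the perturbation stays nearly imperceptible.
        return (x - epsilon * x.grad.sign()).detach()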

Read the rest of this entry »

Comments (7)

Ross Macdonald: lexical diversity over the lifespan

This post is an initial progress report on some joint work with Mark Liberman. It's part of a larger effort to replicate and extend Xuan Le, Ian Lancashire, Graeme Hirst, & Regina Jokel, "Longitudinal detection of dementia through lexical and syntactic changes in writing: a case study of three British novelists", Literary and Linguistic Computing 2011. Their abstract:

We present a large-scale longitudinal study of lexical and syntactic changes in language in Alzheimer's disease using complete, fully parsed texts and a large number of measures, using as our subjects the British novelists Iris Murdoch (who died with Alzheimer's), Agatha Christie (who was suspected of it), and P.D. James (who has aged healthily). […] Our results support the hypothesis that signs of dementia can be found in diachronic analyses of patients’ writings, and in addition lead to new understanding of the work of the individual authors whom we studied. In particular, we show that it is probable that Agatha Christie indeed suffered from the onset of Alzheimer's while writing her last novels, and that Iris Murdoch exhibited a ‘trough’ of relatively impoverished vocabulary and syntax in her writing in her late 40s and 50s that presaged her later dementia.
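
One concrete piece of such a replication is computing lexical-diversity measures over each author's texts. Le et al. use a battery of measures; as an illustrative sketch, here is one of the simplest, Covington & McFall's moving-average type-token ratio (MATTR), which corrects the raw type-token ratio's well-known sensitivity to text length:

    import re

    def mattr(text, window=500):
        """Moving-average type-token ratio: the mean of type/token
        ratios over every span of `window` consecutive tokens."""
        tokens = re.findall(r"[a-z']+", text.lower())
        if not tokens:
            return 0.0
        window = min(window, len(tokens))
        ratios = [len(set(tokens[i:i + window])) / window
                  for i in range(len(tokens) - window + 1)]
        return sum(ratios) / len(ratios)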

Read the rest of this entry »

Comments (11)

News program presenter meets robot avatar

Yesterday BBC's Radio 4 program "Today", the cultural counterpart of NPR's "Morning Edition", invited into the studio a robot from the University of Sheffield, the Mishalbot, which had been trained to conduct interviews by exposure to the on-air speech of co-presenter Mishal Husain. They let it talk for three minutes with the real Mishal. (Video clip here, at least for UK readers; it may not be available in the US.) Once again I was appalled at the credulity of journalists when confronted with AI. Despite all the evidence that the robot was just parroting Mishalesque phrases, Ms Husain continued with the absurd charade, pretending politely that her robotic alter ego was really conversing. Afterward there was half-serious on-air discussion of the possibility that some day the jobs of the Today program presenters and interviewers might be taken over by robots.

The main thing differentiating the Sheffield robot from Joseph Weizenbaum's ELIZA program of 1966 (apart from a babyish plastic face and movable fingers and eyes, which didn't work well on radio) was that the Mishalbot is voice-driven (with ELIZA you had to type on a terminal). So the main technological development has been in speech recognition engineering. On interaction, the Mishalbot seemed to me to be at sub-ELIZA level. "What do you mean? Can you give an example?" it said repeatedly, at various inappropriate points.
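
For comparison, the whole of ELIZA's conversational machinery can be caricatured in a dozen lines: match a surface pattern, echo a fragment back, and fall back on a stock question when nothing matches. A toy sketch (not Weizenbaum's actual script), whose fallback lines the Mishalbot would apparently feel right at home with:

    import random
    import re

    # Toy ELIZA-style responder: surface pattern matching plus canned
    # fallback questions, with no understanding anywhere in the loop.
    RULES = [
        (re.compile(r"\bI am (.+)", re.I),
         ["Why do you say you are {0}?", "How long have you been {0}?"]),
        (re.compile(r"\bbecause (.+)", re.I),
         ["Is that the real reason?"]),
    ]
    FALLBACKS = ["What do you mean?", "Can you give an example?"]

    def respond(utterance):
        for pattern, templates in RULES:
            m = pattern.search(utterance)
            if m:
                return random.choice(templates).format(m.group(1))
        return random.choice(FALLBACKS)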

Read the rest of this entry »

Comments off

A virus that fixes your grammar

In today's Dilbert strip, Dilbert is confused about why the company mission statement looks so different, and Alice diagnoses what has happened: the Elbonian virus that has been corrupting the company's computer systems has fixed all the grammar and punctuation errors that the statement formerly contained.

That'll be the day. Right now, computational linguists with an unlimited budget (and unlimited help from Elbonian programmers) would be unable to develop a trustworthy program that could proactively fix grammar and punctuation errors in written English prose. We simply don't know enough. The "grammar checking" programs built into word processors like Microsoft Word are dire, even risible, catching only a limited list of shibboleths and being wrong about many of them. Flagging split infinitives, passives, and random colloquialisms as if they were all errors is not much help to you, especially when so many of the flags are false alarms. Following all of Word's suggestions for changes would create gibberish. Free-standing tools like Grammarly are similarly hopeless. They merely read and note possible "errors", leaving you to make the corrections. They couldn't possibly be modified into programs that would proactively correct your prose. Take the editing error in this passage, which Rodney Huddleston recently noticed in a quality newspaper, The Australian:

There has been no glimmer of light from the Palestinian Authority since the Oslo Accords were signed, just the usual intransigence that even the wider Arab world may be tiring of. Yet the West, the EU, nor the UN, have never made the PA pay a price for its intransigence.
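
To see why such tools can't help here, consider what shibboleth-matching amounts to. A deliberately crude sketch (not any vendor's actual code): it cheerfully flags an innocuous passive or split infinitive, while the genuine error in the passage above, the stranded "nor" with no preceding "neither", sails through undetected.

    import re

    # A caricature of word-processor "grammar checking": a short list
    # of surface patterns, applied with no syntactic analysis at all.
    SHIBBOLETHS = {
        "split infinitive": re.compile(r"\bto\s+\w+ly\s+\w+", re.I),
        "possible passive": re.compile(r"\b(was|were|been|being)\s+\w+ed\b", re.I),
    }

    def flag(sentence):
        return [name for name, pat in SHIBBOLETHS.items() if pat.search(sentence)]

    print(flag("Our plan is to boldly go where the report was filed."))
    # -> ['split infinitive', 'possible passive']
    print(flag("Yet the West, the EU, nor the UN, have never made the PA pay a price."))
    # -> []  (the real error goes unnoticed)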

Read the rest of this entry »

Comments off

Woo

Read the rest of this entry »

Comments (10)

Linguistic Science and Technology in China

I just spent a few days in China, mainly to attend an "International Workshop on Language Resource Construction: Theory, Methodology and Applications". This was the second event in a three-year program funded by a small grant from the "Penn China Research & Engagement Fund". That program's goals include "To develop new, or strengthen existing, institutional and faculty-to-faculty relationships with Chinese partners", and our proposal focused on "linguistic diversity in China, with specific emphasis on the documentation of variation in standard, regional and minority languages".

After last year's workshop at the Penn Wharton China Center, some Chinese colleagues (Zhifang Sui and Weidong Zhan from the Key Laboratory of Computational Linguistics and the Center for Chinese Linguistics at Peking University) suggested that we join them in co-sponsoring a two-day workshop this fall, with the first day at PKU and the second day at the PWCC. Here's the group photo from the first day (11/5/2017):

The growing strength of Chinese research in the various areas of linguistic science and technology has been clear for some time, and the presentations and discussions at this workshop made it clear that this work is poised for a further major increase in quantity and quality.

Read the rest of this entry »

Comments (11)

You need to know something

I'm happy to see that Google Translate is still turning (many types of) meaningless character sequences into spoken-word poetry. Repetitions of single hiragana characters are an especially reliable source — here's "You need to know something":


Read the rest of this entry »

Comments (15)

Cartoonist walks into a language lab…

Bob Mankoff gave a talk here in Madison not long ago.  You may recognize Mankoff as the longtime cartoon editor of the New Yorker magazine; he is now at Esquire. Mankoff’s job involved scanning about a thousand cartoons a week to find 15 or so to publish per issue. He did this for over 20 years, which is a lot of cartoons. More than 950 of his own cartoons appeared in the magazine as well. Mankoff has thought a lot about humor in general and cartoon humor in particular, and likes to talk and write about it too.

The Ted Talk
On “60 Minutes”
His Google talk
Documentary, "Very Semi-Serious"

What’s the Language Log connection?  Humor often involves language? New Yorker cartoons are usually captioned these days, with fewer in the lovely mute style of a William Steig.  A general theory of language use should be able to explain how cartoon captions, a genre of text, are understood. The cartoons illustrate (sic) the dependence of language comprehension on context (the one created by the drawing) and background knowledge (about, for example, rats running mazes, guys marooned on islands, St. Peter’s gate, corporate culture, New Yorkers). The popular Caption Contest is an image-labeling task, generating humorous labels for an incongruous scene.

But it’s Mankoff's excursions into research that are particularly interesting and Language Loggy.  Mankoff is the leading figure in Cartoon Science (CartSci), the application of modern research methods to questions about the generation, selection, and evaluation of New Yorker cartoons.

Read the rest of this entry »

Comments (11)

DolphinAttack

Guoming Zhang et al., "DolphinAttack: Inaudible Voice Commands", arXiv 8/31/2017:

In this work, we design a completely inaudible attack, DolphinAttack, that modulates voice commands on ultrasonic carriers (e.g., f > 20 kHz) to achieve inaudibility. By leveraging the nonlinearity of the microphone circuits, the modulated low-frequency audio commands can be successfully demodulated, recovered, and more importantly interpreted by the speech recognition systems. We validate DolphinAttack on popular speech recognition systems, including Siri, Google Now, Samsung S Voice, Huawei HiVoice, Cortana and Alexa.
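
The transmit side of the attack is ordinary amplitude modulation; the subtlety is that the receiving microphone's nonlinearity acts as an unintended demodulator, recreating the baseband command inside the device. A sketch of the modulation step, assuming a command waveform already resampled to a high rate (the sample rate, carrier frequency, and modulation depth below are illustrative values, not the paper's):

    import numpy as np

    def modulate_ultrasonic(command, fs=192_000, fc=25_000, depth=1.0):
        """Amplitude-modulate a baseband voice command onto an
        ultrasonic carrier (fc > 20 kHz), so that the radiated signal
        is inaudible. A square-law nonlinearity in a microphone's
        front end then recovers the baseband command."""
        t = np.arange(len(command)) / fs
        carrier = np.cos(2 * np.pi * fc * t)
        # Standard AM: carrier plus command-modulated sidebands.
        return (1 + depth * command) * carrier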

Read the rest of this entry »

Comments (11)

The power and the lactulose

The so-called Free Speech Rally that's about to start in Boston will probably be better attended, by both supporters and opponents, than the one that was organized by the same group back in May. But some of the featured speakers at the May rally, including "Augustus Invictus", have decided not to attend today's rerun. So I listened to the YouTube copy of the May rally speech by Austin Gillespie (Augustus's real, or at least original, name). And since this is Language Log and not Political Rhetoric Log (though surely political rhetoric is part of language), I'm going to focus on YouTube's efforts to provide "automatic captions".

Read the rest of this entry »

Comments (3)

English Verb-Particle Constructions

Lately I've been thinking about "optionality" as it relates to syntactic alternations. (In)famous cases include complementizer deletion ("I know that he is here" vs. "I know he is here") or embedded V2 in Scandinavian. For now let's consider the English verb-particle construction. The relative order of the particle and the object is "optional" in cases such as the following:

1a) "John picked up the book"
1b) "John picked the book up"

Either order is usually acceptable (with the exception of pronoun objects — although those too become acceptable under a focus reading…)

1c) "John put it back"
1d) *"John put back it"

Read the rest of this entry »

Comments (20)

Gender, conversation, and significance

As I mentioned last month ("My summer", 6/22/2017), I'm spending six weeks in Pittsburgh at the 2017 Jelinek Summer Workshop on Speech and Language Technology (JSALT), as part of a group whose theme is "Enhancement and Analysis of Conversational Speech".

One of the things that I've been exploring is simple models of who talks when — a sort of Biggish Data reprise of Sacks, Schegloff & Jefferson, "A simplest systematics for the organization of turn-taking for conversation", Language 1974. A simple place to start is the distribution of speech segment durations. My first explorations of this issue turned up a case that's relevant to yesterday's discussion of "significance".
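
The starting point is nothing fancier than an empirical histogram of how long speakers hold the floor. A minimal sketch, assuming the segments arrive as (start, end) pairs in seconds from some diarization or alignment pass (the log-spaced binning is an illustrative choice, since segment durations are heavily right-skewed):

    import numpy as np

    def duration_distribution(segments, n_bins=50):
        """Empirical distribution of speech-segment durations, on
        logarithmically spaced bins from 0.1 s to 100 s."""
        durations = np.array([end - start for start, end in segments])
        bins = np.logspace(-1, 2, n_bins + 1)
        counts, edges = np.histogram(durations, bins=bins)
        return counts / counts.sum(), edges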

Read the rest of this entry »

Comments (10)

Helpful Google

The marvels of modern natural language processing:

Michael Glazer, who sent in the example, wonders whether Google Translate has overdosed on old Boris and Natasha segments from Rocky and Bullwinkle:


Read the rest of this entry »

Comments (12)