Archive for Computational linguistics

Trends in presidential pitch

I’ve been downloading the audio for Donald Trump’s Weekly Addresses from whitehouse.gov, as I did for George W. Bush and Barack Obama. And as I did for the previous presidents, I listen to the results and sometimes do simple acoustic-phonetic analyses — see e.g. “Raising his voice”, 10/8/2011; “Political sound and silence”, 2/8/2016. Recently I thought I noticed a significant change in Mr. Trump’s pitch range, and a quick check confirmed this impression.

Read the rest of this entry »

Comments (3)

Annals of helpful surveillance

Early one evening last week, I was feeling sleepy, and said so. And a little later, I said “OK, I’m cashing in my threat to take a nap”, and went into my bedroom to do so.

As usual, I took my cell phone out of my pocket and plugged it in to charge, which made the screen light up. On it I saw this:

Read the rest of this entry »

Comments (15)

Machine translation bug of the week

Comments (24)

More Deep Translation arcana

At Riddled, sometime LLOG commenter Smut Clyde has posted an impressive series of Google Translate experiments. You can read them at the links below — I’ve added locally stored images, based on previous experience with bit rot as well as recent advice from James Angleton.

“Mayor Snorkum will lay the cake”: [Snorkum1]
“Reveal to me the unknown tongue”: [UnknownTongue1, UnknownTongue2, UnknownTongue3, UnknownTongue4, UnknownTongue5]
“Go home, Google Translate. You are drunk.”: [Lovecraft1, Lovecraft2, Lovecraft3]

Read the rest of this entry »

Comments (13)

Your gigantic crocodile!

One more piece of Google Translate poetry, contributed by Mackenzie Morris:


Read the rest of this entry »

Comments (24)

PR push for “Voice Stress Analysis” products?

A Craigslist ad posted 20 days ago — “Seeking a Blog Writer for Voice Stress Analysis Technology”:

We are looking for someone to ghostwrite blog posts and articles for a large company that specializes in computer-aided voice stress analysis technology or CVSA. We want you to primarily discuss the scientific research backing it up and the psychophysiological processes involved in implementing the technology. Basically, we want you to describe how it works, why it works, and why it is an effective technology, with everything backed up by scientific research and facts. […]

We are seeking a motivated, passionate, enthusiastic ghostwriter to craft blog articles ranging loosely from 750-900 words, that are valuable and informative to our target audience. Our audience for this client is law enforcement agencies, military, intelligence, immigration, and any other section of our government or private law practices that will be using investigative interviewing methods to screen subjects.

Read the rest of this entry »

Comments (6)

The sphere of the sphere is the sphere of the sphere

In a comment on “Electric Sheep”, Tim wrote:

Just want to share a little Google Translate poetry resulting from drumming my fingers on the keyboard while set to Thai:

There are six sparks in the sky, each with six spheres. The sphere of the sphere is the sphere of the sphere.

Read the rest of this entry »

Comments (13)

Electric sheep

A couple of recent LLOG posts (“What a tangled web they weave”, “A long short-term memory of Gertrude Stein”) have illustrated the strange and amusing results that Google’s current machine translation system can produce when fed variable numbers of repetitions of meaningless letter sequences in non-Latin orthographic systems. Geoff Pullum has urged me to explain how and why this sort of thing happens:

I think Language Log readers deserve a more careful account, preferably from your pen, of how this sort of craziness can arise from deep neural-net machine translation systems. […]

Ordinary people imagine (wrongly) that Google Translate is approximating the process we call translation. They think that the errors it makes are comparable to a human translator getting the wrong word (or the wrong sense) from a dictionary, or mistaking one syntactic construction for another, or missing an idiom, and thus making a well-intentioned but erroneous translation. The phenomena you have discussed reveal that something wildly, disastrously different is going on.  

Something nonlinear: 18 consecutive repetitions of a two-character Thai sequence produce “This is how it is supposed to be”, and so do 19, 20, 21, 22, 23, and 24, and then 25 repetitions produces something different, and 26 something different again, and so on. What will come out in response to a given input seems informally to be unpredictable (and I’ll bet it is recursively unsolvable, too; it’s highly reminiscent of Emil Post’s famous tag system where 0..X is replaced by X00 and 1..X is replaced by X1101, iteratively).

Type “La plume de ma tante est sur la table” into Google Translate and ask for an English translation, and you get something that might incline you, if asked whether you would agree to ride in a self-driving car programmed by the same people, to say yes. But look at the weird shit that comes from inputting Asian language repeated syllable sequences and you not only wouldn’t get in the car, you wouldn’t want to be in a parking lot where it was driving around on a test run. It’s the difference between what might look like a technology nearly ready for prime time and the chaotic behavior of an engineering abortion that should strike fear into the hearts of any rational human.  

Language Log needs at least a sketch of a proper serious account of what’s going on here.

A sketch is all that I have time for today, but here goes…
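For readers who want to play with the tag-system analogy in Geoff's comment, here is a minimal Python sketch of Post's 3-tag system as he describes it — delete the first three symbols, append “00” if the string began with 0 and “1101” if it began with 1. (The function names are mine, for illustration.)

```python
def post_tag_step(s):
    """One step of Post's 3-tag system.

    Delete the first 3 symbols; append "00" if the string began
    with '0', "1101" if it began with '1'. Return None on halt.
    """
    if len(s) < 3:
        return None  # fewer than 3 symbols: the system halts
    return s[3:] + ("00" if s[0] == "0" else "1101")


def run(s, max_steps=100):
    """Iterate the tag system from s, returning the trace of strings."""
    trace = [s]
    for _ in range(max_steps):
        nxt = post_tag_step(s)
        if nxt is None:
            break
        s = nxt
        trace.append(s)
    return trace
```

Whether such a system eventually halts for an arbitrary starting string is, in general, undecidable — which is the point of Geoff's comparison.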

Read the rest of this entry »

Comments (38)

A long short-term memory of Gertrude Stein

As just observed (“What a tangled web they weave”), successive repetitions of short sequences of Japanese, Korean, Thai (and perhaps other types of) characters cause Google’s Neural Machine Translation system to generate surprisingly varied and poetic English equivalents.

Thus if we repeat 1 through 25 times the two-character Thai sequence ไๅ

|ไ| 0x0E44 “THAI CHARACTER SARA AI MAIMALAI”
|ๅ| 0x0E45 “THAI CHARACTER LAKKHANGYAO”

the system, “a deep LSTM network with 8 encoder and 8 decoder layers using attention, residual connections, and trans-temporal chthonic affinity”, establishes a pretty solid spiritual connection with Gertrude Stein:
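Generating those 25 test inputs is a one-liner; a quick sketch (the variable names are mine):

```python
# Build the 25 test strings: n copies of the two-character Thai
# sequence U+0E44 (SARA AI MAIMALAI) + U+0E45 (LAKKHANGYAO),
# for n from 1 through 25.
seq = "\u0E44\u0E45"  # the two-character sequence given above
inputs = [seq * n for n in range(1, 26)]
```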

Read the rest of this entry »

Comments (14)

What a tangled web they weave

Comments (31)

Country list translation oddity

This is weird, and even slightly creepy — paste a list of countries like

Costa Rica, Argentina, Belgium, Bulgaria, Canada, Chile, Colombia, Dominican Republic, Ecuador, El Salvador, Ethiopia, France, Germany, England, Guatemala, Honduras, Italy, Israel, Mexico, New Zealand, Nicaragua, Peru, Puerto Rico, Scotland, Switzerland, Spain, Sweden, Uruguay, Venezuela, USA

into Google Translate English-to-Spanish, and a parallel-universe list emerges:

Read the rest of this entry »

Comments (22)

Advances in birdsong modeling

Eve Armstrong and Henry Abarbanel, “Model of the songbird nucleus HVC as a network of central pattern generators”, Journal of Neurophysiology, 2016:

We propose a functional architecture of the adult songbird nucleus HVC in which the core element is a “functional syllable unit” (FSU). In this model, HVC is organized into FSUs, each of which provides the basis for the production of one syllable in vocalization. Within each FSU, the inhibitory neuron population takes one of two operational states: (A) simultaneous firing wherein all inhibitory neurons fire simultaneously, and (B) competitive firing of the inhibitory neurons. Switching between these basic modes of activity is accomplished via changes in the synaptic strengths among the inhibitory neurons. The inhibitory neurons connect to excitatory projection neurons such that during state (A) the activity of projection neurons is suppressed, while during state (B) patterns of sequential firing of projection neurons can occur. The latter state is stabilized by feedback from the projection to the inhibitory neurons. Song composition for specific species is distinguished by the manner in which different FSUs are functionally connected to each other.

Ours is a computational model built with biophysically based neurons. We illustrate that many observations of HVC activity are explained by the dynamics of the proposed population of FSUs, and we identify aspects of the model that are currently testable experimentally. In addition, and standing apart from the core features of an FSU, we propose that the transition between modes may be governed by the biophysical mechanism of neuromodulation.
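To make the two operational states concrete, here is a toy caricature of one FSU — mine, not the authors’ model, which is built from biophysically based neurons rather than booleans. In state (A) simultaneous inhibition silences every projection neuron; in state (B) competitive inhibition permits a sequential chain of projection-neuron firings, one per time step:

```python
def fsu_output(state, n_projection=5):
    """Toy sketch of a functional syllable unit (FSU).

    Returns a list of time steps; each step is a 0/1 firing vector
    over the projection neurons.
    """
    if state == "A":
        # simultaneous inhibitory firing: all projection neurons suppressed
        return [[0] * n_projection]
    # state "B": competitive inhibition allows sequential firing,
    # one projection neuron per time step
    return [[1 if i == t else 0 for i in range(n_projection)]
            for t in range(n_projection)]
```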

Read the rest of this entry »

Comments off

“Bare-handed speech synthesis”

This is neat: “Pink Trombone”, by Neil Thapen.

By the same author — doodal:

Comments (4)