Archive for Computational linguistics

Country list translation oddity

This is weird, and even slightly creepy — paste a list of countries like

Costa Rica, Argentina, Belgium, Bulgaria, Canada, Chile, Colombia, Dominican Republic, Ecuador, El Salvador, Ethiopia, France, Germany, England, Guatemala, Honduras, Italy, Israel, Mexico, New Zealand, Nicaragua, Peru, Puerto Rico, Scotland, Switzerland, Spain, Sweden, Uruguay, Venezuela, USA

into Google Translate English-to-Spanish, and a parallel-universe list emerges:

Read the rest of this entry »

Comments (22)

Advances in birdsong modeling

Eve Armstrong and Henry Abarbanel, "Model of the songbird nucleus HVC as a network of central pattern generators", Journal of Neurophysiology, 2016:

We propose a functional architecture of the adult songbird nucleus HVC in which the core element is a "functional syllable unit" (FSU). In this model, HVC is organized into FSUs, each of which provides the basis for the production of one syllable in vocalization. Within each FSU, the inhibitory neuron population takes one of two operational states: (A) simultaneous firing wherein all inhibitory neurons fire simultaneously, and (B) competitive firing of the inhibitory neurons. Switching between these basic modes of activity is accomplished via changes in the synaptic strengths among the inhibitory neurons. The inhibitory neurons connect to excitatory projection neurons such that during state (A) the activity of projection neurons is suppressed, while during state (B) patterns of sequential firing of projection neurons can occur. The latter state is stabilized by feedback from the projection to the inhibitory neurons. Song composition for specific species is distinguished by the manner in which different FSUs are functionally connected to each other.

Ours is a computational model built with biophysically based neurons. We illustrate that many observations of HVC activity are explained by the dynamics of the proposed population of FSUs, and we identify aspects of the model that are currently testable experimentally. In addition, and standing apart from the core features of an FSU, we propose that the transition between modes may be governed by the biophysical mechanism of neuromodulation.
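For intuition (and emphatically not as a substitute for the authors' biophysically based model), the two inhibitory modes can be caricatured with a handful of rate units with slow adaptation, where the only thing that changes between modes is the inhibitory coupling matrix; in a rate model, co-activation stands in for simultaneous firing:

    # A toy rate-model cartoon of the two inhibitory modes in the abstract,
    # not the authors' biophysical model. Units obey
    #   dr/dt = (-r + relu(drive - W @ r - a)) / tau
    # with slow adaptation a; only the coupling matrix W differs between modes.
    import numpy as np

    def simulate(W, steps=10000, dt=0.5, tau=10.0, tau_a=100.0, beta=4.0, drive=1.5):
        n = W.shape[0]
        rng = np.random.default_rng(0)
        r = rng.random(n) * 0.1          # firing rates
        a = np.zeros(n)                  # adaptation variables
        trace = np.empty((steps, n))
        for t in range(steps):
            r += dt / tau * (-r + np.maximum(drive - W @ r - a, 0.0))
            a += dt / tau_a * (-a + beta * r)
            trace[t] = r
        return trace

    n = 5
    W_sync = np.full((n, n), 0.1)        # weak coupling: all units co-active ("state A")
    W_comp = 3.0 * (1 - np.eye(n))       # strong mutual inhibition: turn-taking ("state B")

    print("state A, final rates (nearly equal):", simulate(W_sync)[-1].round(2))
    print("state B, dominant unit over time:   ", simulate(W_comp)[::1000].argmax(axis=1))

In the competitive regime, adaptation keeps any single winner from holding the floor, so the identity of the dominant unit keeps changing, which is the kind of sequential firing the model uses to drive projection neurons.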

Read the rest of this entry »

Comments off

"Bare-handed speech synthesis"

This is neat: "Pink Trombone", by Neil Thapen.

By the same author — doodal:

Comments (4)

Court fight over Oxford commas and asyndetic lists

Language Log often weighs in when courts try to nail down the meaning of a statute. Laws are written in natural language—though one might long, by formalization, to end the thousand natural ambiguities that text is heir to—and thus judges are forced to play linguist.

Happily, this week's "case in the news" is one where the lawyers managed to identify several relevant considerations and bring them to the judges for weighing.

Most news outlets reported the case as being about the Oxford comma (or serial comma)—the optional comma before the conjunction at the end of a list. Here, for example, is the New York Times:

Read the rest of this entry »

Comments (20)

What's hot at ICASSP

This week I'm at IEEE ICASSP 2017 in New Orleans — that's the "Institute of Electrical and Electronics Engineers International Conference on Acoustics, Speech and Signal Processing", pronounced /aɪ ˈtrɪ.pl i ˈaɪ.kæsp/. I've had joint papers at all the ICASSP conferences since 2010, though I'm not sure that I've attended all of them.

This year the conference distributed its proceedings on a nifty little guitar-shaped USB key, which I promptly copied to my laptop for easier access. I seem to have deleted my local copies of most of the previous proceedings, but ICASSP 2014 escaped the reaper, so I decided to while away the time during one of the many parallel sessions here by running all the .pdfs (1703 in 2014, 1316 this year) through pdftotext, removing the REFERENCE sections, tokenizing the result, removing (some of the) unwordlike strings, and creating overall lexical histograms for comparison. The result is about 5 million words for 2014 and about 3.9 million words this year.
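Here, for concreteness, is a sketch of that pipeline (the directory names are invented, and the tokenization and "unwordlike string" filtering below are cruder than what one would really want):

    # A sketch of the proceedings-to-histogram pipeline described above; paths
    # and the exact tokenization/filtering rules are placeholders, not the real ones.
    import collections, pathlib, re, subprocess

    def pdf_to_tokens(pdf_path):
        """Extract text with pdftotext, drop the REFERENCES section, tokenize."""
        txt = subprocess.run(["pdftotext", str(pdf_path), "-"],
                             capture_output=True, text=True).stdout
        txt = re.split(r"\n\s*REFERENCES?\s*\n", txt, flags=re.IGNORECASE)[0]
        tokens = re.findall(r"[a-z]+(?:[-'][a-z]+)*", txt.lower())
        return [t for t in tokens if len(t) > 1]    # crude unwordlike-string filter

    def histogram(pdf_dir):
        counts = collections.Counter()
        for pdf in sorted(pathlib.Path(pdf_dir).glob("*.pdf")):
            counts.update(pdf_to_tokens(pdf))
        return counts

    counts_2014 = histogram("icassp2014_pdfs/")   # placeholder directory names
    counts_2017 = histogram("icassp2017_pdfs/")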

And to compare the lists, I used the usual "weighted log-odds-ratio, informative Dirichlet prior" method, as described for example in "The most Trumpish (and Bushish) words", 9/5/2015.
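That method comes from Monroe, Colaresi & Quinn's 2008 "Fightin' Words" paper. In code, using the Counter objects from the sketch above, it might look like this (taking the scaled pooled counts as the informative prior is one common choice, not necessarily the exact one used for the plots):

    # A sketch of the "weighted log-odds-ratio, informative Dirichlet prior"
    # score (Monroe, Colaresi & Quinn 2008). The prior is the pooled counts
    # from both corpora; positive z means "more characteristic of corpus 1".
    import math

    def weighted_log_odds(counts1, counts2, prior_scale=1.0):
        n1, n2 = sum(counts1.values()), sum(counts2.values())
        vocab = set(counts1) | set(counts2)
        prior = {w: prior_scale * (counts1.get(w, 0) + counts2.get(w, 0)) for w in vocab}
        a0 = sum(prior.values())
        z = {}
        for w in vocab:
            y1, y2, aw = counts1.get(w, 0), counts2.get(w, 0), prior[w]
            delta = (math.log((y1 + aw) / (n1 + a0 - y1 - aw))
                     - math.log((y2 + aw) / (n2 + a0 - y2 - aw)))
            var = 1.0 / (y1 + aw) + 1.0 / (y2 + aw)
            z[w] = delta / math.sqrt(var)
        return z

    scores = weighted_log_odds(counts_2017, counts_2014)
    for w in sorted(scores, key=scores.get, reverse=True)[:20]:
        print(f"{w:20s} {scores[w]:7.2f}")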

Read the rest of this entry »

Comments (2)

The shape of a LibriVox phrase

Here's what you get if you align 11 million words of English-language audiobooks with the associated texts, divide it all into phrases by breaking at silent pauses greater than 150 milliseconds, and average the word durations by position in phrases of lengths from one word to fifteen words:

The audiobook sample in this case comes from LibriSpeech (see Vassil Panayotov et al., "Librispeech: An ASR corpus based on public domain audio books", IEEE ICASSP 2015). Neville Ryant and I have been collecting and analyzing a variety of large-scale speech datasets (see e.g. "Large-scale analysis of Spanish /s/-lenition using audiobooks", ICA 2016; "Automatic Analysis of Phonetic Speech Style Dimensions", Interspeech 2016), and as part of that process, we've refactored and realigned the LibriSpeech sample, resulting in 5,832 English-language audiobook chapters from 2,484 readers, comprising 11,152,378 words of text and about 1,571 hours of audio. (This is a small percentage of the English-language data available from LibriVox, which is somewhere north of 50,000 hours of English audiobook at present.)
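For concreteness, the phrase-splitting and averaging steps might look like this, assuming the aligner's output for each chapter is a list of (word, start, end) tuples in seconds (the data layout here is invented for illustration, not our actual scripts):

    # A sketch of the phrase-splitting and averaging described above, assuming
    # each chapter is a list of (word, start_sec, end_sec) tuples from forced
    # alignment.
    from collections import defaultdict

    PAUSE = 0.150   # break phrases at silent pauses longer than 150 ms

    def phrases(words):
        """Split an aligned word sequence at pauses longer than PAUSE."""
        out, cur = [], []
        for i, (w, start, end) in enumerate(words):
            cur.append((w, start, end))
            gap_follows = (i + 1 == len(words)
                           or words[i + 1][1] - end > PAUSE)
            if gap_follows:
                out.append(cur)
                cur = []
        return out

    def mean_duration_by_position(chapters, max_len=15):
        """Average word duration at each position, per phrase length."""
        total = defaultdict(float)
        count = defaultdict(int)
        for chapter in chapters:
            for ph in phrases(chapter):
                n = len(ph)
                if n > max_len:
                    continue
                for pos, (w, start, end) in enumerate(ph):
                    total[(n, pos)] += end - start
                    count[(n, pos)] += 1
        return {key: total[key] / count[key] for key in total}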

Read the rest of this entry »

Comments (8)

Quantifying Donald Trump's rhetoric

David Beaver & Jason Stanley, "Unlike all previous U.S. presidents, Trump almost never mentions democratic ideals", Washington Post 2/7/2017:

The central norms of liberal democratic societies are liberty, justice, truth, public goods and tolerance. To our knowledge, no one has proposed a metric by which to judge a politician’s commitment to these democratic ideals.

A direct way suggested itself to us: Why not simply add up the number of times those words and their synonyms are deployed? If the database is large enough, this should provide a rough measure of a politician’s commitment to these ideals. How does Trump’s use of these words compare to that of his presidential predecessors?

At Language Log, the linguist Mark Liberman graphed how unusual Trump’s inaugural speech was, graphing the frequency of critical words used in each of the past 50 years’ inaugural speeches — and showing how much more nationalist language, and how much less democratic language Trump used than did his predecessors.

We expanded this project, looking at the language in Trump’s inaugural address as well as in 61 campaign speeches since 2015. We compared that to the language used in all 57 prior inaugural speeches, from George Washington’s on. The comparison gives us a picture of Trump’s rhetorical emphases since his campaign began, and hence of his most deeply held political ideals.
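The tally itself is easy to sketch; the synonym sets below are illustrative stand-ins, not Beaver & Stanley's actual lists:

    # A minimal sketch of the word tally described above, normalized per
    # million tokens; the synonym sets are illustrative stand-ins, not the
    # authors' lists. Note that naive matching overcounts ambiguous forms
    # like "just" and "free".
    import re

    IDEALS = {
        "liberty":   {"liberty", "freedom", "free"},
        "justice":   {"justice", "just", "fairness", "fair"},
        "truth":     {"truth", "true", "honest", "honesty"},
        "tolerance": {"tolerance", "tolerant"},
    }

    def rate_per_million(text, terms):
        tokens = re.findall(r"[a-z']+", text.lower())
        hits = sum(1 for tok in tokens if tok in terms)
        return 1e6 * hits / max(len(tokens), 1)

    def ideal_rates(text):
        return {name: rate_per_million(text, terms) for name, terms in IDEALS.items()}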

Comments (9)

"Finding a voice"

An excellent article by Lane Greene: "Language: Finding a voice", The Economist 1/5/2017.


Comments (10)

Twitter-based word mapper is your new favorite toy

At the beginning of 2016, Jack Grieve shared the first iteration of the Word Mapper app he had developed with Andrea Nini and Diansheng Guo, which let users map the relative frequencies of the 10,000 most common words in a big Twitter-based corpus covering the contiguous United States. (See: "Geolexicography," "Totally Word Mapper.") Now, as the year comes to a close, Quartz is hosting a bigger, better version of the app, covering 97,246 words (all occurring at least 500 times in the corpus). It's appropriately dubbed "The great American word mapper," and it's hella fun (or wicked fun, if you prefer).

Read the rest of this entry »

Comments (21)

"The people that stayed back the facts"

This is a reality check on the current state of automatic speech recognition (ASR) algorithms. I took the 186-word passage by Scottie Nell Hughes discussed in yesterday's post, and submitted it to two different Big-Company ASR interfaces, with amusing results. I'll be interested to see whether other systems can do better.

Read the rest of this entry »

Comments (6)

Human parity in conversational speech recognition

Today at ISCSLP2016, Xuedong Huang announced a striking result from Microsoft Research. A paper documenting it is up on arXiv.org — W. Xiong, J. Droppo, X. Huang, F. Seide, M. Seltzer, A. Stolcke, D. Yu, G. Zweig, "Achieving Human Parity in Conversational Speech Recognition":

Conversational speech recognition has served as a flagship speech recognition task since the release of the DARPA Switchboard corpus in the 1990s. In this paper, we measure the human error rate on the widely used NIST 2000 test set, and find that our latest automated system has reached human parity. The error rate of professional transcriptionists is 5.9% for the Switchboard portion of the data, in which newly acquainted pairs of people discuss an assigned topic, and 11.3% for the CallHome portion where friends and family members have open-ended conversations. In both cases, our automated system establishes a new state-of-the-art, and edges past the human benchmark. This marks the first time that human parity has been reported for conversational speech. The key to our system's performance is the systematic use of convolutional and LSTM neural networks, combined with a novel spatial smoothing method and lattice-free MMI acoustic training.
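For reference, the "error rate" here is word error rate: substitutions plus insertions plus deletions under a minimum-edit-distance alignment, as a fraction of the reference length. A bare-bones sketch of the core computation (the actual evaluations use NIST's more elaborate scoring tools):

    # Word error rate: the minimum number of word substitutions, insertions,
    # and deletions turning the hypothesis into the reference, divided by the
    # reference length.
    def wer(ref, hyp):
        r, h = ref.split(), hyp.split()
        # d[i][j] = edit distance between the first i ref words and first j hyp words
        d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
        for i in range(len(r) + 1):
            d[i][0] = i
        for j in range(len(h) + 1):
            d[0][j] = j
        for i in range(1, len(r) + 1):
            for j in range(1, len(h) + 1):
                sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
                d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
        return d[len(r)][len(h)] / max(len(r), 1)

    print(wer("the cat sat on the mat", "the cat sat on mat"))   # 1/6 ≈ 0.167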

Read the rest of this entry »

Comments (12)

The possessive Jesus of composition

Let me explain, very informally, what a predictive text imitator is. It is a computer program that takes as input a passage of training text and produces as output a new text that is composed quasi-randomly except that it matches the training text with regard to the frequencies of word or character sequences up to some fixed finite length k.

(There has to be such a length limit, of course: the only text in which the word sequence of Melville's Moby-Dick is matched perfectly is Melville's Moby-Dick, but what a predictive text imitator trained on Moby-Dick would do is to produce quasi-random fake-Moby-Dickish gibberish in which each sequence of not more than k units matches Moby-Dick with respect to the transition probabilities between adjacent units.)
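For the programmers in the audience, here is a minimal word-level sketch of such an imitator, implementing the quasi-random generation just described (actual tools, Brew's included, differ in their details):

    # A minimal word-level predictive text imitator: an order-(k-1) Markov
    # model whose transitions match the training text's frequencies for word
    # sequences of length up to k. A sketch of the idea, not Brew's program.
    import random
    from collections import defaultdict

    def train(text, k=3):
        """Map each (k-1)-word context to the words that follow it (with
        repeats, so uniform sampling reproduces the transition frequencies)."""
        words = text.split()
        model = defaultdict(list)
        for i in range(len(words) - k + 1):
            model[tuple(words[i:i + k - 1])].append(words[i + k - 1])
        return model

    def generate(model, length=60):
        k1 = len(next(iter(model)))          # context length, k - 1
        out = list(random.choice(list(model)))
        while len(out) < length:
            followers = model.get(tuple(out[-k1:]))
            if not followers:                # dead end: jump to a fresh context
                out.extend(random.choice(list(model)))
                continue
            out.append(random.choice(followers))
        return " ".join(out)

    model = train(open("elements_of_style.txt").read())   # placeholder filename
    print(generate(model))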

I tell you this because a couple of months ago Jamie Brew made a predictive text imitator and trained it on my least favorite book in the world, William Strunk's The Elements of Style (1918). He then set it to work writing the first ten sections of a new quasi-randomly generated book. You can see the results here. The first point at which I broke down and laughed till there were tears in my eyes was at the section heading 'The Possessive Jesus of Composition and Publication'. But there were other such points too. Take a look at it. And trust me: following the advice in Jamie Brew's version of the book won't do your writing much more harm than following the original.

Read the rest of this entry »

Comments off

Clueless Microsoft language processing

A rather poetic and imaginative abstract I received in my email this morning (it's about a talk on computational aids for composers) contains the following sentence:

We will metaphorically drop in on Wolfgang composing at home in the morning, at an orchestra rehearsal in the afternoon, and find him unwinding in the evening playing a spot of the new game Piano Hero which is (in my fictional narrative) all the rage in the Viennese coffee shops.

There's nothing wrong with the sentence. What makes me bring it to your notice is the extraordinary modification that my Microsoft mail system performed on it. I wonder if you can see the part of the message that it felt it should mess with, in a vain and unwanted effort at helping me do my job more efficiently?

Read the rest of this entry »

Comments off