My posts will be erratic over the next couple of weeks

Erratic in time, anyhow — maybe no more erratic than usual in terms of content.

My first stop will be Florence for ICASSP 2014, where I'm involved in three papers whose details won't interest most of you. But here they are anyhow:

Andreas Stolcke, Neville Ryant, Vikramjit Mitra, Jiahong Yuan, Wen Wang, and Mark Liberman, "Highly Accurate Phonetic Segmentation Using Boundary Correction Models and System Fusion".

Jiahong Yuan, Neville Ryant, and Mark Liberman, "Automatic Phonetic Segmentation in Mandarin Chinese: Boundary Models, Glottal Features and Tone".

Neville Ryant, Jiahong Yuan, and Mark Liberman, "Mandarin Tone Classification Without Pitch Tracking".

The general idea behind this work is the automation of phonetic transcription and measurement.  The goal is to make it possible to use very large available collections of digital audio in phonetics research – you could call it the Robot Phonetician Project.

After ICASSP, I'm going to London, for a panel at the British Academy on "Language, Linguistics, and the Data Explosion".  I won't be talking about the Robot Phonetician, at least not in any detail, but there's obviously a connection. And here's a draft blog post about my contribution to the panel. This was solicited by the editor of a section of the Guardian online ("The case for language learning"), which is apparently part of the BA's partnership with the Guardian. Neither the panel nor my post has much to say about language learning, but I've offered to add some relevant observations. I think that some version of this will appear later in the week.

Then comes a General Linguistics Seminar at Oxford, where I'll try to make some linguistic sense out of the "Tone without Pitch" results. (Some additional experiments in that line will be presented at Speech Prosody 2014: Neville Ryant, Malcolm Slaney, Mark Liberman, Elizabeth Shriberg, and Jiahong Yuan, "Highly Accurate Mandarin Tone Classification In The Absence of Pitch Information".) Here's the abstract for my talk:

A "deep neural network" classifier, applied to a diverse corpus of news broadcasts, achieved the best performance ever recorded on the task of recognizing Mandarin tones. Oddly, the classifier accomplished this using generic spectral inputs that do not encode fundamental frequency (F0) in any obvious way. The same classifier had much worse performance with amplitude and F0 estimates as input; and adding F0 to the generic spectral inputs degraded performance slightly.

After various less interesting ideas have been considered and rejected, we offer three increasingly general (and speculative) explanations:

(1) The psychological dimension of pitch involves more than F0;
(2) The phonetic dimension of tone involves more than pitch;
(3) The reflection of phonological categories in articulation and sound is more complex than linguists generally assume.

Implications for phonological, phonetic, and sociolinguistic research will be discussed.

[Joint work with Neville Ryant, Jiahong Yuan, Malcolm Slaney, and Elizabeth Shriberg]

6 Comments

  1. chris said,

    May 4, 2014 @ 1:49 pm

    Has anyone ever autotuned Mandarin (or any other form of Chinese for that matter), and if so, does it make it noticeably harder to understand (to people who can understand it normally)?

  2. Chris said,

    May 4, 2014 @ 3:09 pm

    Any plans to attend ACL in June (especially since it's in Bawlmore this year, just a short drive from Philadelphia)?

  3. Adam Funk said,

    May 4, 2014 @ 3:22 pm

    Hey Mark, I like your erratic content. I also think it's pretty cool that someone can do a lot of NLP (unfortunately, I didn't get a chance to meet you at LREC a few years ago) but also phonetic stuff (I find the latter interesting on the side but am not good at it).

  4. blahedo said,

    May 4, 2014 @ 9:36 pm

    The tone talk sounds really nifty, and like chris, I have wondered how language tone interacts with singing—not just autotuned singing, but singing in general—and it always seemed like there had to be more to it than just pitch. The usual answer I get, that understanding any sung Mandarin just involves a lot of figuring out tones from context, like it's some sort of grand word game, always seemed very unsatisfying.

  5. Gene Callahan said,

    May 5, 2014 @ 10:22 am

    At my blog, I'm planning my posts to be erotic over the next few weeks.

  6. richard said,

    May 6, 2014 @ 12:29 pm

    If you're interested in the interaction of melody and linguistic tone, you might want to ask an ethnomusicologist rather than a linguist.
