LRNLP 2018

On Monday, I'm pursuing the quixotic enterprise of talking to an NLP workshop about phonetics.

LRNLP ("Language Resources for NLP") 2018 is a workshop associated with COLING 2018 in Santa Fe, NM. My abstract:

Semi-automatic analysis of digital speech collections is transforming the science of phonetics, and offers interesting opportunities to researchers in other fields. Convenient search and analysis of large published bodies of recordings, transcripts, metadata, and annotations – as much as three or four orders of magnitude larger than a few decades ago – has created a trend towards “corpus phonetics,” whose benefits include greatly increased researcher productivity, better coverage of variation in speech patterns, and essential support for reproducibility.

The results of this work include insight into theoretical questions at all levels of linguistic analysis, along with applications in fields as diverse as psychology, sociology, medicine, and poetics, as well as within phonetics itself. Crucially, analytic inputs include annotation or categorization of speech recordings along many dimensions, from words and phrase structures to discourse structures, speaker attitudes, speaker demographics, and speech styles. Among the many near-term opportunities in this area we can single out the possibility of improving parsing algorithms by incorporating features from speech as well as text.

Due to semester-initial commitments at Penn, I won't be able to stay for COLING, but I'm looking forward to an interesting day of presentations at the workshop.

 



2 Comments

  1. Bill Benzon said,

    August 18, 2018 @ 6:22 am

    I'm thinking of all those podcasts hanging out on the web. I'd start with Joe Rogan, he of the Intellectual Dark Web.

    Why? Because he's got one of the most downloaded podcasts out there and so is, in some sense, culturally important. Also, his podcasts are interesting on a variety of fronts.

    For one thing, they're conversations, so you've got two, sometimes three or four, speakers. & some of those conversations get quite animated, so there's lots of specifically phonetic information (as opposed to phonemic, etc.) there to track.

    I've had a lot of fun transcribing and annotating a single 10-minute conversation Rogan had with his buddy Joey "Coco" Diaz, though I certainly wasn't even attempting to work at the phonological level (lack of competence). But thematically, turn-taking, Jakobson's 6 functions, lots going on.

  2. Ran Ari-Gur said,

    August 18, 2018 @ 2:17 pm

    > On Monday, I'm pursuing the quixotic enterprise of talking to an NLP workshop about phonetics.

    OK, I'll bite: why is that quixotic? Are NLP folks usually not interested in phonetics?
