Archive for Computational linguistics

Mapping the Demographics of American English with Twitter

[This is a guest post by David Bamman.]

It took me a while to really make sense of Twitter. For the longest time, it was (to me) the stomping ground of 14-year-olds and Ashton Kutcher, each issuing a minute-by-minute feed of their lives. Around the time Twitter arrived, however, I had just had a breakthrough on YouTube's enormous popularity – it was only after watching a dozen different videos of the Super Mario Brothers theme song performed a dozen different ways that I finally got it: I may not care about cats playing the keyboard or wedding parties dancing down the aisle, but somebody does, and without a distribution system for people to broadcast whatever their hearts felt like, I never would have had my life improved by that kid with the beatboxing flute or the one with the double guitar.

So I waited for a similar breakthrough with Twitter. It came, at long last, after I realized that it was exactly what I first thought it was: 14-year-olds (and Ashton Kutcher) chronicling the minutiae of their lives. It is colloquial language, constrained by 140 characters: everyday conversations about waiting in line at the grocery store, your flight just landing at ORD, what to do this Saturday night, "omg did u see hr dress?" In spurts it is, of course, much more than that, as its use during the protests of the 2009 Iranian election proved, but in its unmarked use, it's the language of how millions of people across the world talk to their friends.

Comments (26)

Free Summer School

Busy June 20 – June 26? Could you manage to squeeze one of the most intellectually intense weeks of your life into your summer schedule? For free?

I'm talking (once again!) about the North American Summer School in Logic, Language, and Information (NASSLLI 2010), of which I am program chair. It's aimed at graduate students, researchers, and advanced undergraduates, in fact anyone interested in formal approaches to language, philosophy, and computation. And I bring you, Language Log reader, some hot news that gives you the chance of attending the school and making 100-150 new friends for life for free… provided you apply by June 1.

Here's the news (and this is aimed at students). The National Science Foundation has given preliminary approval for a sizable grant to NASSLLI 2010. Together with other funds we have raised, this will enable us to provide students with financial support to attend the school. We expect to be able to reimburse the registration fees of about 40 deserving students, and to pay further travel expenses for those whose need is greatest. You can find online information on how to register and how to apply for the grants – see the "Support is Available from NASSLLI Itself" section on the NASSLLI grants page. Basically, you need to send NASSLLI an email with a reason why NASSLLI is relevant for you, and have your academic advisor send an email too.

I'm really, really looking forward to meeting many of you in Bloomington, Indiana at the end of next month, and if you want to ask me personally about it, send me an email.

Comments (5)

Death or birth?

The most recent IEEE Signal Processing Society Newsletter has an interesting article by David Suendermann, "Speech scientists are dead. Interaction designers are dead. Who is next?".

His argument is that "Commercial spoken dialog systems can process millions of calls per week", and therefore "one can implement a variety of changes at different points in the application and randomly choose one competitor every time the point is hit in the course of a call", using techniques like reinforcement learning to adaptively optimize the design. As a result, "the contender approach can change the life of interaction designers and speech scientists in that best practices and experience-based decisions can be replaced by straight-forward implementation of every alternative one can think of".
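As a rough illustration of the contender idea quoted above, here is a minimal Python sketch. It is not Suendermann's implementation: the class name `ContenderPoint`, the epsilon-greedy rule, and the per-call "success" flag are all assumptions made for this example, and epsilon-greedy is just one simple stand-in for the reinforcement learning techniques the article mentions.

```python
import random


class ContenderPoint:
    """One decision point in a dialog application with competing designs.

    Hypothetical illustration only: names and the epsilon-greedy rule
    are assumptions, not taken from the article.
    """

    def __init__(self, contenders, epsilon=0.1, seed=None):
        self.contenders = list(contenders)
        self.epsilon = epsilon            # fraction of calls used for exploration
        self.rng = random.Random(seed)    # private RNG so runs can be reproduced
        self.calls = {c: 0 for c in self.contenders}
        self.successes = {c: 0 for c in self.contenders}

    def success_rate(self, contender):
        """Observed success rate so far (0.0 if the contender is untried)."""
        n = self.calls[contender]
        return self.successes[contender] / n if n else 0.0

    def choose(self):
        """Pick a contender for the current call."""
        # With probability epsilon, try a contender uniformly at random.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.contenders)
        # Otherwise use the contender with the best observed success rate
        # (ties go to the one listed first).
        return max(self.contenders, key=self.success_rate)

    def record(self, contender, success):
        """Store the outcome of one call that used the given contender."""
        self.calls[contender] += 1
        self.successes[contender] += int(bool(success))
```

In use, the application would call `choose()` each time a call reaches the decision point and `record()` once the outcome of that call is known. Over many calls, traffic drifts toward whichever alternative has performed best so far, while a small share of calls keeps testing the others.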

Comments (17)

Second. Best. Summer. School. Ever.

NASSLLI 2010 is a week-long summer school that offers 15 superlative graduate-level courses and workshops on Language, Logic and Information from leading scholars, plus pre-session tutorials to bring you up to speed. And the price is incredibly low, just $150 for the entire week if you're a student and register by May 1.

"Second best"? We'll come to that.

Comments (4)

Uh accommodation?

In the course of an enjoyable conversation over lunch yesterday, Michael Chorost asked whether disfluency is contagious, in the sense (for example) that talking with someone who uses "uh" a lot would tend to lead you to behave similarly.  This seems plausible, since such effects can be found in most other variable aspects of speech and language use, so I promised to check — with a warning that causation is especially difficult to infer from correlation in such cases.

Comments (16)

Pictish writing?

According to Jennifer Viegas, "New Written Language of Ancient Scotland Discovered", Discovery News, 3/31/2010:

Once thought to be rock art, carved depictions of soldiers, horses and other figures are in fact part of a written language dating back to the Iron Age.

The ancestors of modern Scottish people left behind mysterious, carved stones that new research has just determined contain the written language of the Picts, an Iron Age society that existed in Scotland from 300 to 843.

The "new research" is described in Rob Lee, Philip Jonathan, and Pauline Ziman, "Pictish symbols revealed as a written language through application of Shannon entropy", Proceedings of the Royal Society A, in press.
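For readers unfamiliar with the quantity named in the paper's title, here is a minimal Python sketch of Shannon entropy computed over a sequence of symbols. It illustrates the basic measure only. The analysis in the paper itself is not described in this excerpt, so nothing here should be read as the authors' procedure, and the function name `shannon_entropy` is an assumption made for the example.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy, in bits per symbol, of the observed symbol frequencies.

    H = -sum(p * log2(p)) over the distinct symbols, where p is each
    symbol's relative frequency in the sequence.
    """
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A sequence that repeats one symbol has entropy 0, and a sequence that uses k symbols equally often has entropy log2(k) bits per symbol. For instance, `shannon_entropy("abab")` is 1.0 and `shannon_entropy("abcd")` is 2.0.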

Comments (53)

Clinical applications of speech technology

I'm spending this week at IEEE ICASSP 2010 in Dallas.  ICASSP stands for "International Conference on Acoustics, Speech and Signal Processing", and it's one of those enormous meetings with a couple of thousand attendees. This one has more than 120 sessions, with presentations on topics ranging from "Pareto-Optimal Solutions of Nash Bargaining Resource Allocation Games with Spectral Mask and Total Power Constraints" to "Matching Canvas Weave Patterns from Processing X-Ray Images of Master Paintings".

Comments (14)

Update on the annihilation of Computational Linguistics at KCL

[What follows is a guest post from Robin Cooper, Professor of Computational Linguistics, Department of Philosophy, Linguistics and Theory of Science, and Director of the Graduate School of Language Technology, University of Gothenburg. He reports on the ill-considered and appallingly executed destruction of the Computational Linguistics group at King's College London. — David Beaver]

The crisis at King's College, London and in particular the targeting for redundancy of its computational linguists and logicians has stirred significant international protest (see http://sites.google.com/site/kclgllcmeltdown/). Many hundreds of highly distinguished scholars from around the world have organized letters of protest querying the rationale behind these moves, which have happened at the same time as the College invested more than £20 million in acquiring Somerset House, a prime piece of central London real estate. Moreover, in contrast to universities that have undergone similar budgetary pressures in the US (e.g. in the UC system where senior faculty have been asked to take pay cuts in order to preserve jobs), at KCL moves towards firing permanent staff have been the first resort.

Comments (18)

The perils of recursion

Via BoingBoing and many LL readers:

Comments (24)

So many languages, so much technology…

Suppose you had 100 digital recorders and 800 small languages, all in a country the size of California, but in one of the remotest parts of the planet.  What would you do?  What would it take to identify and train a small army of language workers?  How could the recordings they collect be accessible to people who don't speak the language?  My answer to this question is linked below – but spend a moment thinking how you might do this before looking.  One inspiration for this work was Mark Liberman's talk The problems of scale in language documentation at the Texas Linguistics Society meeting in 2006, in a workshop on Computational Linguistics for Less-Studied Languages.  Another inspiration was observing the enthusiasm of the remaining speakers of the Usarufa language to maintain their language (see this earlier post).  About 9 months ago, I decided to ask Olympus if they would give me 100 of their latest model digital voice recorders.  They did, and the BOLD:PNG Project starts next week.  Please sign the guestbook on that site, or post a comment here, if you'd like to encourage the speakers of these languages who are getting involved in this new project.

Comments (13)

The annihilation of computational linguistics at KCL

[What follows is a guest post reporting on a very disturbing situation at King's College London involving the sacking of senior computational linguists and others in a secretly planned, tragically stupid, and farcically implemented mass-purge. The author of the post is currently employed at KCL, and for obvious reasons must remain anonymous here.

Although it is clear that KCL is suffering from severe budgetary problems, the administration has reacted to the problems inappropriately and unconscionably: the administration is sacking some of KCL's most successful, academically productive and influential scholars, showing arbitrariness and short-sightedness in its decision making, and acting with extreme callousness in the manner by which the decisions have been imposed on the victims.

For those out of the field, I would note that I and other Language Loggers are intimately acquainted with the work of those under fire at KCL. It is among the most important work in syntax, semantics, pragmatics, and computational linguistics, presenting ideas that many of us cite regularly and have absorbed into our own work, and which nobody in the field can ignore. – David Beaver]

Philosophers have been aghast at recent developments at King's College, London, where three senior philosophers, Prof Shalom Lappin, Dr Wilfried Meyer-Viol and Prof Charles Travis, have been targeted for redundancy as part of a restructuring plan for the KCL School of Arts and Humanities. The reason given for targeting Lappin and Meyer-Viol is that KCL is 'disinvesting' from Computational Linguistics. One of the many puzzling aspects of this supposed explanation is that there is no computational linguistics unit in Philosophy to disinvest from. (For detailed coverage see the Leiter Report here, here, and here, and these letters protesting the actions taken in the humanities.)

Comments (14)

Language-related efforts to help out in Haiti

Posting on behalf of Phil Resnik:

This post brings together a bunch of news about language-related efforts to help out in Haiti:

Comments (6)

More models of binomial order

Following up on "The order of ancestors" (12/24/2009) and "Sexual orders" (12/27/2009), I need to note one other important recent paper: Sarah Benor and Roger Levy, "The Chicken or the Egg? A Probabilistic Analysis of English Binomials", Language 82(2): 233-278, 2006. And several readers have pointed me to an older tradition of corpus linguistics that comes to a different set of conclusions about binomial ordering: Mishnah Keritot 6:9, etc.

Comments (8)