Language Log

Archive for Computational linguistics

New approaches to Alzheimer's Disease

April 8, 2020 @ 7:58 am· Filed by Mark Liberman under Biology of language, Computational linguistics

This post is another pitch for our on-going effort to develop simple, easy, and effective ways to track neurocognitive health through short interactions with a web app. Why do we want this? Two reasons: first, early detection of neurodegenerative disorders through near-universal tracking; and second, easy large-scale evaluation of interventions, whether those are drugs or lifestyle changes. You can participate by enrolling at https://speechbiomarkers.org, and suggesting it to your friends and acquaintances as well.

Today, diagnosis generally depends on scoring below a certain value on cognitive tests such as the MMSE, which usually won't even be given until you've started experiencing life-changing symptoms — and at that point, the degenerative process has probably been at work for a decade or more. This may well be too late for interventions to make a difference, which may help explain the failure of dozens of Alzheimer's disease drug trials. And it's difficult and expensive to evaluate an intervention, in part because it requires a series of clinic visits, making it hard to fund support for trials that don't involve a patented drug.

If people could accurately track their neurocognitive health with a few minutes a week on a web app, they could be alerted to potential problems by the rate of change in their scores, even if they're many years away from a diagnosis by today's methods. Of course, this will be genuinely useful only when we have ways to slow or reverse the process — but the same approach can be used to evaluate such interventions inexpensively on a large scale.

More background is here: "Towards tracking neurocognitive health", 3/24/2020. As that post explains, this is just the first step on what may be a long journey — but we will be making the data available to all interested researchers, so that the approaches that have worked elsewhere in AI research over the past 30 years can be be applied to this problem as well.

Again, you can participate by enrolling at https://speechbiomarkers.org . And please spread the word!

Read the rest of this entry »

Permalink Comments (2)

What do you hear?

March 1, 2020 @ 2:49 am· Filed by Mark Liberman under Computational linguistics, Psychology of language

Listen to this sound, and describe it in the comments below:

You can learn what the sound is, and why I care how you hear it, after the fold.

Read the rest of this entry »

Permalink Comments (41)

Standardized Project Gutenberg Corpus

December 28, 2019 @ 11:16 am· Filed by Mark Liberman under Computational linguistics

Martin Gerlach and Francesc Font-Clos, "A standardized Project Gutenberg corpus for statistical analysis of natural language and quantitative linguistics", arXiv 12/19/2018:

The use of Project Gutenberg (PG) as a text corpus has been extremely popular in statistical analysis of language for more than 25 years. However, in contrast to other major linguistic datasets of similar importance, no consensual full version of PG exists to date. In fact, most PG studies so far either consider only a small number of manually selected books, leading to potential biased subsets, or employ vastly different pre-processing strategies (often specified in insufficient details), raising concerns regarding the reproducibility of published results. In order to address these shortcomings, here we present the Standardized Project Gutenberg Corpus (SPGC), an open science approach to a curated version of the complete PG data containing more than 50,000 books and more than 3×10⁹ word-tokens. Using different sources of annotated metadata, we not only provide a broad characterization of the content of PG, but also show different examples highlighting the potential of SPGC for investigating language variability across time, subjects, and authors. We publish our methodology in detail, the code to download and process the data, as well as the obtained corpus itself on 3 different levels of granularity (raw text, timeseries of word tokens, and counts of words). In this way, we provide a reproducible, pre-processed, full-size version of Project Gutenberg as a new scientific resource for corpus linguistics, natural language processing, and information retrieval.

Read the rest of this entry »

Permalink Comments (1)

Long ago, in a narratology far away…

December 20, 2019 @ 8:56 am· Filed by Mark Liberman under Computational linguistics, Linguistics in the comics

Louisa Shepard, "‘May the force be with you’ and other fan fiction favorites", Penn Today 12/18/2019:

Starting with Star Wars, Penn researchers create a unique digital humanities tool to analyze the most popular phrases and character connections in fan fiction. […]

The Penn team started with the script of “Star Wars: The Force Awakens” and created algorithms to analyze the words in the script against those in millions of fan fiction stories. The unique program identifies the most popular phrases, characters, scenes, and connections that are repurposed by these writers and then displays them in a simple graph format.

The results are now available on their “fan engagement meter” at https://fanengagement.org.

Serendipitously, today's xkcd:

Read the rest of this entry »

Permalink Comments (1)

Mrs. Transformer-XL Tittlemouse

December 18, 2019 @ 8:19 pm· Filed by Mark Liberman under Computational linguistics, Elephant semifics

This is another note on the amazing ability of modern AI learning techniques to imitate some aspects of natural-language patterning almost perfectly, while managing to miss common sense almost entirely. This probably tells us something about modern AI and also about language, though we probably won't understand what it's telling us until many years in the future.

Today's example comes from Zihang Da et al., "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context", arXiv 6/2/2019.

Read the rest of this entry »

Permalink Comments (5)

Canoe schemata nama gary anaconda

December 17, 2019 @ 10:44 am· Filed by Mark Liberman under Computational linguistics, Psychology of language

Following up on recent posts suggesting that speech-to-text is not yet a solved problem ("Shelties On Alki Story Forest", "The right boot of the warner of the baron", "AI is brittle"), here's a YouTube link to a lecture given in July of 2018 by Michael Picheny, "Speech Recognition: What's Left?" The whole thing is worth following, but I particularly draw your attention to the section starting around 50:06, where he reviews the state of human and machine performance with respect to "noise, speaking style, accent, domain robustness, and language learning capabilities", with the goal to "make the case that we have a long way to go in [automatic] speech recognition".

Read the rest of this entry »

Permalink Comments (4)

AI is brittle

December 11, 2019 @ 4:21 am· Filed by Mark Liberman under Computational linguistics, Elephant semifics

Following up "Shelties On Alki Story Forest" (11/26/2019) and "The right boot of the warner of the baron" (12/6/2019), here's some recent testimony from engineers at Google about the brittleness of contemporary speech-to-text systems: Arun Narayanan et al., "Recognizing Long-Form Speech Using Streaming End-To-End Models", arXiv 10/24/2019.

The goal of that paper is to document some methods for making things better. But I want to underline the fact that considerable headroom remains, even with the massive amounts of training material and computational resources available to a company like Google.

Modern AI (almost) works because of machine learning techniques that find patterns in training data, rather than relying on human programming of explicit rules. A weakness of this approach has always been that generalization to material different in any way from the training set can be unpredictably poor. (Though of course rule- or constraint-based approaches to AI generally never even got off the ground at all.) "End-to-end" techniques, which eliminate human-defined layers like words, so that speech-to-text systems learn to map directly between sound waveforms and letter strings, are especially brittle.

Read the rest of this entry »

Permalink Comments (6)

The right boot of the warner of the baron

December 6, 2019 @ 8:50 am· Filed by Mark Liberman under Computational linguistics

Here at the UNESCO LT4All conference, I've noticed that many participants assert or imply that the problems of human language technology have been solved for a few major languages, especially English, so that the problem on the table is how to extend that success to thousands of other languages and varieties.

This is not totally wrong — HLT is a practical reality in many applications, and is being rapidly spread to others. And the problem of digitally underserved speech communities is real and acute.

But it's important to understand that the problems are not all solved, even for English, and that the remaining issues also represent barriers for extensions of the technology to other communities, in that the existing approximate solutions are far too hungry for data and far too short on practical understanding and common sense.

Read the rest of this entry »

Permalink Comments (9)

Command your kitchen

November 29, 2019 @ 11:05 am· Filed by Mark Liberman under Computational linguistics

…or at least the faucets in it, using Delta's VoiceIQ Technology.

Delta VoiceIQ Technology pairs with your connected home device to give you exactly the amount of water you need with features like metered dispensing and custom container commands.

I have to say that being able to tell my kitchen faucet to dispense 137 milliliters of hot water, or whatever, is not high on my list of desires. I'm happy enough with good old-fashioned indoor plumbing, reliable supplies of potable water, and filters to take care of residual issues. But apparently the market-research folks at Delta think that the faucet-buying public is more forward-looking than I am.

Read the rest of this entry »

Permalink Comments (4)

Shelties On Alki Story Forest

November 26, 2019 @ 7:06 am· Filed by Mark Liberman under Computational linguistics, Language and medicine

Last week I gave a talk at an Alzheimer's Association workshop on "Digital Biomarkers". Overall I told a hopeful story, about the prospects for a future in which a few minutes of interaction each month, with an app on a smartphone or tablet, will give effective longitudinal tracking of neurocognitive health.

But I emphasized the fact that we're not there yet, and that some serious research and development problems stand in the way. In particular, the current state of the art in speech recognition is not yet good enough for reliable automated evaluation of spoken responses.

Read the rest of this entry »

Permalink Comments (2)

A diarization corpus from Amazon

November 5, 2019 @ 11:00 am· Filed by Mark Liberman under Computational linguistics

About a month ago, Zaid Ahmed and others in Amazon's speech research group released DiPCo ("Dinner Party Corpus"), "a new data set that will help speech scientists address the difficult problem of separating speech signals in reverberant rooms with multiple speakers".

The past decade has seen striking progress in Human Language Technology, brought about by new methods, more training data, and (especially) cheaper/faster computers. But this rapid progress highlights the fact that "All problems are not solved", as I wrote last year — and in particular, the central problem of "diarization", or determining who spoken when, has turned out to be a surprisingly difficult one. And diarization is not just hard for conversations at dinner parties.

Read the rest of this entry »

Permalink Comments (2)

Kabbalist NLP

November 2, 2019 @ 10:31 am· Filed by Mark Liberman under Computational linguistics, Elephant semifics

Oscar Schwartz, "Natural Language Processing Dates Back to Kabbalist Mystics", IEEE Spectrum 10/28/2019 ("Long before NLP became a hot field in AI, people devised rules and machines to manipulate language"):

The story begins in medieval Spain. In the late 1200s, a Jewish mystic by the name of Abraham Abulafia sat down at a table in his small house in Barcelona, picked up a quill, dipped it in ink, and began combining the letters of the Hebrew alphabet in strange and seemingly random ways. Aleph with Bet, Bet with Gimmel, Gimmel with Aleph and Bet, and so on.

Abulafia called this practice “the science of the combination of letters.” He wasn’t actually combining letters at random; instead he was carefully following a secret set of rules that he had devised while studying an ancient Kabbalistic text called the Sefer Yetsirah. This book describes how God created “all that is formed and all that is spoken” by combining Hebrew letters according to sacred formulas. In one section, God exhausts all possible two-letter combinations of the 22 Hebrew letters.

By studying the Sefer Yetsirah, Abulafia gained the insight that linguistic symbols can be manipulated with formal rules in order to create new, interesting, insightful sentences. To this end, he spent months generating thousands of combinations of the 22 letters of the Hebrew alphabet and eventually emerged with a series of books that he claimed were endowed with prophetic wisdom.

Permalink Comments (6)

Lombroso and Lavater, reborn as fake AI

October 22, 2019 @ 1:30 pm· Filed by Mark Liberman under Computational linguistics, HLT

Drew Harwell, "A face-scanning algorithm increasingly decides whether you deserve the job", WaPo 10/22/2019:

An artificial intelligence hiring system has become a powerful gatekeeper for some of America’s most prominent employers, reshaping how companies assess their workforce — and how prospective employees prove their worth.

Designed by the recruiting-technology firm HireVue, the system uses candidates’ computer or cellphone cameras to analyze their facial movements, word choice and speaking voice before ranking them against other applicants based on an automatically generated “employability” score.

HireVue’s “AI-driven assessments” have become so pervasive in some industries, including hospitality and finance, that universities make special efforts to train students on how to look and speak for best results. More than 100 employers now use the system, including Hilton, Unilever and Goldman Sachs, and more than a million job seekers have been analyzed.

Read the rest of this entry »

Permalink Comments (21)

« Previous Page — « Previous Entries

Next Entries » — Next Page »

Archive for Computational linguistics

New approaches to Alzheimer's Disease

What do you hear?

Standardized Project Gutenberg Corpus

Long ago, in a narratology far away…

Mrs. Transformer-XL Tittlemouse

Canoe schemata nama gary anaconda

AI is brittle

The right boot of the warner of the baron

Command your kitchen

Shelties On Alki Story Forest

A diarization corpus from Amazon

Kabbalist NLP

Lombroso and Lavater, reborn as fake AI

Follow us on Twitter

Archives [+/–]

Blogroll [+/–]

Meta