- Website: http://ling.upenn.edu/~myl
Posts by Mark Liberman:
Ethan Weston & Carter Woodiel, "Paper Fully Written By iOS Autocomplete Accepted By Physics Conference", Newsy 10/23/2016:
A nonsensical academic paper on nuclear physics written only by iOS autocomplete has been accepted for a scientific conference.
Christoph Bartneck, an associate professor at the Human Interface Technology laboratory at the University of Canterbury in New Zealand, received an email inviting him to submit a paper to the International Conference on Atomic and Nuclear Physics in the US in November.
“Since I have practically no knowledge of nuclear physics I resorted to iOS autocomplete function to help me writing the paper,” he wrote in a blog post on Thursday.
“I started a sentence with ‘atomic’ or ‘nuclear’ and then randomly hit the autocomplete suggestions. The text really does not make any sense.”
An unusually large number of people have suggested that I should post about the latest SMBC comic. Since I'm on the other side of the world, with slow and erratic internet, I'll just post the link, and note that (implied pornography aside) it would be a good phonetics assignment to replace Zach's letter strings with IPA symbols.
This post presents some stuff I did last March — I thought I had blogged about it but apparently I only put it into these lecture notes. It came up in some discussions today in Shanghai, because I thought that maybe similar visualizations might help explore prosodic differences between the speech of English native speakers and Chinese learners of English. This is going to get a little wonkish, so let's start with a picture:
From Francois Lang:
Attached is a photo of a sign in the washroom at Heckman's Deli in Bethesda, MD
I kept waiting for all the employees to wash my hands. I even asked. But nothing. Maybe it was something I said?
Today at ISCSLP2016, Xuedong Huang announced a striking result from Microsoft Research. A paper documenting it is up on arXiv.org — W. Xiong, J. Droppo, X. Huang, F. Seide, M. Seltzer, A. Stolcke, D. Yu, G. Zweig, "Achieving Human Parity in Conversational Speech Recognition":
Conversational speech recognition has served as a flagship speech recognition task since the release of the DARPA Switchboard corpus in the 1990s. In this paper, we measure the human error rate on the widely used NIST 2000 test set, and find that our latest automated system has reached human parity. The error rate of professional transcriptionists is 5.9% for the Switchboard portion of the data, in which newly acquainted pairs of people discuss an assigned topic, and 11.3% for the CallHome portion where friends and family members have open-ended conversations. In both cases, our automated system establishes a new state-of-the-art, and edges past the human benchmark. This marks the first time that human parity has been reported for conversational speech. The key to our system's performance is the systematic use of convolutional and LSTM neural networks, combined with a novel spatial smoothing method and lattice-free MMI acoustic training.
Listening to Donald Trump's 10/14/2016 speech in Charlotte NC, I noticed something that I hadn't noticed in listening to his earlier speeches. He often uses a loud isolated monosyllable as a way of transitioning between phrases — and perhaps also as a substitute for the filled pauses that he almost never uses. Some of these transitional syllables are particles like and, but, so, yet; some of them are subject pronouns, especially we. These are all words that are usually "cliticized", that is, merged phonologically with a following word — and Trump sometimes pronounces them that way. But here's a sample of his isolated ANDs from the Charlotte speech:
Katy Steinmetz, "How Ruth Bader Ginsburg found her voice", Time Magazine:
For three years, NYU linguistics professor emeritus John Victor Singler, along with researchers Nathan LaFave and Allison Shapp, pored over hours of audio of Ginsburg’s remarks at the Supreme Court. They used computer programs to analyze thousands of vowel and consonant utterances during her time arguing cases in the 1970s, and then from the early ’90s onward, after she returned to the court in robes. While one can hear flecks of classic New York features in Lawyer Ginsburg’s remarks—like the pursed, closed-mouthed vowels—her Brooklyn roots are more obvious in the speech of Justice Ginsburg, they found.
Their theory, reported here for the first time, is that “conscious or not,” the lawyer was doing something everyone does, what is known in linguistics as accommodation: adapting our ways of communicating depending on who we’re talking to. Accommodating can be done through word choice, pronunciation, even gestures. A common example would be when someone returns to the town where they grew up and their accent comes roaring back as they talk to friends and family who sound that way, too.
This is the first time that I can recall having seen embedded Soundcloud audio clips in a publication of this kind.
Since Bob Dylan got the Nobel Prize for Literature, here's an old music video with some words to open discussion:
(I'm in China for ten days — Beijing, Tianjin, Shanghai — so posting may be a bit erratic…)
An interesting example of meaningful uh:
As an athlete, I've been in locker rooms my entire adult life and uh, that's not locker room talk.
— Sean Doolittle (@whatwouldDOOdo) October 10, 2016
The effect seems different from um, in a subtle way.
Mollymooly's comment on yesterday's post ("The Donald's THE, again") deserves general attention:
1. A leopard is bigger than a cheetah, though both have spots.
2. The leopard is bigger than the cheetah, though both have spots.
3. Leopards are bigger than cheetahs, though both have spots.
4. The leopards are bigger than the cheetahs, though both have spots.
5. Your leopard is bigger than your cheetah, though both have spots.
6. Your leopards are bigger than your cheetahs, though both have spots.
For me at least, 1 and 3 are generic; 2 can be either generic or specific; ditto 5 and 6 (though generic is very informal); but 4 must be specific. There seem to be restrictions on when "the + plural-noun" can be generic: are these restrictions syntactic, semantic, pragmatic?
From the other side of the Atlantic, I agree with her judgments. Does the intuited specificity of 4 help us understand what's odd about Donald Trump's use of "the women", "the gays", etc.?
There are several literatures (from philosophy of language as well as linguistics) that converge here, and perhaps someone who knows them better than I do can summarize.
One comment: this is an area where there are subtle differences even among those languages that have categories approximately corresponding to English plurality and English definite or indefinite determiners. The Romance languages are clearly different from English here, but are they all the same among themselves? What about Germanic languages?
THE African Americans. THE Latinos. THE women. Objects. You use "the" in front of objects, not people. #debate
— Diana Prichard (@diana_prichard) October 10, 2016
It's not really true that "you use 'the' in front of objects, not people" — today's NYT is full of phrases like "until now the Russians have been on board with regard chemical weapons"; "It is also, as the French like to say, digestible"; "a town in which the inhabitants were abandoned to their executioner". But Diana Prichard is on to something, and she's not the first to notice it.
Oliver Darcy, "REBELLION: RNC staffers 'defying orders' to keep working for Trump, source says", Business Insider 10/8/2016.
So how are those staffers defying orders? Are they ceasing to work for Trump despite orders to continue? In that case, it's "orders to keep working for Trump" that they're defying. Or are they defying instructions (to stop), (in order) to keep working for Trump?
Aaron Dinkin points out that the headline is perfectly ambiguous in this respect. And interestingly, both meanings are consistent with what we know about disagreement and confusion within the Republican party.
"Paul Ryan Refers to Furor Over Trump as Elephant in the Room", Bloomberg News 10/8/2016:
Speaker of the House Paul Ryan spoke at the GOP “Fall Fest” unity event in his home district in Wisconsin. While he did not directly address Donald Trump’s crude and sexually aggressive remarks about women in a 2005 recording, he did refer to the furor over the comments as “a bit of an elephant in the room.” Ryan did hear boos, as did Representative Jim Sensenbrenner, who was heckled by a Trump supporter.
The passage in question:
let me just start off by saying
there is a bit of an elephant in the room
and it is a troubling situation I'm serious it is
I put out a statement about this last night
I meant what I said and it's still how I feel