- Website: http://ling.upenn.edu/~myl
Posts by Mark Liberman:
This is a reality check on the current state of automatic speech recognition (ASR) algorithms. I took the 186-word passage by Scottie Nell Hughes discussed in yesterday's post, and submitted it to two different Big-Company ASR interfaces, with amusing results. I'll be interested to see whether other systems can do better.
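For readers who want to put a number on how well an ASR system did, the usual yardstick is word error rate (WER): the word-level edit distance between a reference transcript and the system's output, divided by the number of words in the reference. What follows is a minimal sketch, not the scoring used in the post. The function name `word_error_rate` and the sample sentences are invented for illustration, and the normalization (lowercasing and whitespace splitting only) is an assumption that a real evaluation would refine.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: lowercasing and whitespace splitting are enough
    normalization; punctuation is NOT stripped in this sketch.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the output
                curr[j - 1] + 1,     # extra word inserted in the output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up sentences, not taken from the passage discussed in the post.
    reference = "the cat sat on the mat"
    hypothesis = "the cat sit on mat"
    print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
```

Because insertions count as errors, WER can exceed 1.0 when the system outputs many more words than the reference contains.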
I was surprised to learn that Scottie Nell Hughes has a broadcast communications/political science degree from the University of Tennessee at Martin, rather than a degree in literary theory from Florida International University. This makes her ideas about the relationship of texts to states of affairs all the more remarkable, since she has apparently developed them independently of Stanley Fish, rather than under his guidance.
Here's the most recent evidence of her theoretical sophistication:
From Harry Asche:
I am a tunnel engineer, and the patron saint of tunnelling is St Barbara. Her saint's day is the 4th of December. I am a bit of a tunnel nut, so on a visit to a church in Europe, I bought a St Barbara card, printed on a handy credit-card sized piece of plastic.
Here is the Prayer to St Barbara, transcribed exactly:
"O God, who among the other miracles of Your power, have given even to the weaker sex the victory of martyrdom, grant, we beseech You, that we, who are celebrating the heavenly birthday of Blessed Barbara, Your Virgin and Martyr, may, by her example, draw nearer to you. Amen."
I am greatly impressed by the complexity of the first sentence. The core is "O God grant that we may draw nearer to you." But this is interrupted by no less than six additional clauses. "That we" and "may" live in little islands on their own.
Is this a result of translation from the Latin? I did five years of Latin at school and it had a lasting negative effect on my ability to write English.
This is a guest post by Bill Badecker, Linguistics Program Director at the National Science Foundation.
Subject: Language & Communication: Request for Information
“As the car is hurtling towards the cliff, it’s driving on quicksand,” Levitt said.
The last month or so has seen renewed discussion of the benefits and dangers of artificial intelligence, sparked by Stephen Hawking's speech at the opening of the Leverhulme Centre for the Future of Intelligence at Cambridge University. In that context, it may be worthwhile to point again to the earliest explicit and credible AI warning that I know of, namely Norbert Wiener's 1950 book The Human Use of Human Beings [emphasis added]:
[T]he machine plays no favorites between manual labor and white-collar labor. Thus the possible fields into which the new industrial revolution is likely to penetrate are very extensive, and include all labor performing judgments of a low level, in much the same way as the displaced labor of the earlier industrial revolution included every aspect of human power. […]
The introduction of the new devices and the dates at which they are to be expected are, of course, largely economic matters, on which I am not an expert. Short of any violent political changes or another great war, I should give a rough estimate that it will take the new tools ten to twenty years to come into their own. […]
Let us remember that the automatic machine, whatever we think of any feelings it may have or may not have, is the precise economic equivalent of slave labor. Any labor which competes with slave labor must accept the economic conditions of slave labor. It is perfectly clear that this will produce an unemployment situation, in comparison with which the present recession and even the depression of the thirties will seem a pleasant joke. This depression will ruin many industries, possibly even the industries which have taken advantage of the new potentialities. However, there is nothing in the industrial tradition which forbids an industrialist to make a sure and quick profit, and to get out before the crash touches him personally.
Thus the new industrial revolution is a two-edged sword. It may be used for the benefit of humanity, but only if humanity survives long enough to enter a period in which such a benefit is possible. It may also be used to destroy humanity, and if it is not used intelligently it can go very far in that direction.
The "silly AI doing something stupidly funny" trope is a powerful one, partly because people like to see the mighty cast down, and partly because the "silly stupid AI" stereotype is often valid.
But as with stereotypes of human groups, the most viral examples are often fakes. Take the "Voice Recognition Elevator" skit from a few years ago, which showed an ASR system that was baffled by a Scottish accent, epitomizing the plight of Scots trapped in a dehumanized world that doesn't understand them. But back in the real world, I found that when I played the YouTube version of the skit to the 2010 version of Google Voice on my cell phone, it correctly transcribed the whole thing.
And I suspect that the recent viral "tuba-to-text conversion" meme is another artful fraud.
Recently, a series of serendipitous connections led me to read Mary Astell's work, A serious proposal to the ladies, for the advancement of their true and greatest interest, first published in 1694. And this experience led me to two questions, the first of which is, Why in the world are Mary Astell's works not available in a readable plain text form, from sources like Project Gutenberg and Wikisource?
Astell's Wikipedia entry explains that she "was one of the first English women to advocate the idea that women were just as rational as men, and just as deserving of education." And she is important enough to merit an entry in the Stanford Encyclopedia of Philosophy, which describes at length her contributions to metaphysics and epistemology.
I know that the first-order reason for this lacuna is that OCR is still pathetically incapable of dealing with 17th-century printing, and that no volunteers have stepped forward to transcribe her writings from the available paper or image sources. But this doesn't really answer the question, it just moves it back a step.
Anyhow, my second question is one that I've wondered about before, without ever trying to find an answer: Why did authors from Astell's time distribute initial capital letters in the apparently erratic way that they did?
Yesterday evening I wound up spending several hours in the Ezeiza airport in Buenos Aires, and the result was a brilliant idea. Or maybe an idle fantasy — you decide.
The Oxford Dictionaries 2016 Word Of The Year is post-truth, which they define as "relating to or denoting circumstances in which objective facts are less influential in shaping public opinion than appeals to emotion and personal belief". Here's their graph of its recent rise in frequency over the past seven months:
It's been a while since we posted a nomination for the Trent Reznor Prize for Tricky Embedding — I believe that the most recent nomination was in April of 2012. But here's a worthy suggestion from Laura Bailey:
— Laura Bailey (@linguistlaura) November 13, 2016
From Jenny Chu, on November 9:
I am a long-time follower of Language Log but usually comment on the Chinese and Vietnamese related topics by Prof. Mair. Yet I thought you might be amused by the attached conversation. It shows some nice examples of the playfulness and creativity of the human language faculty, as well as some nicely ironic / self-conscious prescriptivist poppycock.
The conversation starts like this:
Click here to read the whole (long) thing.
A few days ago, someone asked me a question about a common situation that's rarely discussed: How can an adult learn to communicate in a language they don't know, without access to courses and books and instructors? And what if the problem isn't just lack of foresight and preparation, because no courses or books or instructors exist for the language or dialect in question?
This question's background is an international development project, where many of the people to be reached are illiterate speakers of undocumented and unwritten languages, and are also often not fluent in the local lingua franca.
Some people may be skeptical of various aspects of the premise. But let's grant it and try to address the question.