OK Google

A couple of days ago, I gave a talk at the Centre Cournot on the topic "Why Human Language Technology (almost) works" ("Pourquoi les technologies de la langue et du discours marchent enfin (ou presque)"), and for the introduction, I tried giving Google Now a few questions and instructions on my Android phone.

In case you're not familiar with this feature, you start it up by saying "OK Google", followed by the question you want it to answer or the instruction you want it to follow.

And since the starting-point of my talk was that HLT now actually works well enough to be useful, I was glad to see that my little experiment worked pretty well.

Here are the first few things I tried:

Question: "OK Google, what is the French word for 'dog'?"
Transcription: "what is the French word for dog?"
Answer (spoken as well as shown in text): "chien"

Question: "OK Google, what is 15 degrees centigrade in Fahrenheit?"
Transcription: "what is 15 degrees centigrade in Fahrenheit?"
Answer (spoken as well as shown in text): "15 degrees Celsius is 59 degrees Fahrenheit."

Question: "OK Google, What's the name of the student newspaper at the University of Pennsylvania?"
Transcription: "What's the name of the student newspaper at the University of Pennsylvania?"
Answer: A page of search links, with the Daily Pennsylvanian at the top.

Question: "OK Google, Note to self — buy paper towels."
Transcription: "note to self buy paper towels"

Answer:

Question: "what is the URL of Language Log?"
Transcription: "what is the URL of language log"
Answer: A list of search results, topped by the Language Log Facebook page.

At this point, I began to worry that the "almost" qualifier of my title might be in danger, at least without introducing some background noise or simulating laryngitis, so I tried something weird. One of the few books that I brought with me to France was ggplot2 by Hadley Wickham, and it was sitting on the corner of my desk, so I asked

Question: "OK Google, when was Hadley Wickham's book ggplot2 published?"
Transcription: "when was Hadley Wickham zbook ggplot2 published"
Answer: Page of search results with the Amazon listing for ggplot2 at the top.

How they got zbook into their lexicon and language model is a mystery, but the whole thing still basically worked, even if getting to the answer required drilling down into the listing for the book. So going further into the improbable, I asked:

Question: "OK Google, what is the word for 'dog' in Hausa?"
Transcription: "what is the word for dog in hausa"
Answer: "Here is your translation:

In search of some more convincing failures, I turned to Google Translate. And there I confirmed my prior belief that pronouns and idiomatic fixed expressions sometimes remain a problem.

For example, in translating sentences from the Cournot Center's "Présentation" page, I found things like this:

Le Centre Cournot est une association soutenue par la Fondation Cournot, placée sous l'égide de la Fondation de France. Elle porte le nom du mathématicien et philosophe franc-comtois Augustin Cournot (1801-1877), reconnu de longue date comme un pionnier de la discipline économique.

The Cournot Centre is an association supported by the Cournot Foundation, under the aegis of the Fondation de France. It is named after the mathematician and philosopher Franche-Comte Augustin Cournot (1801-1877), long recognized as a pioneer of economic discipline.

The phrase "la discipline économique" ought to be "the discipline of economics", not "economic discipline", which sounds like another way of saying "balanced budgets" or the like.

Google Translate did correctly render elle as "it" rather than "she". But a bit later in the text, we get two instances of il referring to "le centre", where the first one is translated as "it" but the second one as "he":

Le Centre n'est pas un laboratoire de recherche, il n'est pas non plus un centre de réflexion. Il jouit de l'indépendance singulière d'un catalyseur.

The Centre is not a research laboratory, it is not a think tank. He enjoys the singular independence of a catalyst.

Finally, I tried the opening lines of a recent roman policier I've been reading, Yasmina Khadra's Le dingue au bistouri:

Il y a quatre choses que je déteste.
Un: qu'on boive dans mon verre.
Deux: qu'on se mouche dans un restaurant.
Trois: qu'on me pose un lapin.
[…]

There are four things I hate.
A: we drink in my glass.
Two: we will fly in a restaurant.
Three: I get asked a rabbit.
[…]

Finally, some support for my "almost"! The first two instances of on should be translated as "somebody", not "we"; on se mouche means "somebody blows their nose", not "we will fly"; and on me pose un lapin means "somebody stands me up", not "I get asked a rabbit" (though "I get asked" for "on me pose" is a good try…).

And a final practical example: on my way out the door, planning to walk to the location of the talk, I asked

"OK Google, Navigate to Télécom Paris Tech"

with my best French pronunciation of the destination, and got the completely unhelpful transcription: "Navigate to telecom Perry tech".  (It seems that there is a "Perry Technical Institute" in Yakima, WA — and Google helpfully told me about all the possible air travel connections…)

But when I asked again with the normal English pronunciation of "Paris", the request worked, and landed me in Google maps navigation with an appropriate destination.



14 Comments

  1. Phillip Minden said,

    May 23, 2015 @ 4:27 am

    For the cause of demonstrating "almost", it's a pity Google got "qu'on" right in principle.

  2. Laura Morland said,

    May 23, 2015 @ 4:35 am

    Loved this post! I, too, have discovered how thoroughly (if not spookily) Google mines popular culture for the names of books, song titles, and the like. On the other hand, I haven't discovered the key to making Google voice detection understand when I am speaking French and not English. Sometimes it works perfectly, and other times it's rendered as hilariously garbled French.

    I do have one possible clue: Google searches for context. Yesterday I was texting my niece back and forth in English, and then suddenly switched to French, which Google interpreted as gobbledygook. On a hunch, I "left the conversation" and started a fresh text, which it rendered perfectly into French (and all subsequent texts as well).

    I also know that you can't mix languages in one message. If I want to insert a French word into an English text, I deliberately pronounce it in the American way; e.g., PAH-riss, Bas-TEEL, and that gets me the desired result.

  3. Laura Morland said,

    May 23, 2015 @ 4:37 am

    P.S. Did your "note to self" end up in your Calendar? (If not, how did you remember to pick up the essuie-tout ?)

  4. James Sinclair said,

    May 23, 2015 @ 8:24 am

    My wife was experimenting with this a few months ago and I suggested asking, "where is the next Super Bowl?" The answer was a page of search results, mostly about the Super Bowl that had just been played, and a few about the Super Bowl in general—nothing that provided the information we were looking for in any obvious way.

    If you ask the same question in non-relative terms—"where is Super Bowl 50?" or "where is the 2016 Super Bowl?"—you'll get the answer (Levi's Stadium in Santa Clara). But Google still seems to struggle with questions where the desired information depends in part on when the question was asked.

  5. Faith said,

    May 23, 2015 @ 1:17 pm

    @Laura Morland–Last week I completely baffled Google Now while trying to find a shuttle bus from Charles de Gaulle airport into Paris. My mistake was pronouncing Charles de Gaulle in French instead of English like the rest of the query. I eventually tried it using an exaggerated American pronunciation and it worked. I just tried it again, and now the French pronunciation works. Spooky.

  6. Mark Meckes said,

    May 23, 2015 @ 2:47 pm

    My personal most reliable source of Google Translate failures is in translating negative constructions between French and English.

  7. Pflaumbaum said,

    May 23, 2015 @ 3:11 pm

    Using a less widely spoken language on Google Translate gets you deep into 'almost' territory.

    In my experience the Romanian version is very hit-and-miss – though also very useful.

  8. D.O. said,

    May 23, 2015 @ 5:21 pm

    So what is the fourth thing the guy detests? Let me google… OK, now
    Quatre : rester là, à ne rien foutre, dans mon bureau minable au fond d'un couloir cafardeux où les relents des latrines et les courants d'air adorent flûter.
    which GT translates as
    Four: Stay there, do not cum in my shabby office at the end of a hallway where cafardeux hints latrines and drafts love Fluter., which is singularly unhelpful (anyways, none of the previous translation errors were because of the speech recognition, it's translation that was wrong).

    BTW, GT cannot translate cafardeux though it supplies a dictionary definition "Qui a le cafard, mélancolique". Why is that? Cafard is first and foremost a cockroach, but GT knows the expression avoir le cafard and mélancolique is even easier…

  9. Ran Ari-Gur said,

    May 23, 2015 @ 10:11 pm

    I also like that Google translated "Un:" as "A:" rather than "One:". It almost works, just by coincidence, except for the non-parallelism with "Two:" rather than "B:" on the following line.

  10. Dominik Lukes (@techczech) said,

    May 24, 2015 @ 1:44 am

    This is a very one-sided Google-friendly interpretation of almost. You asked Google the sorts of questions it gets right often enough for some people, not often enough for me to rely on in daily use. I use it to set timers on my Android Wear watch and it's 100% accurate if also very slow. If I try to search for anything more complex, I always have to guess at what the Google-friendly way is.

    This is not human language. It is human language seen through the interface and limitation (API) of the technology. Try dictating and self-correcting. Have a look at automatically generated subtitles on YouTube. Try to translate a text with multiple actors and actions. I've tried a few Google translate articles on Czech and while it can get the gist of individual sentences, you end up not knowing what happened.

    In fact, human language technology (almost) never works. What this captured was the (almost) when it occasionally works. What we have is a human_language-technology interface which is quite remarkable given what we had 5 years ago, but we also don't have any guarantees that this will scale up because the underlying structural implementations are still very linguistically basic. I worry that we may be reaching the peak-frequentism and mining every next bit of functionality will be more and more difficult.

  11. Bean said,

    May 25, 2015 @ 10:07 am

    I would also add, you were very helpful to Google by phrasing your question in the way that we have all figured out to phrase search queries, which is not how you would ask an actual person in front of you (at least sometimes). There was a lot of structure to the questions and they were all structured similarly. We have all learned to "translate" what we really want to know into decent Google queries, having used it so much, but we don't really talk like that.

    e.g. 1: Depending on context, I might say, "How do you say "dog" in French?" Presumably google can also handle that.

    e.g. 2: A question to a person might be part of a conversation with articles or pronouns that might still confuse google:

    A. Did you read the new book by Hadley Wickham?
    B. He has a new book out?
    A. Yeah, it's called ggplot2
    B. Huh. When did it come out?
    A. Dunno, this year, I think. Let's Google it!

  12. Robot Therapist said,

    May 25, 2015 @ 2:05 pm

    I am sure we have all learned, perhaps unconsciously, how to phrase queries to Google so that they work. It has trained us, as well as us training it.

    I remember about 25 years ago being shown (by John Sowa) a natural language database query interface (keyboard, not voice). The database included a table of names and birthdays. You could ask for example "who has a birthday in October?" However the query "whose birthday is it today" broke it.

  13. Yosemite Semite said,

    May 25, 2015 @ 10:57 pm

    I recently have had occasion to translate some articles in German newspapers into English. I am a German speaker, although not a native speaker, and went to the university in Germany, but sometimes I find it easier to give the translation a shot with Google Translate because there are often words and phrases that I have to puzzle over, thinking, "Now how do you say that in English?" Google Translate is useful as a first cut at the work. There are a number of things where its renderings are problematic. It often gets confused about the various forms of "sie," which in English can be "she," "they," or "you," which it often renders incorrectly. It doesn't seem to take into account capitalization or not of those forms. It also gets confused about various impersonal forms, like "es gibt," "man sollte," and the like. Word order also gives it trouble; German, because it still has a strong case system, has a freer order than English. Reported or indirect speech, which in German is commonly expressed in the subjunctive, also is troublesome, often because an introductory phrase, such as "er sagte," is implied rather than explicit in the preceding context. Those are just some examples that I remember off the top of my head from my exercises. That is all apart from rendering colloquialisms, dialect and idioms usefully. I can't imagine anyone without some familiarity with German making sense out of the results from Google Translate's renderings of German into English.

  14. Lane said,

    May 27, 2015 @ 6:15 am

    Mark, I remember from our meeting that you are one of the most deliberate, slow and clear speakers I've ever talked to, at least among those whose remarks I've later had to transcribe. I usually have to stop and start my recorder as I transcribe, but I could more or less transcribe you word-for-word as I played the tape at natural speed. I even sped the recorder up to transcribe some sections more quickly.

    I by contrast tend to speak quickly and mumble, probably one reason Google Translate (and Siri) often pretty hilariously mis-transcribe and so misinterpret what I ask them to do.

    I wrote about struggling with the new GT simultaneous translator here:

    http://www.economist.com/blogs/prospero/2015/02/google-translate

    As for getting word-for-word transcription perfect but flailing semantically, I once asked Siri a question I knew might throw her. We were trying to name the actor who plays "Thor" in the Marvel movies. I asked "Siri, who plays Thor in 'Thor'?" Her answer was "Sorry, I can't find any movie theaters in Thor, IA playing 'Thor'." Classic! Her parse was plausible – "Who [among movie theaters] plays [the movie] Thor in Thor [Iowa]?", and yet ridiculous at the same time… There are 186 people in Thor, Iowa, according to Google, and probably not a movie theater at all.

    This is absolutely true, despite having the feel of the urban legend about "The vodka is good but the steak is awful", so I made a screen-grab of it just so people would believe me.
