Our love was real!


I'm in Brighton for InterSpeech 2009, but unfortunately duties in Philadelphia made it impossible for me to arrive in time to act as a human control in the 2009 Loebner Prize competition, the annual administration of the "Turing Test". As the ISCA Secretariat put it,

We are seeking volunteers to pit themselves against the entries — and prove to the judges just how human they are!

The test involves using a computer interface to chat (by typed messages) for 5 minutes with a judge, who does the same with the program, not knowing which is which. The judge has to determine which is the true human.


It's no accident that the next-to-last xkcd strip dealt with a version of this problem:

[Image: the xkcd strip in question.]

I've looked around, and asked around, but if the chat logs for this year's competition have been posted, I can't find them.

But there's actually an xkcd VK testing site (that's a reference to the Voight-Kampff machine from Do Androids Dream of Electric Sheep? and Blade Runner). For more discussion, see the xkcd blag:

I hope no hearts out there are broken, but it’s important to know these things. Bots can handle thousands of connections at once, so you don’t know who else your internet partner is chatting with. There’s nothing worse than a Turing Test coming back positive for chlamydia.

[Update — Shalom Lappin responded by email:

Sorry I missed you at Interspeech. I was one of the judges for the Loebner prize. The contest was organized locally by Philip Jackson of the University of Sussex, and he might be able to provide you with the transcripts of the interactions.

None of the judges had any difficulty in distinguishing human from non-human interlocutors after the first or second turn in the conversation. The main features which allowed me to identify a human vs. a non-human agent are (i) capacity for fluent, domain-general discourse marked by frequent and unpredictable changes in topic, (ii) willingness to allow the judge to take over the conversation, (iii) capacity to handle ellipsis, pronouns, and non-sentential fragments, and (iv) typing errors and corrections in human but not program contributions. The relative absence of progress in developing general-purpose conversational agents contrasts sharply with the substantial progress of the past 10-15 years in task-driven, domain-specific dialogue management systems and other types of NLP.

]



15 Comments

  1. Sili said,

    September 7, 2009 @ 2:31 pm

    They coded that up fast. When I googled it after the strip, all I got was – appropriately? – infested sites.

    How many humans lose fail the Turing test?

    I'm a very sad panda for not being in Blighty these days. Have you met Lynneguist?

  2. Leonardo Boiko said,

    September 7, 2009 @ 3:41 pm

    > How many humans lose fail the Turing test?

    That’s some pretty WTFy phrasing you have there, Mr. “Sili”… would you mind answering this captcha?

  3. Alexandra said,

    September 7, 2009 @ 4:30 pm

    Hmm. In the xkcd strip, Lisa passed the Turing Test, but was tripped up by a Captcha-type test. I don't know if the Captcha method of detecting bots has a name, but it seems fundamentally different from a Turing Test — rather than requiring a human judge to tell whether a bot is human or not, it requires the bot to perform a skill (ostensibly) doable only by humans. This strip implies that it is easier to pass a Turing Test than a Captcha-type test. Does anyone who knows more about this than I do (MYL?) want to weigh in on how likely this is?

  4. Peter Taylor said,

    September 7, 2009 @ 5:45 pm

    Most current CAPTCHAs are easily broken by OCR targeted specifically at them. Those which aren't also trip up some humans.

  5. Leonardo Boiko said,

    September 7, 2009 @ 6:07 pm

    And that unlikely inversion is the source of the strip’s humour.

    Here’s another captcha joke: http://www.kontraband.com/pics/17716/Robot-Tattoo/

  6. Joe Fineman said,

    September 7, 2009 @ 8:53 pm

    From my journal, 8 November 1991:

    Boston…: Computer Museum: contest based on the Turing test. About 20 media people & 50 spectators. 6 computer programs participated, against 2 humans (there were supposed to be 2 more humans, but their connections were faulty). None of the programs fooled me, but one of the humans succeeded in convincing me that she was a computer program — the only halfway good one. The real programs were all pretty clunky, but one of them fooled 5 of the judges (who were chosen to know nothing about computers)….

  7. john riemann soong said,

    September 8, 2009 @ 12:17 am

    The problem with CAPTCHAs is that in some cases humans are worse at them than some computers are.

  8. Spectre-7 said,

    September 8, 2009 @ 12:35 am

    The problem with CAPTCHAs is that in some cases humans are worse at them than some computers are.

    Perhaps the next step will be to implement CAPTCHAs specifically designed to be difficult for humans and easy for computers… without letting the user know, of course. :)

    The whole CAPTCHA situation is pretty interesting when you think about it, though. I suspect it's leading to some impressive advances in visual pattern recognition right now, and I'm curious what other sorts of technologies could be driven in a similar manner. That is, leveraging capitalism to advance research that might otherwise be primarily academic. For instance, if the industry switched to natural language questions for CAPTCHAs, might we see advances in AI natural language processing?

  9. Kenny Easwaran said,

    September 8, 2009 @ 12:40 am

    It's easy to make a CAPTCHA that is difficult for humans and easy for computers – just use a simple font, but make the text only 1 grayscale shade darker than the background. Or make the CAPTCHA be the multiplication of two 10-digit numbers, or something similar.
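
    A minimal sketch of that second idea, in Python (purely illustrative; the function names are invented for this example): the product of two random 10-digit numbers takes a script no time at all, and a human a few unpleasant minutes.

    import random

    def make_inverted_captcha():
        # Two random 10-digit numbers: trivial for a machine to multiply,
        # tedious for a human without a calculator.
        a = random.randint(10**9, 10**10 - 1)
        b = random.randint(10**9, 10**10 - 1)
        return f"What is {a} x {b}?", a * b

    def check(expected, response):
        # A bot answers instantly and correctly; most humans won't bother.
        try:
            return int(response.strip()) == expected
        except ValueError:
            return False

    question, answer = make_inverted_captcha()
    print(question)                     # e.g. "What is 4821937560 x 9273648120?"
    print(check(answer, str(answer)))   # True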

  10. Emily said,

    September 8, 2009 @ 3:13 am

    Saw this: more "No word for X"

    http://cargocollective.com/media/67687/aug21_640.jpg

  11. Achim said,

    September 8, 2009 @ 3:24 am

    @ Spectre-7:

    For instance, if the industry switched to natural language questions for CAPTCHAs, might we see advances in AI natural language processing?

    How about the widespread "What is your mother's maiden name?" type of question? How successful would a script be that turned to sites like geni.com in cases where real names, not just aliases or account handles, are available?

  12. mollymooly said,

    September 8, 2009 @ 5:20 am

    According to the Economist "it will be possible for software to break text CAPTCHAs most of the time within five years. A new way to verify that internet users are indeed human will then be needed. But if CAPTCHAs are broken it might not be a bad thing, because it would signal a breakthrough in machine vision that would, for example, make automated book-scanners far more accurate."

    Which might help Google Books. Maybe people will switch to metadata CAPTCHAs…

  13. Aaron Davies said,

    September 8, 2009 @ 6:58 am

    there's already a captcha project based on transcribing two words of a scanned book: http://recaptcha.net/

  14. Aaron Davies said,

    September 8, 2009 @ 7:01 am

    i wonder how long http://vkcouplestesting.com/ is going to stay up. http://wetriffs.com/ spent quite a while down, iirc.

  15. Dan Lufkin said,

    September 8, 2009 @ 4:46 pm

    When the National Weather Service introduced text-to-speech for telephone forecasts in Baltimore in the early 80s, we surveyed callers for a while. The computer consistently beat human operators for "friendliness."

    Maybe that was just Baltimore, though.
