I'm in Brighton for InterSpeech 2009, but unfortunately duties in Philadelphia made it impossible for me to make it here in time to act as a human control in the 2009 Loebner Prize competition, the annual administration of the "Turing Test". As the ISCA Secretariat put it,
We are seeking volunteers to pit themselves against the entries — and prove to the judges just how human they are!
The test involves using a computer interface to chat (type messages) for 5 minutes with a judge, who does the same with the program, not knowing which is which. The judge has to determine which is the true human.
It's no accident that the next-to-last xkcd strip dealt with a version of this problem:
I've looked around, and asked around, but if the chat logs for this year's competition have been posted, I can't find them.
I hope no hearts out there are broken, but it’s important to know these things. Bots can handle thousands of connections at once, so you don’t know who else your internet partner is chatting with. There’s nothing worse than a Turing Test coming back positive for chlamydia.
[Update -- Shalom Lappin responded by email:
Sorry I missed you at Interspeech. I was one of the judges for the Loebner prize. The contest was organized locally by Philip Jackson of the University of Sussex, and he might be able to provide you with the transcripts of the interactions.
None of the judges had any difficulty in distinguishing human from non-human interlocutors after the first or second turn in the conversation. The four main features which allowed me to identify a human vs. a non-human agent are (i) capacity for fluent domain-general discourse marked by frequent and unpredictable changes in topic, (ii) willingness to allow the judge to take over the conversation, (iii) capacity to handle ellipsis, pronouns, and non-sentential fragments, and (iv) typing errors and corrections in human but not program contributions. The relative absence of progress in developing general-purpose conversational agents contrasts sharply with the substantial progress of the past 10-15 years in task-driven, domain-specific dialogue management systems and other types of NLP.