Legally binding hallucinations


I missed this story when it happened 10 days ago, and caught up with it yesterday because the BBC also got the word — Maria Yagoda, "Airline held liable for its chatbot giving passenger bad advice – what this means for travellers", BBC 2/23/2024:

In 2022, Air Canada's chatbot promised a discount that wasn't available to passenger Jake Moffatt, who was assured that he could book a full-fare flight for his grandmother's funeral and then apply for a bereavement fare after the fact.

According to a civil-resolutions tribunal decision last Wednesday, when Moffatt applied for the discount, the airline said the chatbot had been wrong – the request needed to be submitted before the flight – and it wouldn't offer the discount. Instead, the airline said the chatbot was a "separate legal entity that is responsible for its own actions". […]

The British Columbia Civil Resolution Tribunal rejected that argument, ruling that Air Canada had to pay Moffatt $812.02 (£642.64) in damages and tribunal fees. "It should be obvious to Air Canada that it is responsible for all the information on its website," read tribunal member Christopher Rivers' written response. "It makes no difference whether the information comes from a static page or a chatbot."

Some other links (among many):
"Moffatt v. Air Canada", Civil Resolution Tribunal, Small Claims Decisions, 2/14/2024.

Katyanna Quach, "Air Canada must pay damages after chatbot lies to grieving passenger about discount", The Register 2/15/2024.
Leyland Cecco, "Air Canada ordered to pay customer who was misled by airline’s chatbot", The Guardian 2/16/2024.
Kyle Melnick, "Air Canada chatbot promised a discount. Now the airline has to pay it.", WaPo 2/18/2024.

This case strikes me as representative of a serious potential difficulty for future monetization of chatbots.

It's one thing to use them to improve (or at least change) web search. No one has ever sued Google for pointing them to a website that gave false and damaging information — though surely that has happened many billions of times, without thereby disrupting Google's ability to sell targeted advertising. But the most obvious way to monetize chatbots is to use them to replace employees dealing with (real or potential) customers, or to improve crappy menu-driven interaction apps. And if the bots provide hallucinated information, even occasionally, that could be a serious disincentive to use (and pay for) them.

A law professor suggests that

Mr. Moffat was able to show by a preponderance of the evidence that all elements of a claim for negligent misrepresentation were met. The CRT rejected Air Canada's affirmative defense based on the terms and conditions of the applicable tariff because Air Canada described those terms and conditions but did not provide evidence of them. Seems odd that Air Canada would bother to fight this claim but then not bother to provide evidence necessary to its defense. As a result of Air Canada's half-hearted litigation strategy, we can't know whether other plaintiffs could follow in Mr. Moffat's path. It may be that Air Canada has a powerful defense. However, when a big corporation goes up against a pro se litigant, the CRT is not inclined to cut it any slack.
[Jeremy Telman, "Air Canada Bound by Its Chatbot", ContractsProf Blog, 2/20/2024]

So maybe bot owners could bring a more effective argument if the stakes were high enough?

The same author asked DALL-E to create an image of "the Air Canada chatbot pictured the day that it started work", and another "three weeks into its new career":


  1. Charlotte Stinson said,

    February 24, 2024 @ 8:06 am

    In the US, IIRC, this isn’t complicated. The airline would be liable for the difference between the full fare and the bereavement fare if it pointed the customer to a particular chariot in response, because by doing so the airline implicitly represented to the customer that the chatbot would give it accurate information, and the customer’s reliance on that representation was reasonable. This is different from a situation where a customer simply googles the question (unless googling takes them to the airline’s website), because in that case the airline did not implicitly represent that the results of googling would be reliable.

  2. J.W. Brewer said,

    February 24, 2024 @ 10:01 am

    I assume part of what's going on here is that the "hallucination" was not obviously "hallucinatory," in the sense that it was not on its face so implausible or bizarre that the customer should not reasonably have believed it accurate without further inquiry. It's one thing if the chatbot quotes a fare of $250 when the airline intended a fare of $580, but would likely be another thing if the chatbot had said "if you book before midnight, the fare will be thirteen live chickens, payable to the gate agent before boarding."

  3. KevinM said,

    February 24, 2024 @ 10:25 am

    Inevitably, in comment 1 chatbot autocorrected to “chariot.” “Where’s my chariot” lawsuits may be in our future.

  4. Haamu said,

    February 24, 2024 @ 10:35 am

    Apparently, DALL-E thinks the chatbot needs liquid assistance to do its job.

    Perhaps this is projection. If I had more time this morning, I would ask DALL-E to depict its own level of job satisfaction.

    (More linguistically relevant, perhaps — and maybe this is obvious to everyone else — but since we don't have the actual prompts used we can only guess: might the bottles be appearing because of "bot"?)

  5. Haamu said,

    February 24, 2024 @ 10:43 am

    Reading the linked blog post a little more closely, I see a clue to where the bottles come from: "I asked a chatbot its opinion about what could have caused the negligent misrepresentation in question."

    So it probably isn't any "bot"/"bottle" confusion, or maybe that added a fractional probability that tipped things in this direction, but DALL-E was already primed to depict some form of causation.

    This suggests a different legal theory for Air Canada's liability: negligent hiring practices. You really need to screen your potential chatbots for alcoholism.

  6. Seth said,

    February 24, 2024 @ 11:24 am

    I don't think there's any major difficulty for chatbots here. Customer service humans get information wrong all the time. It never makes the news. And airlines fares themselves are notoriously indeterminate – I'm not sure if hallucinations would end up being a net gain or loss overall!
    I suppose it would be nice for the airline if they could argue they had no liability from chatbots as opposed to humans, but apparently that trick didn't work.

    Regarding "No one has ever sued Google for pointing them to a website that gave false and damaging information" – this is because (in the US) Google has 100% immediate legal immunity from such lawsuits, no matter how false or damaging. The general legal topic is called "Section 230", and it's a major tech policy debate.

  7. SusanC said,

    February 24, 2024 @ 3:18 pm

    As an employee of a university, I am under strict instructions not to agree contracts on behalf of the university, because the other party to the contract potentially doesn't know I'm not allowed to do that, and courts might hold such a contract valid. (But, it is clear, I would be in a ton of trouble if I did that.)

    Replace me with a hallucinating LLM, and you might be in for a lot of legal trouble.

  8. Mark Liberman said,

    February 24, 2024 @ 3:53 pm

    @J.W. Brewer: "I assume part of what's going on here is that the "hallucination" was not obviously "hallucinatory," in the sense that it was not on its face so implausible or bizarre that the customer should not reasonably have believed it accurate without further inquiry. "

    Hallucination has become the standard term of art for plausible inventions on the part of LLMs, like non-existent but valid-sounding source references.

  9. Richard Hershberger said,

    February 24, 2024 @ 8:18 pm

    This outcome was entirely predictable from the start. I know this because my response to the corporate headlong rush to eliminate humans from the loop was that one of these things would produce bullshit (my preferred term over hallucination) that would prove legally binding. How could this not happen, given the technology? My response to this specific story is that Air Canada got off cheap. The only surprising part is that they fought it at all.

    As for chatbot search, I am not so sure that it is as legally innocuous as Mark suggests. It is one thing to respond to search terms with a list of web pages that arise from the search. It is quite another for the search engine to produce an answer in its own voice. If Bing AI gives a defamatory answer to a question, why exactly is Microsoft not liable?

  10. Seth said,

    February 25, 2024 @ 2:55 am

    @ Richard Hershberger – I'm not a lawyer, but I've studied these issues. My layman but educated answer to your question is that the legal theory would be that the defamatory answer came from training data which was not produced by Microsoft, and hence under "Section 230" it has no liability for what Bing does with that data. This is different from the airline case, since presumably the airline did produce the fare pricing data used by the chatbot.

  11. Philip Taylor said,

    February 25, 2024 @ 5:38 am

    Seth — I know nothing about "Section 230" whatsoever, but if the defence were based on the premise that "the defamatory answer came from training data which was not produced by Microsoft", why could the plaintiff not then argue that Microsoft failed to exercise due diligence when selecting their training data?

  12. Seth said,

    February 25, 2024 @ 10:28 am

    @ Philip Taylor – Because the defense's rebuttal would be that because of "Section 230", Microsoft has no obligation to apply any diligence whatsoever to the training data, much less anything "due". You're completely at liberty to think that idea is not the way things should be. You wouldn't be alone in that opinion. But the weight of judicial opinion would be against you (so far).

  13. Jonathan Smith said,

    February 25, 2024 @ 11:24 am

    Surely there is little/no judicial opinion to date that weighs in on "generative AI" products? — this case gets the ball rolling I guess. Let me express extreme doubt that courts will find that corporate entities have "no obligation to apply any diligence whatsoever to training data" and no liability for libelous and otherwise harm-causing output.

  14. Seth said,

    February 25, 2024 @ 1:00 pm

    @ Jonathan Smith – You're right about the paucity of current rulings on "generative AI". However, there's now quite a large amount on search engines and data-processing. And while of course not completely dispositive, it does provide a guide as to how new AI cases will be argued, and how they are likely to come out. For example, almost all copyright vs AI discussions I've seen appear to be completely ignorant of the "Google Books" case. That was of enormous significance, and the underlying principles seem very analogous to AI. One thing I've found in the whole "Section 230" debate is that it's so far from people's intuitions about the way things "should" work, that they often don't even believe the legal argument exists.

  15. Jason said,

    February 25, 2024 @ 7:30 pm

    @Richard Hershberger

    I would have assumed that the airline calculated that the cost of having to reimburse the odd misled passenger would be greatly outweighed by the savings created by eliminating their customer service personnel. This case suggests they're too cheap even to pay for that.

  16. Nat said,

    February 26, 2024 @ 3:09 am

    @ Seth
    I trust your judgement about the legal defense in the case of harmful search results. But it does seem to be in tension with the defense against copyright infringement by AI. In the matter of copyrights, the AI is supposed to have so transformed the training data that it produces something genuinely new, analogously to the training of a human artist. The training data loses its identity, in a sense. But in the case of a search result, the AI merely passively transmits the data given to it. One might say that for search results, the AI is transparent to its training data, while in the matter of copyright, it is opaque.
    Now, for all I know, this might be a perfectly legitimate stance to take. Perhaps the relevant processing during training is genuinely distinct in the two cases. But if it can be demonstrated that the same kind of processing occurs in both cases, it might cause some trouble for the defense.

  17. Philip Taylor said,

    February 26, 2024 @ 4:19 am

    Returning to my earlier suggestion that Microsoft's failure to exercise due diligence in its selection of training data might be significant, I now wonder whether the putative "Section 230" defence might fail because Microsoft selected the training data rather than having it pushed to them. I am not suggesting for one second that Microsoft actively curated the training data rather than merely relying on web searches, but they nonetheless actively acquired it rather than passively accepting and publishing it.

  18. Peter Taylor said,

    February 26, 2024 @ 8:27 am

    For those who are interested in the relevance of Section 230 and the legal theories as to whether it's applicable or not to LLM-generated texts, this Congressional Research Service briefing summarises a few viewpoints.

  19. Seth said,

    February 26, 2024 @ 7:28 pm

    @Nat – As I've disclaimed, I'm not a lawyer. I could be wrong. Congress could change the law. The Supreme Court could overturn the existing view of the law. I just want people to recognize that there's an existing legal landscape, and it's often counter-intuitive to the results of vague reasoning from general moral principles. Regarding the specific argument, I think I see what you're trying to do, but I believe it's got a subtle mistake, here: "… so transformed the training data …". I think you're mixing up the idea of "transformative use" with other meanings of transformation. Roughly, "transformative use" is doing something with the material that's different from the work itself – but it doesn't require changing text. That is, Google Books search is transformative in that a search engine returning snippets of a book is doing something that is different from a book, even if the search engine has the complete contents of all the books inside of it. Thus, analogously, in theory an AI wouldn't have to do any transforming of data at all to be a "transformative use" under copyright law. In a way, it could be claimed to be returning a collage of snippets.
