Language Log

ROT-LLM?

July 28, 2023 @ 5:37 am · Filed by Mark Liberman under Computational linguistics, Language and culture

There's a puzzling new proposal for watermarking AI-generated text — Alistair Croll, "To Watermark AI, It Needs Its Own Alphabet", Wired 7/27/2023:

We need a way to distinguish things made by humans from things made by algorithms, and we need it very soon. […]

Fortunately, we have a solution waiting in plain sight. […]

If the companies who pledged to watermark AI content at the point of origin do so using Unicode—essentially giving AI its own character set—we’ll have a ready-made, fine-grained AI watermark that works across all devices, platforms, operating systems, and websites.

What's proposed here is a character-for-character substitution — like ROT13 encryption but using character codes that are digitally different while being visually the same. As Croll explains:

In Unicode, every character has a number. The Latin Capital Letter A, for example, is hexadecimal number 41. But there are plenty of other A’s in Unicode: There’s Fullwidth Latin Capital Letter A (Ａ, number EF BC A1), Mathematical Bold Capital A (𝐀, number F0 9D 90 80), Mathematical Sans-Serif Capital A (𝖠, F0 9D 96 A0), and plenty of others. Each A has its own name, its own Unicode value, and in some cases, its own font shape. Why not create a letter A just for AI?

If the AI-specific character sets were created — and we'd need many of them, to support all the world's writing systems — then the watermarking process would be a trivial computer program.

Of course, de-watermarking would be an equally trivial program, so what's the point?
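
To make the "trivial program" point concrete, here is a minimal sketch in Python. No AI-specific alphabet exists in Unicode, so the existing Mathematical Bold letters stand in for one here; that choice, and the function names watermark and strip_watermark, are assumptions made purely for illustration.

```python
import string

# Assumption: Mathematical Bold letters stand in for the hypothetical
# "AI alphabet"; Unicode has no block reserved for AI-generated text.
BOLD_UPPER_START = 0x1D400  # MATHEMATICAL BOLD CAPITAL A
BOLD_LOWER_START = 0x1D41A  # MATHEMATICAL BOLD SMALL A

PLAIN = string.ascii_uppercase + string.ascii_lowercase
MARKED = (
    "".join(chr(BOLD_UPPER_START + i) for i in range(26))
    + "".join(chr(BOLD_LOWER_START + i) for i in range(26))
)

# One lookup table in each direction; everything else passes through.
TO_MARKED = str.maketrans(PLAIN, MARKED)
TO_PLAIN = str.maketrans(MARKED, PLAIN)


def watermark(text: str) -> str:
    """Replace each ASCII letter with its stand-in 'AI' look-alike."""
    return text.translate(TO_MARKED)


def strip_watermark(text: str) -> str:
    """Undo watermark(): map the look-alikes back to ASCII letters."""
    return text.translate(TO_PLAIN)
```

Under these assumptions, the watermark and its removal are the same operation run with the two tables swapped, which is exactly why the scheme offers no resistance to anyone who wants the marks gone.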

Croll's answer:

It’s important to note that this proposed markup is not an enforcement mechanism. Bad actors could easily convert AI text to look like it was written by a human. A recipient still needs to trust a sender in order to believe what is marked up. But that’s one of the strengths of this approach. Once text is marked, a human has to actively remove the AI marker at some stage between the LLM and the consumer. We have legal mechanisms to investigate and deal with negligence or wrongdoing. The proposed protocol simply lets us apply these to AI.

This really puzzles me. It assumes that a student (for example) who's willing to use an LLM as a ghost-writer, despite this being against the rules, will draw the line at running a trivial character-substitution program to disguise their violation.


22 Comments

Michael P said,

July 28, 2023 @ 6:06 am

What puzzles me is why anyone would identify Unicode characters or code points with their UTF-8 byte sequences rather than their direct values, or call these sequences numbers. The UTF-8 sequence is not a number; it interleaves extra bits for the convenience of other software.

For example, Fullwidth Latin Capital Letter A is usually identified as U+FF21 (https://unicodeplus.com/U+FF21), and most ways to represent it look more like FF21 than EF BC A1.

Writing "In Unicode, every character has a number" suggests unfamiliarity with Unicode because the Unicode standard is very clear that there is not an N-to-M correspondence between code points and characters for any fixed N or M: is there really a difference between A, Ａ and as characters? Is Å the same character with a diacritic or a different character? Does adding a combining diacritical mark change the character or its "number"?

Similarly with writing "Fullwidth Latin Capital Letter A (Ａ, number EF BC A1)", both because "number" is not the correct term and because UTF-8 encodings are very unusual to write out in such a context.
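
The distinction between a code point and its UTF-8 byte sequence is easy to see side by side. The following is a small illustrative sketch in Python; the helper name describe is hypothetical, and the three sample characters are the ones quoted from the Wired article.

```python
import unicodedata


def describe(ch: str) -> tuple[str, str, str]:
    """Return (Unicode name, code point, UTF-8 bytes) for one character."""
    code_point = f"U+{ord(ch):04X}"
    # bytes.hex(sep) needs Python 3.8 or later.
    utf8_bytes = ch.encode("utf-8").hex(" ").upper()
    return unicodedata.name(ch), code_point, utf8_bytes


# Plain A, Fullwidth A, Mathematical Bold A
for ch in ["A", "\uFF21", "\U0001D400"]:
    print(*describe(ch), sep=" | ")
```

For the fullwidth letter this reports U+FF21 as the code point and EF BC A1 as the UTF-8 bytes, so the "numbers" in the article are encodings of the code points rather than the code points themselves.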
Mark Liberman said,

July 28, 2023 @ 6:12 am

@Michael P: The UTF-8 sequence is not a number; it interleaves extra bits for the convenience of other software.

Indeed, and similarly for other encodings. But this doesn't really change the nature of the idea, or the apparent problem with it.
Carl said,

July 28, 2023 @ 6:59 am

Forget about kids; spammers aren’t going to leave the text encoded if it means Google won’t send them ad money for scam pages.
mkvf said,

July 28, 2023 @ 7:15 am

I think that's the point. Watermarking won't solve the problem of AI use in individual documents that need to be checked by a human examiner or moderator. But it might introduce enough extra steps to make it harder to generate millions of apparently unique pieces of text that can pass machine moderation, on social media, search rankings, and spam blockers.
Mark Liberman said,

July 28, 2023 @ 7:27 am

@mkvf: But it might introduce enough extra steps to make it harder to generate millions of apparently unique pieces of text that can pass machine moderation, on social media, search rankings, and spam blockers

The "extra steps" would just be one command line, or something equally simple inside a program:

fix LLMoutput fixedLLMoutput

…where "fix" would just be the substitution of the elements of one set of character codes for the corresponding elements of another set. Which is a problem that's almost too easy to use as an assignment in a beginning course on programming.
Gregory Kusnick said,

July 28, 2023 @ 9:50 am

This is a dumb idea for all sorts of reasons, even leaving aside the fact that it's trivial to circumvent.

If I use an AI assistant to generate executable code or scripts, do I need new versions of all my compilers and command-line tools to convert the AI-generated text to traditional machine-readable formats?

If I use AI to produce a rough draft of an article that I then revise, are readers really going to care which specific words came through unedited? The important thing is that it's my byline on the article and I therefore implicitly vouch for the entirety of its content.
mkvf said,

July 28, 2023 @ 9:52 am

Hmmm. Just a line like that to switch between the specific version? Or any watermark? Could the text use random visually equivalent but distinct characters, rather than the same one each time? Or would that be equally easy to reverse? I'm guessing it would; you'd just say, "For any of these eight versions of 'A', replace it with 'A'."

Which does then, I guess, make it useless for mass checks. I suppose anything that a human can easily see to show a text is AI generated, a computer can 'see' and replace.
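
The many-to-one case is arguably even simpler than a hand-built table, because Unicode already ships a normalization form that folds many styled variants onto the plain letter. A hedged sketch in Python, where the helper name fold_lookalikes is hypothetical: NFKC handles variants that carry a compatibility decomposition (fullwidth, mathematical bold, and so on), but it would not catch a newly invented alphabet or cross-script look-alikes such as Cyrillic А, which would still need an explicit table.

```python
import unicodedata


def fold_lookalikes(text: str) -> str:
    """Fold compatibility variants (fullwidth, math styles) to base letters.

    Note: NFKC also rewrites other compatibility characters, such as
    ligatures, so it is a blunt instrument rather than a targeted filter.
    """
    return unicodedata.normalize("NFKC", text)


# Four different "A" code points: plain, fullwidth, math bold, math sans-serif
variants = ["A", "\uFF21", "\U0001D400", "\U0001D5A0"]
folded = [fold_lookalikes(v) for v in variants]
```

Randomly mixing variants per character therefore buys nothing here: every one of them collapses to the same plain letter in a single library call.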

—

But then I read the article properly, and they do say this is really aimed at good faith actors. So students could use generated text in essays, clearly, alongside original material, if their school allowed it.

Which seems to add little to just asking people to quote it as you would any source.
Mark Liberman said,

July 28, 2023 @ 12:28 pm

@mkvf: Just a line like that to switch between the specific version? Or any watermark?

Encoding or decoding a "substitution cypher" (like this proposal) is uniquely trivial. It's just a character-by-character lookup table. One easy one-liner could use the Unix program tr.

Other proposed text watermarking methods are much more complicated to implement, and the efficacy of attacks is unclear — see my earlier post.
MattF said,

July 28, 2023 @ 12:31 pm

It’s like the proposal to have a ‘spam’ bit, or maybe a ‘sarcasm’ bit in data headers.
Jonathan Smith said,

July 28, 2023 @ 4:02 pm

Surely any solution to this problem should be unified across "AI" generated text, images, sound…? Why not allow no live feedback at all via these tools? Requests would be delivered to, say, email addresses, so that there is a data trail and auditing is (at least in theory) possible.
Thomas Hutcheson said,

July 28, 2023 @ 5:37 pm

Presumably it creates a brighter line for unauthorized use of LLM output.
James Wimberley said,

July 29, 2023 @ 6:37 am

Gregory Kusnick: "The important thing is that it's my byline on the article and I therefore implicitly vouch for the entirety of its content."
Including the headline? Maybe it's different for scholarly articles, but newspapers stick with the outmoded convention, a hangover from the days of manual typesetting, that the editor, not the reporter, is responsible for the headline.
Linda Seebach said,

July 29, 2023 @ 3:23 pm

@wimberley – Newspapers delegate the writing of headlines to the people who lay out the pages because not until then does anyone know how long the headline can be. It is not because they are 'sticking to an outmoded convention.'

Moreover, at newspapers owned by a chain, the layout people often work at regional centers serving many local papers, and have only the text to go on when writing a headline to fit. They have no contact with anyone who wrote or edited the story.
Alistair Croll said,

July 31, 2023 @ 3:20 pm

Hi, folks! Found this thread through a Google Alert, so I figured I'd chime in here.

A few points:
– I'm not a Unicode expert, though I have worked in character sets across a couple of product management jobs.
– This article is an op/ed that has been edited for understandable analogy rather than precision. I have submitted a more detailed piece to arXiv, which will hopefully get published soon, and was intended to accompany this piece. So analogies like "track changes for AI" or "a highlighter" are just that—imperfect analogies to get the idea across (love the ROT-LLM, which gives me flashbacks to my BBS days.)
– I tend to have strong convictions loosely held. In this case, my conviction is that "humans should have a way to distinguish content generated by nondeterministic software—that is, generative AI—from that written by humans, either directly or through deterministic software following explicit human instruction, such as code."
– Here, I'm proposing that we make the character itself inseparable from the marker, and the only way to do this (without bracketed metadata, etc.) is a value that's intrinsic to the character, i.e. a new character.
– We might not need new character sets. Font systems by the major OS vendors, and libraries, could simply use the glyph for human-A when displaying AI-A. This would be an OS update. The idea is that a toggle ("reveal AI," similar to "reveal codes") could show the difference.
– A few folks, including some of you, seem to think this is a form of enforcement. It's not. It's an identifier. If you want enforcement, digitally hash/sign the marked text to reveal tampering. If someone subsequently edits the text, the signature is invalid but the markings persist. They can re-sign if they are testifying that the markings are true, creating a chain-of-custody model.
– There are plenty of "good actor" use cases here: We allow students to use calculators, but they must show their work. Similarly, they can use LLMs, but must reveal which parts they wrote. This is similar to footnoting citations, but far simpler.
– It's effectively an "opt-out" voluntary marker that LLMs would insert when generating text.
– There's plenty of work to do (making search work; assistive readers reading the new characters; putting LLM output into interpreters, etc.) Many of these were mentioned in the article.
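
The hash/sign suggestion in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a real chain-of-custody scheme would presumably use public-key signatures, whereas this sketch substitutes an HMAC with a shared secret from the standard library, and the names sign and verify and the demo key are hypothetical.

```python
import hashlib
import hmac


def sign(text: str, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over the exact code points of text.

    Assumption: a shared-secret HMAC stands in for a real digital
    signature, which would use a private/public key pair instead.
    """
    return hmac.new(key, text.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(text: str, key: bytes, tag: str) -> bool:
    """Check that text still matches the tag produced when it was signed."""
    return hmac.compare_digest(sign(text, key), tag)


demo_key = b"demo-key-for-illustration-only"
marked = "\U0001D400 draft"  # begins with a look-alike letter
tag = sign(marked, demo_key)
```

Because the tag covers the encoded characters, swapping a marked letter for its plain twin invalidates it. That detects tampering only for a recipient who receives the tag alongside the text; a stripped copy passed along with no tag at all looks like any other unsigned text.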

To address this specifically:
"a student (for example) who's willing to use an LLM as a ghost-writer, despite this being against the rules, will draw the line at running a trivial character-substitution program to disguise their violation"

Many schools are already allowing students to use LLMs for schoolwork, just as they do search engines. They will have to follow guidelines for that. Character substitution is trivial, and bad actors will undoubtedly use it to strip AI markings (they could also just retype it, or screenshot the text and use a character reader to make a human-coded version.) The point is that if the professor has reason to suspect LLMs, as they would when suspecting plagiarism, they can investigate. There is now a "point of wrongdoing" that our justice systems can follow up on.

A second example is a brand or government pledging to be "AI transparent." Personally, I'd like to know what messages that I receive were from a human and what were from an LLM. I believe this should be a human right (see convictions, above.) I would give my business to brands that are transparent about their use of generative AI—but that's my position, and this proposal will let the free market sort out how it should be handled.

As for the Unicode numbering, I just pasted the character information from my Mac's Emoji & Symbols app, because that's a widely available tool. In the original article I'd included an image that contained all the information about a character, but that died on the editing room floor; it's explained in much more detail in the arXiv paper.

Having been in the middle of a number of conversations about the idea over the last week, I have one other thing to share:

I’m a bit frustrated with the current human propensity to say “if this isn’t exactly what we need we shouldn’t do it.” Most progress throughout history has been combinatorial or incremental, not punctuated evolution. (See: TRIZ; James Burke’s Connections series.) But since the advent of social media where we get points for critiquing an imperfect- or partial-but-useful solution, it feels like humanity has abdicated its agency to big tech.

Anyway, thanks for debating the idea a bit.
Taylor, Philip said,

July 31, 2023 @ 5:31 pm

« I’m a bit frustrated with the current human propensity to say “if this isn’t exactly what we need we shouldn’t do it.” » — It is not a new phenomenon. When I was at the second of my three London colleges many many years ago, we would routinely make suggestions to our computer supplier (who was also the computer manufacturer/developer/etc) as to how their software could be improved. The response was always the same — "What you suggest is an excellent idea, but what is really needed is […]. Unfortunately […] is far too complex/difficult/expensive/w-h-y, so we won't do anything at all".
Chas Belov said,

August 1, 2023 @ 11:35 pm

The problem with substituting look-alike characters in place of the actual characters is that it will make the content sound like gibberish to screen readers, rendering the content inaccessible to blind people and others who depend on screen readers.

For instance, if I replace the word "cat" with "" then the listener will hear "mathematical bold italic small c mathematical monospace small a mathematical sans serif small t."
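
A rough way to see what such a listener might hear is to print the Unicode name of each character. This is a sketch only: actual screen readers differ in how they announce these characters, the helper name spoken_names is hypothetical, and the three letters are built by name lookup rather than typed directly.

```python
import unicodedata


def spoken_names(text: str) -> list[str]:
    """Approximate a character-by-character reading via Unicode names."""
    return [unicodedata.name(ch).lower() for ch in text]


# "cat" spelled with three differently styled mathematical letters
styled_cat = "".join(
    unicodedata.lookup(name)
    for name in (
        "MATHEMATICAL BOLD ITALIC SMALL C",
        "MATHEMATICAL MONOSPACE SMALL A",
        "MATHEMATICAL SANS-SERIF SMALL T",
    )
)

for name in spoken_names(styled_cat):
    print(name)

# Compatibility normalization recovers the speakable word.
speakable = unicodedata.normalize("NFKC", styled_cat)
```

The last line hints at one possible mitigation: assistive software could normalize such text before speaking it, at the cost of discarding the marking.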
Chas Belov said,

August 1, 2023 @ 11:36 pm

Hmmm, and apparently such characters will also be stripped from Language Log comments and who knows where else.
Chas Belov said,

August 1, 2023 @ 11:42 pm

That said, I suppose one could do the substitution and provide a button preceding the text to translate it into standard, speakable text, which would also call attention to the fact that it was watermarked. I would want it to precede and not follow the text because we would (hopefully) never want to subject listeners to gibberish.
Taylor, Philip said,

August 2, 2023 @ 4:57 am

[Just testing] $$. If nothing (apart from a space, a period and this sentence) appears after the closing square bracket and space, the test failed — I was trying to embed Chas' mathematical characters within maths delimiters.
Chas Belov said,

August 3, 2023 @ 2:38 am

Also testing:
العربية 繁體中文 Filipino Français 日本語 한국어 Русский Español ภาษาไทย Tiếng Việt
Chas Belov said,

August 3, 2023 @ 2:39 am

Okay, so the problem appears not to be that it's UTF-8. It appears to be that it's an excluded or un-permitted character range.
Tony DeSimone said,

August 12, 2023 @ 7:18 am

Everything old is new again. https://www.ietf.org/rfc/rfc3514.txt

Network Working Group S. Bellovin
Request for Comments: 3514 AT&T Labs Research
Category: Informational 1 April 2003

The Security Flag in the IPv4 Header

Currently-assigned values are defined as follows:

0x0 If the bit is set to 0, the packet has no evil intent. Hosts,
network elements, etc., SHOULD assume that the packet is
harmless, and SHOULD NOT take any defensive measures. (We note
that this part of the spec is already implemented by many common
desktop operating systems.)

Except Croll seems to be serious.
