Invisible text via Unicode tag characters

If you open this file in your browser, you'll see only a left square bracket followed by a right square bracket, with nothing in between.

But if I run the file through a Perl script that I wrote long ago to print out character codes and their names, I get:

|[| 0x005B "LEFT SQUARE BRACKET"
|| 0xE004C "TAG LATIN CAPITAL LETTER L"
|| 0xE0061 "TAG LATIN SMALL LETTER A"
|| 0xE006E "TAG LATIN SMALL LETTER N"
|| 0xE0067 "TAG LATIN SMALL LETTER G"
|| 0xE0075 "TAG LATIN SMALL LETTER U"
|| 0xE0061 "TAG LATIN SMALL LETTER A"
|| 0xE0067 "TAG LATIN SMALL LETTER G"
|| 0xE0065 "TAG LATIN SMALL LETTER E"
|| 0xE0020 "TAG SPACE"
|| 0xE004C "TAG LATIN CAPITAL LETTER L"
|| 0xE006F "TAG LATIN SMALL LETTER O"
|| 0xE0067 "TAG LATIN SMALL LETTER G"
|]| 0x005D "RIGHT SQUARE BRACKET"
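
The Perl script itself isn't reproduced here, but a minimal Python sketch of the same idea might look like the following. The function name `describe` and the short sample string are illustrative assumptions, not part of the original script; the character names come from Python's standard `unicodedata` module.

```python
import unicodedata

def describe(text):
    """Return one line per character: the character, its code point, and its Unicode name."""
    lines = []
    for ch in text:
        # Fall back to a placeholder for code points that have no assigned name.
        name = unicodedata.name(ch, "<unnamed>")
        lines.append(f'|{ch}| 0x{ord(ch):04X} "{name}"')
    return lines

# Hypothetical sample: brackets around the invisible tag characters for "La".
sample = "[" + chr(0xE004C) + chr(0xE0061) + "]"
for line in describe(sample):
    print(line)
```

To inspect a real file, read it with an explicit encoding (for example `open(path, encoding="utf-8").read()`) and pass the result to `describe`.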

Or you can cut and paste the bracketed sequence into the ASCII Smuggler page, and you'll see this:

For an explanation, see Johann Rehberger's blog post "ASCII Smuggler Tool: Crafting Invisible Text and Decoding Hidden Codes", Embrace The Red 1/14/2024, which starts:

A few days ago Riley Goodside posted about an interesting discovery on how an LLM prompt injection can happen via invisible instructions in pasted text. This works by using a special set of Unicode code points from the Tags Unicode Block.

The proof-of-concept showed how a simple text contained invisible instructions that caused ChatGPT to invoke DALL-E to create an image.

The meaning of these “Tags” seems to have gone through quite some churn, from language tags to eventually being repurposed for some emojis. […]

The Tags Unicode Block mirrors ASCII and because it is often not rendered in the UI, the special text remains unnoticable to users… but LLMs interpret such text.

It appears that training data contained such characters and now tokenizers can deal with them!
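
The mirroring that Rehberger describes is a fixed offset: each printable ASCII character at code point N has a tag counterpart at N + 0xE0000, which is why "L" (0x4C) shows up above as 0xE004C. A minimal Python sketch of encoding and decoding under that assumption follows. The names `to_tags` and `from_tags` are made up for illustration, and this is not the ASCII Smuggler's actual code.

```python
TAG_OFFSET = 0xE0000  # tag code point = ASCII code point + 0xE0000

def to_tags(ascii_text):
    """Map printable ASCII (0x20-0x7E) to the corresponding invisible tag characters."""
    out = []
    for ch in ascii_text:
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError(f"not printable ASCII: {ch!r}")
        out.append(chr(ord(ch) + TAG_OFFSET))
    return "".join(out)

def from_tags(text):
    """Recover hidden ASCII from any tag characters in text, ignoring everything else."""
    return "".join(
        chr(ord(ch) - TAG_OFFSET)
        for ch in text
        if 0xE0020 <= ord(ch) <= 0xE007E
    )

hidden = "[" + to_tags("Language Log") + "]"
print(len(hidden))        # 14: two brackets plus twelve invisible characters
print(from_tags(hidden))  # Language Log
```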

For more recent coverage, see Dan Goodin, "Invisible text that AI chatbots understand and humans can’t? Yep, it’s a thing.", Ars Technica 10/14/2024, which explains Rehberger's "Copirate" hack:

What if there was a way to sneak malicious instructions into Claude, Copilot, or other top-name AI chatbots and get confidential data out of them by using characters large language models can recognize and their human users can’t? As it turns out, there was—and in some cases still is.

The invisible characters, the result of a quirk in the Unicode text encoding standard, create an ideal covert channel that can make it easier for attackers to conceal malicious payloads fed into an LLM. The hidden text can similarly obfuscate the exfiltration of passwords, financial information, or other secrets out of the same AI-powered bots. Because the hidden text can be combined with normal text, users can unwittingly paste it into prompts. The secret content can also be appended to visible text in chatbot output. […]

“The fact that GPT 4.0 and Claude Opus were able to really understand those invisible tags was really mind-blowing to me and made the whole AI security space much more interesting,” Joseph Thacker, an independent researcher and AI engineer at Appomni, said in an interview. “The idea that they can be completely invisible in all browsers but still readable by large language models makes [attacks] much more feasible in just about every area.”

Microsoft has mitigated the threat to Copilot, according to Ravie Lakshmanan, "Microsoft Fixes ASCII Smuggling Flaw That Enabled Data Theft from Microsoft 365 Copilot", The Hacker News 8/27/2024. I'm not sure where things stand with ChatGPT, Claude, Gemini, and so on. As far as I can tell, WordPress removes tag characters from the text that it stores and presents.
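
I don't know how WordPress or Microsoft actually implement their filtering, but the general idea of removing or flagging tag characters before text is stored or passed to a model can be sketched in a few lines of Python. The function names here are hypothetical, and the range U+E0000 through U+E007F is the Tags block.

```python
import re

# Every code point in the Tags block, U+E0000 through U+E007F.
TAG_CHARS = re.compile("[\U000E0000-\U000E007F]")

def has_tag_chars(text):
    """Return True if text contains at least one Tags-block code point."""
    return TAG_CHARS.search(text) is not None

def strip_tag_chars(text):
    """Return text with every Tags-block code point removed."""
    return TAG_CHARS.sub("", text)

# Hypothetical input: visible brackets with the hidden word "Log" between them.
suspect = "[" + "".join(chr(0xE0000 + ord(c)) for c in "Log") + "]"
print(has_tag_chars(suspect))    # True
print(strip_tag_chars(suspect))  # []
```

One caveat with blanket stripping: the emoji flags for England, Scotland, and Wales are built from tag characters, so a filter like this would reduce them to a plain black flag. Flagging suspicious input for review may be preferable to silently deleting it.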

For more details about the Copilot exploit, see this video:

Update — FWIW, here's the contents of the Unicode Tag block, which as you can see is basically a ghost echo of the old ASCII character set:



6 Comments

  1. John from Cincinnati said,

    October 15, 2024 @ 6:38 am

    Indeed! I opened your text file in the Chrome browser and selected Inspect. Under Rendered Fonts it reports Font Origin: Local file (14 glyphs), which is what your perl script disclosed. But Inspect displays only three characters: the brackets and a space. I think I'm concerned.

  2. Charles Hallinan said,

    October 15, 2024 @ 11:20 am

    When I open the linked file in a new tab in Chrome, I get the brackets on each end of a string of characters divided, for the most part, into groups of three with each group appearing to correspond to one of the letters in "LANGUAGE LOG":

    [ó Œó ¡ó ®ó §ó µó ¡ó §ó ¥ó € ó Œó ¯ó §]

    (The character that appears as a box above displays as the boxed-question-mark/unrecognized character in Chrome & as a blank space in Notepad.)

  3. Yuval said,

    October 15, 2024 @ 11:58 am

    There's somewhat of a full circle completed here, in that now we're surprised when our machines behave like machines rather than as humans…

  4. Mark Liberman said,

    October 15, 2024 @ 11:59 am

    @Charles Hallinan: "When I open the linked file in a new tab in Chrome, I get the brackets on each end of a string of characters divided, for the most part, into groups of three with each group appearing to correspond to one of the letters in "LANGUAGE LOG"":

    Interesting. What machine, OS, Chrome version, etc.?

    Or maybe there's a difference in settings?

    The official treatment of the Unicode Tag characters is to display nothing — for which there have been various motivations proposed and withdrawn over the years. But it would be safer (and probably better all around) for browsers and other apps to display the tag characters as gibberish…

  5. David L said,

    October 15, 2024 @ 12:34 pm

    I get the same thing Charles Hallinan sees. Very old Dell PC running Chrome Version 129.0.6668.100 (Official Build) (64-bit)

  6. Y said,

    October 15, 2024 @ 12:51 pm

    Charles Hallinan, David L.: When Windows (at least the versions you have) sees a plain text file, it has to guess the encoding, i.e. the mapping from numerical bytes to characters. It guesses the default Windows Latin 1 encoding instead of the correct UTF-8 encoding, and out comes this garbled result.
