Among the new phrases…


Today's Tank McNamara:

According to the NFL, a "hip drop tackle" "occurs when a defender wraps up a ball carrier and rotates or swivels his hips, unweighting himself and dropping onto ball carrier’s legs during the tackle". And I would have more or less guessed that meaning, before getting the authoritative definition.

A Sandwich Helix, on the other hand…

…seems to have been invented to represent an uninterpretable two-word sequence (though maybe I'm missing something?). The mouseover title for that xkcd strip is "The number one rule of string manipulation is that youâ€™ve got to specify your encodings", which is plausible but doesn't exactly help.

There are some sandwich-helix-type things in the biochemistry literature, FWIW, where in particular helix-heme-helix sandwiches have been discussed at length; but I doubt Randall Munroe had that in mind.

My failed attempt to decode "sandwich helix" reminds me of a recently-proposed AI evaluation — Nicholas Riccardi et al. "The Two Word Test as a semantic benchmark for large language models", 2024:

Abstract: Large language models (LLMs) have shown remarkable abilities recently, including passing advanced professional exams and demanding benchmark tests. This performance has led many to suggest that they are close to achieving humanlike or “true” understanding of language, and even artificial general intelligence (AGI). Here, we provide a new open-source benchmark, the Two Word Test (TWT), that can assess semantic abilities of LLMs using two-word phrases in a task that can be performed relatively easily by humans without advanced training. Combining multiple words into a single concept is a fundamental linguistic and conceptual operation routinely performed by people. The test requires meaningfulness judgments of 1768 noun-noun combinations that have been rated as meaningful (e.g., baby boy) or as having low meaningfulness (e.g., goat sky) by human raters. This novel test differs from existing benchmarks that rely on logical reasoning, inference, puzzle-solving, or domain expertise. We provide versions of the task that probe meaningfulness ratings on a 0–4 scale as well as binary judgments. With both versions, we conducted a series of experiments using the TWT on GPT-4, GPT-3.5, Claude-3-Opus, and Gemini-1-Pro-001. Results demonstrated that, compared to humans, all models performed relatively poorly at rating meaningfulness of these phrases. GPT-3.5-turbo, Gemini-1.0-Pro-001 and GPT-4-turbo were also unable to make binary discriminations between sensible and nonsense phrases, with these models consistently judging nonsensical phrases as making sense. Claude-3-Opus made a substantial improvement in binary discrimination of combinatorial phrases but was still significantly worse than human performance. The TWT can be used to understand and assess the limitations of current LLMs, and potentially improve them.
The test also reminds us that caution is warranted in attributing “true” or human-level understanding to LLMs based only on tests that are challenging for humans.

Of course, there are lots of word combinations that are opaque unless you're in the know, like "fog computing".



13 Comments »

  1. Philip Taylor said,

    October 25, 2024 @ 3:24 pm

    "The number one rule of string manipulation is that youâ€™ve got to specify your encodings", which is intriguingly self-referential. With View/Text Encoding set to its self-determined default of Unicode, the characters following "is that you" are lower-case-a-circumflex, euro, superscript trademark; if I force View/Text Encoding to Western, it is even worse, and I see "The number one rule of string manipulation is that youÃ¢â‚¬â„¢ve got to specify your encodings", where "is that you" is followed by upper-case-A-tilde, cent, lower-case-a-circumflex, comma, logical not, lower-case-a-circumflex, comma, comma, cent.

  2. Brett said,

    October 25, 2024 @ 3:32 pm

    As usual, Explain xkcd: It's 'cause you're dumb. has a probable explanation of the allusion (the same one I inferred myself) posted very quickly. The current version (which will doubtless change over the next few days) states:

    The fictitious "Sandwich Helix" plays on another concept in communication, the "Compliment Sandwich", wherein a statement of criticism is sandwiched between two complimentary statements in order to make the negative statement easier to accept. The difference is that the Compliment Sandwich is a communication technique which is well known and whose meaning has not been lost. A possible inspiration for the "helix" part is the Helical Model of Communication. The creator of the model, Frank Dance, emphasised the role of communication problems. He shows communication as a dynamic and non-linear process.

  3. Rick Rubenstein said,

    October 25, 2024 @ 4:04 pm

    My initial parsing of "including passing advanced professional exams and demanding benchmark tests" had me wondering why an LLM would demand a benchmark test.

    Also, the TWT is a very clever idea!

  4. Nat said,

    October 25, 2024 @ 4:31 pm

    An allusion to "compliment sandwich" or really anything else seems extremely unlikely to me, since any kind of allusion undermines the point of the comic: there is no meaning absent context, and with no context, a message is uninterpretable. Or something along those lines.
    I don't quite get the mouse-over text, since I'm not sure what kind of symbol manipulation is intended. But I'd guess it's something like trying to ask what a program does independently of any operating environment, a meaningless question.

  5. D.O. said,

    October 25, 2024 @ 11:33 pm

    This talk about "Sandwich Helix" put me in mind of "helix sandwich", which might be my own invention. Two long thin pieces of bread curving around each other with filling in between them. Of course, the inter-bread ingredients have to be made such that they don't fall out. Or maybe a food for star travelers.

  6. Julian said,

    October 26, 2024 @ 1:32 am

    Editor here
    Is there some academic custom that says abstracts aren't allowed to have paragraph breaks?

  7. davep said,

    October 26, 2024 @ 6:49 am

    “Encoding” analogous to “context”.

    That is, like “helix sandwich” is meaningless without context, strings are meaningless without (knowing) the encoding method.

    The comic is saying one needs to preserve information about (keep in mind) context and encoding.

    I agree with Nat that trying to find meaning in “helix sandwich” is missing the point.

  8. ohwilleke said,

    October 26, 2024 @ 4:13 pm

    I frequently blog academic papers with a quote of a full abstract and routinely insert paragraph breaks into them to make them more readable. There is an unstated convention, however, that an abstract is a one paragraph summary of a larger paper. A minority approach puts headings corresponding to the main components of a paper like methods and conclusions, with a sentence or so in each.

  9. Anthea Fleming said,

    October 27, 2024 @ 1:27 am

    D.O.'s helical sandwich sounds as if it is closely related to the Asparagus Roll, which my mother's generation thought was acceptable party food. Always, alas, made with tinned asparagus.

  10. Haamu said,

    October 27, 2024 @ 12:04 pm

    One could say that a DNA/RNA [double] helix is a message separated from its context — often literally so, as in the case of crime scenes, buckets in the back of cupboards in Melbourne, and so on.

    Of course, as a possible explanation here, I wouldn't push this too far, other than to note that Randall has had this general topic on his mind recently. (See the previous comic.)

  11. Idran said,

    October 27, 2024 @ 4:35 pm

    The joke in the mouse-over text is entirely unrelated to the joke in the comic panel. The mouse-over text is referring to the common problem of characters getting misparsed because the wrong string encoding gets defined for a webpage. It's doing so by using the incredibly common example of Unicode character U+2019, "Right Single Quotation Mark", which many word processors use for the apostrophe instead of U+0027, "Apostrophe", since the former has a slight right-leaning slant that looks more like the traditional hand-written apostrophe. However, if a webpage's string encoding is set to CP-1252 instead of UTF-8, that character is mistakenly displayed as "â€™". You used to see this specific misparsing happen all the time on the internet because people would just copy text directly out of word processors and into websites they were building without either replacing the character or setting the string encoding correctly. It's less common today, but it still sometimes happens.

    You can even Google for the specific string "â€™" and see tons of webpages describing that particular problem and how to fix it.
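
For anyone who wants to watch the misparsing happen, here is a minimal Python sketch of the mechanism described above. The helper name `misread_as_cp1252` is made up for illustration, and the sketch assumes Python's built-in `cp1252` codec is a fair stand-in for what a browser does when a page is labelled with the wrong encoding. Running the misreading twice also appears to reproduce the longer string reported in comment 1.

```python
# Illustrative sketch only: the helper name below is hypothetical.
APOSTROPHE = "\u2019"  # U+2019 RIGHT SINGLE QUOTATION MARK


def misread_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as if they were CP-1252."""
    return text.encode("utf-8").decode("cp1252")


# One wrong decode: the three UTF-8 bytes E2 80 99 become three characters.
once = misread_as_cp1252(APOSTROPHE)

# A second wrong decode on top of the first (the "even worse" case).
twice = misread_as_cp1252(once)

# Reversing a single misreading recovers the original character.
repaired = once.encode("cp1252").decode("utf-8")

print("you" + once + "ve")
print("you" + twice + "ve")
print(repaired == APOSTROPHE)
```

The repair in the last step only works when every misread character exists in CP-1252; some UTF-8 byte values have no CP-1252 mapping in Python's codec, so this round trip is not guaranteed for arbitrary text.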

  12. Idran said,

    October 27, 2024 @ 4:41 pm

    Small correction in hindsight: when I said entirely unrelated, I meant that it wasn't referring to a specific possible interpretation of "Sandwich Helix" or something like that; @davep is right that it's on the same _theme_ as the panel joke.

    Also, as a follow-up note, the reason @Philip Taylor saw what he saw was because Randall was faking the common error by manually entering the string that U+2019 gets misparsed to. :P

  13. Julian said,

    October 27, 2024 @ 8:19 pm

    @ohwilleke
    "There is an unstated convention, however, that an abstract is a one paragraph summary…"
    So if you need a 300 word summary you just make it a 300-word paragraph. Got it. Makes perfect sense….
    Funnily enough, just the other day I was reading an article about the pitfalls of teaching to the test.
