Visual mondegreen?


[This is a guest post by Stephen Plant]

I came across 'connorant' the other day, as in “gannets, connorants, vultures” in Ulysses. It was on the Guardian website. In my Penguin copy of Ulysses (p. 526) it's spelt 'cormorant' (perhaps editions differ?). There are a surprising number of references to 'connorant' online. I suppose the Ulysses connorants have a common ancestor, but the word connorant crops up in scientific journals too — here and here.

I imagine mistakes such as this were quite common in the days of manuscripts, but my (admittedly amateurish) research has failed to find a grammatical term for this specific form of mistranscription.

Do you happen to know of such a term?

If there isn't one (and assuming there's a demand!) I would like to propose 'connorant'.

 


23 Comments

  1. Philip Taylor said,

    May 23, 2020 @ 5:43 am

    In some fonts (and perhaps in some authors' handwriting) the letter pair "rn" and the single letter "m" are virtually indistinguishable, especially at small size, and I wonder whether "connorant" arose because the reader, previously unfamiliar with the word, simply mis-read the letters therein.

  2. Dick Margulis said,

    May 23, 2020 @ 5:53 am

    A great many backlist books from major publishers are scanned images of earlier, typeset editions for which no digital files exist. It is common to find scannos (there are some variant spellings, but the word is a play on typo) such as the example given. Often, these result from keming, a 2008 coinage by David Friedman (http://www.ironicsans.com/2008/02/idea_a_new_typography_term.html) that has gained traction in the typography world.

    Bottom line, there's already a name for the phenomenon.

  3. Victor Mair said,

    May 23, 2020 @ 6:49 am

    kerning –> keming

    cormorant –> connorant

    The latter may be a scanno, but is it really a keming? Look carefully.

    If it's only a scanno, not a keming, then we still need a name for it, since scannos result from all sorts of other visual misrecognition, not just kemings.

  4. Chips Mackinolty said,

    May 23, 2020 @ 6:53 am

    Scanorant?

  5. Marco said,

    May 23, 2020 @ 6:58 am

    Another very common "scanno" that is, in an E-Book, a clear indication that it has been re-scanned from print material rather than produced from original, digital input, is "modern" => "modem".

  6. mollymooly said,

    May 23, 2020 @ 7:08 am

    minim confusion is in the dictionary

  7. Dan said,

    May 23, 2020 @ 7:28 am

    “Scarmo”? :-)

  8. Dick Margulis said,

    May 23, 2020 @ 7:39 am

    @Victor: I don't understand the distinction you are making. Well, I see that you are actually counting stems and arches and not finding the identical numbers, but that's not the issue here. The issue is that when you're driving past a picket fence at thirty miles an hour, it's hard to count the pickets. Reading at a normal speed for fluent readers is a lot like that. We mostly slow down, if we do so at all, only for unfamiliar words. The bulk of the reading we do (can't speak for Asian scripts, but at least for the Latin alphabet) consists of recognizing the shapes of words and predicting the content of that shape based on the preceding words (much the way predictive text on your phone works), slowing down now and then to check that we have it right. This is the basis for the memes that circulated a few years ago with text that was jumbled seemingly (but not actually) at random and that flattered your intelligence if you could make out the intended text.

    The combination conn is more common than the combination corm, and so it's what one would expect. Throw in the noise inherent in a scanned image of what may not have been particularly good typography and printing in the first place, and cormorant –> connorant is an easy error.

    If you recall the arrangement of Webster's 2nd International Unabridged, it had, at the foot of each page, set off by a horizontal rule, a number of minor entries that didn't warrant inclusion on the main page. One such was fnese (alphabetized in the F's), defined as an obsolescent spelling of sneeze. Well, there was no doubt a citation for it somewhere, but upon investigation the cited text was a typo for ſnese (initial long S) in the first place. The entry was dropped from the Third International. My point is just that errors exist, and we should call them what they are. Just because something is rendered once or a dozen times in text by some ignorant twit is not a reason to perpetuate it as a valid variant. If it catches on, fine. Language evolves. But if it's just a mistake that anyone would want to correct if it be pointed out to them, would it be okay to just toss it into a bucket with related errors (the keming bucket in this case) and not invent a whole new taxon?

  9. Victor Mair said,

    May 23, 2020 @ 8:00 am

    You're talking about perception. I'm talking about construction, which is what the kerning –> keming mistranscription is all about. As someone who has spent his life paying careful attention to the construction of thousands upon thousands of Chinese characters, which often differ only in tiny details, for me the difference between perception and construction is crucial.

  10. ycx said,

    May 23, 2020 @ 8:20 am

    I agree with Dick Margulis that keming and OCR of poor quality scans is likely the cause of both cases of "connorant". There is even a Reddit forum, or "subreddit" /r/keming dedicated to the concept.

    The fans were probably copying the text off a PDF which was automatically OCRed from scans, which had the error, and one look at the source scan for the academic article shows the quality as being extremely low, which likely contributed to the misspelling.

  11. ycx said,

    May 23, 2020 @ 8:29 am

    @Victor I think the issue at hand is that "scannos" are caused by both perception *and* construction, making keming a significant contributory factor, but not necessarily the only cause.

    Modern OCR programs use neural-network and machine learning techniques which are trained on human perception of a corpus of scanned texts, which means that they are likely to follow human perception to a significant degree.

    At the same time, construction of the characters also contributes to the issue, since a well-scanned or well-kerned typeface would make it far less likely for the transcription error to occur while scanning.

    I think "scanno" is a reasonable name to describe the concept of errors introduced during the process of scanning and OCR.

  12. David Douglas ROBERTSON said,

    May 23, 2020 @ 9:44 am

    Ah, the old "dot corn" problem!

  13. Robbie said,

    May 23, 2020 @ 3:23 pm

    The Guardian has long been notorious for typos, hence its nickname of the Grauniad.

    In the QI Forums, we've coined the term "youlgreave" for when you misread something and it's funny. We've found it very useful for years and are trying to spread it into the wild, so everyone here is welcome to use it!

  14. ktschwarz said,

    May 23, 2020 @ 5:11 pm

    @Jonathan Silk: Many people would call that a dangling modifier. Arnold Zwicky has a more specific term, "SPAR (a Subjectless Predicative Adjunct Requiring a referent for the missing subject)", and has written here and on his own blog about how SPARs are often quite clear and sensible in context, particularly if they start with "As". Here's his post on the "As a …" construction.

    If you want everything explicit, you might rephrase the sentence as "As someone who has spent his life etc., I know that the difference between perception and construction is crucial." (Now I'm wondering if there are languages that would grammaticalize that, putting the "As" clause in an evidential mood.)

  15. Julian said,

    May 23, 2020 @ 6:40 pm

    How do OCR programs handle Russian, compared with English? I find printed Russian troublesome in this regard. Most words end up looking like a row of pickets with a few tiny crossbars scattered here and there.

  16. Vance Koven said,

    May 23, 2020 @ 8:36 pm

    In my mind's ear I can hear "connorant" as a potential Irish pronunciation of cormorant. Maybe Joyce was dictating the manuscript?

  17. Peter Taylor said,

    May 24, 2020 @ 4:46 am

    @Vance Koven, Joyce's manuscripts were actual manuscripts. I found a scanned manuscript of the Circe chapter on the National Library of Ireland's website, but the lengthy stage directions which mention various species of birds appear to be completely absent, so I suppose this must be a draft and they were added in a later revision.

  18. Stephen Hart said,

    May 24, 2020 @ 11:25 am

    I also vote for an OCR error.
    I'm reading Chandler's The Simple Art of Murder as an ebook, and it has many similar errors, often involving r and n and m.

  19. Howard said,

    May 24, 2020 @ 8:19 pm

    Has no one considered the possibility that the connorant is a bird in an alternate world, say one envisioned by China Miéville? Perhaps the Keming is still on the throne.

  20. Stephen Plant said,

    May 25, 2020 @ 6:42 am

    If nothing else 'connorant' has provoked an interesting discussion! Many thanks to all those that contributed.

  21. Dara Connolly said,

    May 25, 2020 @ 8:57 am

    Some hilarious examples here:
    https://www.theguardian.com/books/booksblog/2014/may/01/scanner-ebook-arms-anus-optical-character-recognition

    Apologies if this arms/anus confusion has previously been highlighted on Language Log – a quick search didn't find anything.

  22. Francois Lang said,

    May 26, 2020 @ 9:25 am

    Same thing happens in comic-book word balloons, which are typically in FULL CAPS.

    L next to I often looks like U, which is why two completely innocuous words "Clint" and "flick" can't be used.

    https://en.wikipedia.org/wiki/CLiNT

  23. Leo B said,

    June 2, 2020 @ 5:51 pm

    @Julian Also, a zoology term, "зублефар", a mis-scanning of эублефар (leopard gecko, Eublepharis) when one of the Russian dictionaries or encyclopedias was digitized in the late 1990s or early 2000s, was a meme for a while. Another reason for the confusion is that Э/З can be a typo as well: the two keys are quite close on the Russian keyboard layout.
