"It was as if a light had been Nookd…"


Here on Language Log we've often talked about unfortunate search-and-replace miscorrections, which now seem to be infecting poorly edited e-reader texts. The latest example, via Kendra Albert on Jonathan Zittrain's Future of the Internet blog, is a doozy. The Nook edition of Tolstoy's War and Peace (in its English translation) has been de-Kindled, quite literally. Every instance of the text string kindle has been replaced by Nook.



Albert explains:

A company called Superior Formatting Publishing offers a $.99 version of the now-public-domain War and Peace through Barnes and Noble’s Nook store — the lowest price version to be found there. When a blogger named Philip of the Ocracoke Island Journal read his copy, he noticed something quite odd:

“As I was reading, I came across this sentence: ‘It was as if a light had been Nookd in a carved and painted lantern….’ Thinking this was simply a glitch in the software, I ignored the intrusive word and continued reading. Some pages later I encountered the rogue word again. With my third encounter I decided to retrieve my hard cover book and find the original (well, the translated) text.

For the sentence above I discovered this genuine translation: ‘It was as if a light had been kindled in a carved and painted lantern….’”

The Nook version of War and Peace had changed every instance of “kindle” or “kindled” into “Nook” and “Nookd,” not just on Philip’s copy, but on ours too.

The Superior Formatting Publishing version isn’t a Barnes and Noble book, so this isn’t the work of a rogue Nook marketer from B&N.  Rather, it’s likely that Superior Formatting Publishing ported its Kindle version of War and Peace over to the Nook — doing a search and replace to make sure that any Kindle references they’d inserted, such as in the advertising at the end of the book about their fine Kindle products, were simply changed to Nook.

It's always dangerous to enforce an editorial guideline by means of a global search-and-replace, whether that guideline leads you to replace queen with Queen Elizabeth, gay with homosexual, or Cronkite with Mr. Cronkite. (And please don't try to convert 50 Cent into foreign currency.) I have a feeling we're going to be seeing a lot more of these howlers, as e-publishers look to make a quick buck off of repackaging public-domain literary classics but don't bother with the niceties of copy-editing before making texts available electronically.

[Or as Theo Vosse puts it in the comments below, "Don't touch the clbuttics."]

(H/t, Greg Howard.)



23 Comments

  1. Hein Donix said,

    June 2, 2012 @ 2:35 am

    Just about a month ago, there was a similar case in Germany; the Berliner Morgenpost (Berlin Morning Post) and the WELT (lit.: world) shared quite a lot of their online editions' articles. These articles weren't completely identical, however: the string "Welt" was replaced by "Morgenpost Online" in the Morning Post versions, leading to "Morgenpost Onlinerekorden" (Morning Post online records) and "Morgenpost Onlineweiter Produktion" (Morning Post online-wide production).
    Uncovered by BILDblog, a famous German watchblog, this mistake caused a meme (in German) of replacing "world" with "Morning Post online" in well-known phrases: "What a wonderful Morning Post online", "The Morning Post online is not enough", "That's not the end of the Morning Post online", etc.

  2. Simon Tatham said,

    June 2, 2012 @ 3:10 am

    My favourite one of these was noticing the unusual word "htmlirin" in a news article once. After some headscratching I realised this must have been "aspirin" originally, and had to have been caused by a change of web server technology which turned all the filenames ending in .asp into ones ending in .html – and it didn't occur to them to ensure the search-and-replace that corrected crosslinks didn't also operate on the text outside the HTML tags.

    Having worked that out, I discovered that if you google you can find pages containing a lot of other curious words from this family, such as rhtmlberry and tehtmloon.

  3. Bruce Stephens said,

    June 2, 2012 @ 5:00 am

    Presuming the book has been mangled as described, it seems obvious that somebody screwed up while trying to dekindle it.

    I'm surprised that anybody thought it was worth trying. I wonder where they got a text that they thought might have mentions of Kindle that could usefully be changed.

  4. Theo Vosse said,

    June 2, 2012 @ 5:33 am

    Don't touch the clbuttics.
    [rimshot]

  5. Dick Margulis said,

    June 2, 2012 @ 5:34 am

    "I have a feeling we're going to be seeing a lot more of these howlers, as e-publishers look to make a quick buck off of repackaging public-domain literary classics but don't bother with the niceties of copy-editing before making texts available electronically."

    I believe this may be the first time anything positive has been said on Language Log, unprompted by a whiny comment, about the craft of the lowly copyeditor. The apocalypse is truly upon us.

  6. Faldone said,

    June 2, 2012 @ 6:55 am

    In a former lifetime one of my job responsibilities was copy-editing engineering reports. I would frequently find "login" used for the verb sense "log in". I couldn't do a blanket find-and-replace since "login" was perfectly valid in a noun sense and, more importantly, as a string in text to be used in a computer program. I had to go in and look at each individual instance to decide whether it should be changed. I can't imagine having to do that with something like "War and Peace."

  7. GeorgeW said,

    June 2, 2012 @ 7:10 am

    We may soon be finding 'in every kindle and cranny' in Amazon offerings.

  8. Theodore said,

    June 2, 2012 @ 8:40 am

    With respect to War and Peace in particular, there's of course a completely free EPUB (i.e. Nook-compatible) version on Project Gutenberg. I guess the $0.99 is for the convenience of obtaining it through the BN store.

  9. Henning Makholm said,

    June 2, 2012 @ 9:25 am

    Text editors typically have "replace all" commands that work in one of two ways: Either they replace every occurrence of the search string anywhere in the document, or they replace every occurrence found between the current cursor position and the end of the document.

    This matches well with the hypothesis that the intent was to rewrite the advertising at the end of the book about their fine Kindle products. Probably the person doing the search-and-replace thought his program worked on a "from here to the end" basis but actually got the "everywhere" semantics.

    This problem is exacerbated by the fact that some text editors (of either kind) don't bother to jump to the last instance replaced after the operation finished, which would have alerted the user to the over-application.

  10. Dave said,

    June 2, 2012 @ 9:28 am

    @Simon: Thanks very much—this is fun!

    I'm enjoying reading about a dead person's hand still clhtmling the gun, about lovers grhtmling each other, and about Jews living in the Dihtmlora. I even found a record of you finding Rhtmlutin!

    Unfortunately, nobody seems to be concerned with the problem of blhtmlhemy these days.

  11. Emily said,

    June 2, 2012 @ 10:33 am

    Google also gets results for "htmlerger syndrome" and "htmliration."
    Sadly, there are none for "htmlirated consonants" or "Cleopatra was bitten by an html"… yet.

  12. Rachael said,

    June 2, 2012 @ 12:11 pm

    There's a certain series of parenting books which are popular in the US, which have been superficially localised into British English by search-and-replacing "diaper" with "nappy", "crib" with "cot", etc. So it advises you to help your child's language development by "descoting" things to them.

  13. Richard Sabey said,

    June 2, 2012 @ 1:04 pm

    @Henning Makholm Not only that, any decent search-and-replace can be made to search case-sensitively, and to search for the given search string as a word (rather than as a string anywhere in the text, e.g. in mid-word). Thus the Nook and html cock-ups betray particularly inept use of search-and-replace: searches for the specific words Kindle and asp (case-sensitive w.r.t. Kindle) would not have caused so much damage.

  14. John said,

    June 2, 2012 @ 2:20 pm

    Honestly, does nobody know how to put spaces around their replace queries? " Kindle " does not match "kindled", and ".asp " does not match "Aspirin". A case-sensitive search for " Nook " will avoid replacing the second word in "every nook and cranny", though might need manual approval in a text whose main character is named Nook.

  15. Yakusa Cobb said,

    June 2, 2012 @ 2:52 pm

    @John

    " Kindle " does not match "kindled"…

    Nor does it match:
    "Kindle is great" (no leading whitespace)
    "Powered by Kindle." (no trailing whitespace)

  16. Rubrick said,

    June 2, 2012 @ 2:52 pm

    Ah, the dawizard a little laziness can cause….

  17. codeman38 said,

    June 3, 2012 @ 10:12 am

    One of my favorite search-and-replace gaffes, which I discovered a decade ago yet which is still going strong in Google results a decade later, is when someone decides to change the hard-coded color scheme of an old non-CSS-compliant web site and doesn't pay enough attention to word boundaries in doing so.

    Or, as I put it back then: search and replace should be consideyellow harmful. Or is that considegreen? Consideblue?

  18. Emily said,

    June 3, 2012 @ 6:45 pm

    Besides the rhtmlberries mentioned earlier, there are also rphpberries.

  19. Matt said,

    June 3, 2012 @ 7:04 pm

    I don't think it would be that bad, Faldone; as I recall there's very little accessing of computer systems of any kind in War and Peace.

  20. William Steed said,

    June 3, 2012 @ 7:04 pm

    This problem is common to most of the public domain texts available from online stores. There are a number of publishers (which have been noted here on LL before) who publish extracts from Wikipedia and so on as e-books. The same publishers sell or make available poorly OCRed texts (in much worse condition than the Gutenberg Press editions) hacked into Kindle or Nook format. If you read their reviews, many of the complaints are about obvious artefact mistakes, missing chapters, cut sections and the obvious lack of editing.

  21. Brian said,

    June 4, 2012 @ 7:33 am

    Do these companies have IT people? Have any of them heard of a "regular expression"? (Most programming languages' "regular expressions" are not in fact regular expressions.) Global replace with a regexp is still probably a bad idea, but at least s/\.asp\b/\.html/g would be a lot better than blindly replacing "asp" with "html".

  22. Fiorentino said,

    June 5, 2012 @ 7:51 am

    "Nookd"

    Sounds like a perfectly cromulent word to me.

  23. Mark Mandel said,

    June 10, 2012 @ 3:07 pm

    Aftpoles.
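
Several commenters above suggest case-sensitive, whole-word matching (Richard Sabey), point out that padding the query with spaces misses "Powered by Kindle." (Yakusa Cobb), and offer the regular expression `\.asp\b` (Brian). The following is a minimal sketch of those suggestions in Python, not a reconstruction of what any publisher actually ran. The function names `replace_brand` and `rewrite_asp_links` and the sample strings are invented for illustration.

```python
import re


def replace_brand(text: str, old: str, new: str) -> str:
    """Replace `old` only where it stands alone as a word, matching case exactly.

    Hypothetical helper: \\b is a word boundary, so "Kindle." and "Kindle's"
    match, while "kindled" and "Kindled" do not.
    """
    pattern = re.compile(r"\b" + re.escape(old) + r"\b")
    # A function replacement keeps `new` literal (no backslash interpretation).
    return pattern.sub(lambda match: new, text)


def rewrite_asp_links(html: str) -> str:
    """Rewrite a literal ".asp" extension, but only when a word boundary follows."""
    return re.sub(r"\.asp\b", ".html", html)


novel = "It was as if a light had been kindled in a carved and painted lantern."
advert = "Powered by Kindle. Kindle is great."
page = '<a href="news.asp">aspirin</a>'

print(replace_brand(novel, "Kindle", "Nook"))   # novel text is left untouched
print(replace_brand(advert, "Kindle", "Nook"))  # both brand mentions change
print(rewrite_asp_links(page))                  # link changes, "aspirin" does not
```

Even this is no substitute for reading the result: a sentence that opens with the imperative "Kindle the fire" would still be rewritten, which is why Brian's caveat that a global replace "is still probably a bad idea" stands.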
