## "It was as if a light had been Nookd…"

Here on Language Log we've often talked about unfortunate search-and-replace miscorrections, which now seem to be infecting poorly edited e-reader texts. The latest example, via Kendra Albert on Jonathan Zittrain's Future of the Internet blog, is a doozy. The Nook edition of Tolstoy's War and Peace (in its English translation) has been de-Kindled, quite literally. Every instance of the text string "kindle" has been replaced by "Nook".

Albert explains:

9. ### Henning Makholm said,

June 2, 2012 @ 9:25 am

Text editors typically have "replace all" commands that work in one of two ways: Either they replace every occurrence of the search string anywhere in the document, or they replace every occurrence found between the current cursor position and the end of the document.

This matches well with the hypothesis that the intent was to rewrite the advertising at the end of the book about their fine Kindle products. Probably the person doing the search-and-replace thought his program worked on a "from here to the end" basis but actually got the "everywhere" semantics.

This problem is exacerbated by the fact that some text editors (of either kind) don't bother to jump to the last instance replaced once the operation has finished, which would have alerted the user to the over-application.
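
To make the two "replace all" behaviours concrete, here is a minimal Python sketch. It is not any particular editor's implementation: the function names, the cursor position and the closing advertisement line are all invented for illustration, and the match is assumed to be case-insensitive, since "kindled" became "Nookd".

```python
import re

# Assumption: the fateful search ignored case, so "kindle" also matches "Kindle".
PATTERN = re.compile("kindle", re.IGNORECASE)


def replace_everywhere(text: str, new: str) -> str:
    """'Everywhere' semantics: replace every match in the whole document."""
    return PATTERN.sub(new, text)


def replace_from_cursor(text: str, new: str, cursor: int) -> str:
    """'From here to the end' semantics: leave everything before the cursor alone."""
    return text[:cursor] + PATTERN.sub(new, text[cursor:])


BODY = "It was as if a light had been kindled. "
AD = "Read more on your Kindle."  # invented stand-in for the end-of-book advertising
book = BODY + AD
cursor = len(BODY)  # hypothetical cursor parked at the start of the advertising

everywhere = replace_everywhere(book, "Nook")
from_cursor = replace_from_cursor(book, "Nook", cursor)

print(everywhere)   # the novel's text is damaged along with the ad
print(from_cursor)  # only the ad is rewritten
```

With the cursor placed at the start of the advertising, only the second function leaves Tolstoy's sentence intact.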

10. ### Dave said,

June 2, 2012 @ 9:28 am

@Simon: Thanks very much—this is fun!

I'm enjoying reading about a dead person's hand still clhtmling the gun, about lovers grhtmling each other, and about Jews living in the Dihtmlora. I even found a record of you finding Rhtmlutin!

Unfortunately, nobody seems to be concerned with the problem of blhtmlhemy these days.

11. ### Emily said,

June 2, 2012 @ 10:33 am

Google also gets results for "htmlerger syndrome" and "htmliration."
Sadly, there are none for "htmlirated consonants" or "Cleopatra was bitten by an html"… yet.

12. ### Rachael said,

June 2, 2012 @ 12:11 pm

There's a certain series of parenting books which are popular in the US, which have been superficially localised into British English by search-and-replacing "diaper" with "nappy", "crib" with "cot", etc. So it advises you to help your child's language development by "descoting" things to them.

13. ### Richard Sabey said,

June 2, 2012 @ 1:04 pm

@Henning Makholm Not only that, any decent search-and-replace can be made to search case-sensitively, and to search for the given search string as a word (rather than as a string anywhere in the text, e.g. in mid-word). Thus the Nook and html cock-ups betray particularly inept use of search-and-replace: searches for the specific words Kindle and asp (case-sensitive w.r.t. Kindle) would not have caused so much damage.
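
As a rough illustration of a case-sensitive, whole-word search, here is a small Python sketch using the standard `re` module. The helper name `replace_word` and the sample sentences are made up for the example.

```python
import re


def replace_word(text: str, word: str, new: str) -> str:
    """Case-sensitive replacement of `word` only where it stands as a whole word."""
    # re.escape guards against regex metacharacters in the search word;
    # the word-boundary anchors stop matches in mid-word.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    return pattern.sub(new, text)


kindle_sample = "A light had been kindled. Buy a Kindle today."
asp_sample = "asp and aspirin and grasping"

kindle_result = replace_word(kindle_sample, "Kindle", "Nook")
asp_result = replace_word(asp_sample, "asp", "html")

print(kindle_result)  # A light had been kindled. Buy a Nook today.
print(asp_result)     # html and aspirin and grasping
```

Note that a whole-word match also skips inflected forms such as "Kindles", which may or may not be what the editor wants.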

14. ### John said,

June 2, 2012 @ 2:20 pm

Honestly, does nobody know how to put spaces around their replace queries? " Kindle " does not match "kindled", and ".asp " does not match "Aspirin". A case-sensitive search for " Nook " will avoid replacing the second word in "every nook and cranny", though might need manual approval in a text whose main character is named Nook.

15. ### Yakusa Cobb said,

June 2, 2012 @ 2:52 pm

@John

" Kindle " does not match "kindled"…

Nor does it match:
"Kindle is great" (no leading whitespace)
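
The gap between the space-padded query and a proper word-boundary match can be checked with a short Python sketch. The sample sentences other than "Kindle is great" are invented, and the two function names are hypothetical.

```python
import re

samples = [
    "I read it on a Kindle yesterday.",  # surrounded by spaces
    "Kindle is great",                   # no leading whitespace
    "I love my Kindle.",                 # followed by punctuation
    "a light had been kindled",          # must stay untouched
]


def padded_replace(text: str) -> str:
    """The space-padded query: matches only ' Kindle ' with a space on each side."""
    return text.replace(" Kindle ", " Nook ")


def boundary_replace(text: str) -> str:
    """Word-boundary query: matches 'Kindle' next to spaces, punctuation or line edges."""
    return re.sub(r"\bKindle\b", "Nook", text)


padded = [padded_replace(s) for s in samples]
bounded = [boundary_replace(s) for s in samples]

for before, p, b in zip(samples, padded, bounded):
    print(f"{before!r}\n  padded:   {p!r}\n  boundary: {b!r}")
```

The padded query catches only the first sample; the boundary query catches the first three and still leaves "kindled" alone.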

16. ### Rubrick said,

June 2, 2012 @ 2:52 pm

Ah, the dawizard a little laziness can cause….

17. ### codeman38 said,

June 3, 2012 @ 10:12 am

One of my favorite search-and-replace gaffes, which I discovered a decade ago yet which is still going strong in Google results a decade later, is when someone decides to change the hard-coded color scheme of an old non-CSS-compliant web site and doesn't pay enough attention to word boundaries in doing so.

Or, as I put it back then: search and replace should be consideyellow harmful. Or is that considegreen? Consideblue?

18. ### Emily said,

June 3, 2012 @ 6:45 pm

Besides the rhtmlberries mentioned earlier, there are also rphpberries.

19. ### Matt said,

June 3, 2012 @ 7:04 pm

I don't think it would be that bad, Faldone; as I recall there's very little accessing of computer systems of any kind in War and Peace.

20. ### William Steed said,

June 3, 2012 @ 7:04 pm

This problem is common to most of the public domain texts available from online stores. There are a number of publishers (which have been noted here on LL before) who publish extracts from Wikipedia and so on as e-books. The same publishers sell or make available poorly OCRed texts (in much worse condition than the Project Gutenberg editions) hacked into Kindle or Nook format. If you read their reviews, many of the complaints are about obvious artefact mistakes, missing chapters, cut sections and the obvious lack of editing.

21. ### Brian said,

June 4, 2012 @ 7:33 am

Do these companies have IT people? Have any of them heard of a "regular expression"? (Most programming languages' "regular expressions" are not in fact regular expressions.) Global replace with a regexp is still probably a bad idea, but at least s/\.asp\b/\.html/g would be a lot better than blindly replacing "asp" with "html".
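
For comparison, here is a hedged Python rendering of that substitution next to the blind replacement. The sample sentence and the names `rewrite_links` and `blind_replace` are invented; only the pattern `\.asp\b` comes from the comment above.

```python
import re

# Same pattern as in s/\.asp\b/\.html/g: a literal dot, "asp", then a word boundary.
ASP_EXTENSION = re.compile(r"\.asp\b")


def rewrite_links(text: str) -> str:
    """Replace the '.asp' extension only, leaving other occurrences of 'asp' alone."""
    return ASP_EXTENSION.sub(".html", text)


def blind_replace(text: str) -> str:
    """The careless version: every 'asp' substring becomes 'html'."""
    return text.replace("asp", "html")


page = "Visit index.asp or take an aspirin with raspberries."

careful = rewrite_links(page)
careless = blind_replace(page)

print(careful)   # Visit index.html or take an aspirin with raspberries.
print(careless)  # Visit index.html or take an htmlirin with rhtmlberries.
```

The trailing `\b` also keeps a name like `page.aspx` untouched, since the "p" is followed by another word character.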

22. ### Fiorentino said,

June 5, 2012 @ 7:51 am

"Nookd"

Sounds like a perfectly cromulent word to me.

23. ### Mark Mandel said,

June 10, 2012 @ 3:07 pm

Aftpoles.