Embuggerance & Feisty

Problems with Google's metadata are a recurrent theme here on Language Log. Now on his blog Stephen Chrisomalis reports a stunning cascade of screw-ups that led to Google Scholar producing the following citation:

Embuggerance, E., and H. Feisty. 2008. The linguistics of laughter. English Today 1, no. 04: 47-47.


  1. Mark Liberman said,

    October 22, 2009 @ 9:56 am

    For the benefit of future scholars, a screenshot:

    And Google Scholar's BibTex citation:

    title={{The linguistics of laughter}},
    author={Embuggerance, E. and Feisty, H.},
    journal={English Today},
    publisher={Cambridge Univ Press}

    Perhaps we should continue the tradition of metonymic names for new linguistic natural kinds (e.g. eggcorn and crash blossom), and use embuggerance for cases where the automatic tagging of entities and relations goes astray.

    Quantitatively, the embuggerance rate (or simply the embuggerance) would be the false positive rate — number of false positives divided by the sum of the numbers of true positives and false positives — or one minus the precision.

    And of course, individual tagging errors (of the false-positive variety) would be embuggerances.

    I should add that all automatic tagging methods with decent coverage (= "recall") have at least a moderate embuggerance rate; and non-automatic tagging methods generally have poor coverage, as well as their own sorts of errors. Still, this looks like a case where some algorithm improvement would be in order.
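    The arithmetic behind that definition can be sketched in a few lines. This is only an illustration: the function names and the counts in the usage note are hypothetical, not taken from any real tagger. Note also that statistics texts usually call FP / (TP + FP) the false discovery rate, and tend to reserve "false positive rate" for FP / (FP + TN); the sketch follows the definition given in the comment above.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of asserted tags that are correct: TP / (TP + FP)."""
    asserted = true_positives + false_positives
    if asserted == 0:
        raise ValueError("no positive tags, so precision is undefined")
    return true_positives / asserted


def embuggerance_rate(true_positives: int, false_positives: int) -> float:
    """Fraction of asserted tags that are wrong: FP / (TP + FP).

    Hypothetical name for the quantity described above; it is equal to
    one minus the precision.
    """
    asserted = true_positives + false_positives
    if asserted == 0:
        raise ValueError("no positive tags, so the rate is undefined")
    return false_positives / asserted
```

    With made-up counts of 90 correct author tags and 10 spurious ones, `embuggerance_rate(90, 10)` and `precision(90, 10)` add up to one, as the definition requires.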

  2. Amy Reynaldo said,

    October 22, 2009 @ 10:10 am

    And use feisty embuggerance for particularly delicious examples of crazy tagging.

  3. Sili said,

    October 22, 2009 @ 10:14 am

    Not a bad collective noun either: an embuggerance of googlers.

    Or googlists, perhaps. Googletonians?

  4. Stephen Chrisomalis said,

    October 22, 2009 @ 11:56 am

    Thanks for the link! I second the motion to adopt embuggerance or feisty embuggerance to refer to auto-tagging monstrosities like this one.

  5. Mertseger said,

    October 22, 2009 @ 12:05 pm

    There's an even deeper irony here, of course, in that the 'E' in "E. Embuggerance" came from "Escalate". Thus, I'd say that it appears Google's mission in the metadata is to escalate embuggerance.

    I know that I shall now make every effort to promote the meme of Dr. Escalate Embuggerance and Dr. Holistic Feisty.

  6. Zoe Larivelt said,

    October 22, 2009 @ 2:19 pm

    This is brilliant. Thank you.

  7. dr pepper said,

    October 22, 2009 @ 3:05 pm

    Milord, the depravity of the accused is beyond the capacity of any civilized man to fathom. He has confessed, no, he has boasted of his crimes. When confronted by the constables over the report of embuggerance he replied, and I quote, "mere embuggerance is for peasants! I insist on feisty embuggerance!"

  8. Lazar said,

    October 22, 2009 @ 3:06 pm

    For years I've been looking for a word to top "embiggen", and now I've found it – "embugger(ance)"!

  9. Bobbie said,

    October 22, 2009 @ 3:15 pm

    Years ago I had to compose a satire for a writing class. I ended up with a "research paper" which used the names of various "authors" in the citations. I remember that one cited study was written by Fitz and Startz…. and another by Far Blonjet. But the authors' names of Embuggerance and Feisty are way beyond my attempts at fiction!

  10. Peter Taylor said,

    October 22, 2009 @ 5:51 pm

    Eggcorn and crash blossom have the virtue of not having prior existence as noun phrases. The same cannot be said of embuggerance, so choosing it as a name is deliberately choosing to be ambiguous (at best). I therefore propose that "Embuggerance and Feisty" is a more suitable name for this kind of error. People who are familiar with the words individually but not with the phrase will fail to parse it and so realise that they're missing something, rather than finding themselves trying to work out why a metadata issue is comparable to e.g. Alzheimer's syndrome.

    [(myl) On the other hand, this is a proposal for a new technical term to join a group that already includes precision, recall, sensitivity, and specificity, all of which are terms of ordinary language as well as terms of art in the domain of statistical classification.]

  11. Mark Liberman said,

    October 22, 2009 @ 9:12 pm

    We should not fail to note that Michael Quinion covered "embuggerance factor" at World Wide Words back in 2001, along with brief mentions of the other Australian military slang terms "a poofteenth of stuff-all" and "oh-dark-hundred".

  12. Simon Cauchi said,

    October 22, 2009 @ 10:03 pm

    Australian military slang terms . . . "oh-dark-hundred".

    An hour when soldiers get up—as Izaak Walton put it—"so early, that they prevent the Sun-rising".

  13. Simon Cauchi said,

    October 22, 2009 @ 10:22 pm

    Oops. Lift your eyes off the page for a second and you misquote. Walton wrote (of otter-hunters) that they are up and about "so early, that they intend to prevent the Sun-rising".

  14. Graeme said,

    October 23, 2009 @ 6:52 am

    Bugger has long since come unmoored from its homophobic roots. In Australian usage it came to mean 'stuff up' as in 'to bugger up'. And, like 'bastard', once a noun inviting a duel, it can as much be a term of grudging endearment as a pejorative (as in 'how are ya, ya old bugger/bastard?')

  15. The Volokh Conspiracy » Blog Archive » Profs. Embuggerance & Feisty said,

    October 23, 2009 @ 8:58 am

    […] explains how this came about. Thanks to Language Log for the pointer. Categories: […]

  16. links for 2009-10-23 « a historian’s craft said,

    October 23, 2009 @ 10:00 am

    […] Language Log » Embuggerance & Feisty ah, those well known scholars Embuggerance & Feisty (2008). Cambridge University Press, no less. (tags: hilarious meta google fail languagelog) […]

  17. Etl World News | EMBUGGERANCE AT GOOGLE SCHOLAR. said,

    October 23, 2009 @ 11:37 am

    […] around, not coddled, if this sort of thing is to stop. (I got to Chrisomalis's post via the Log; In case you're curious, the actual author of "The linguistics of laughter" is […]

  18. Jim said,

    October 23, 2009 @ 3:10 pm

    Am I the only one who sees the word "bugger" embedded in there, as in, um, "to fuck in the ass, usually in an unpleasant or non-consensual way"? Or am I just the only one willing to note that?

    Not that it matters, I suppose. False metadata tagging could be seen as a metaphoric parallel to non-consensual anal intercourse. Getting fucked over, that is.

  19. Martin Watts said,

    October 25, 2009 @ 5:40 am

    I remember reading about the word embuggerance back in the '80s in "The Observer", possibly in John Ayto's column. This was about British military slang and referred to the difficulty of operating at the end of a very long supply chain – particularly getting something done in the Falklands. The idea was that anything that normally would be a minor inconvenience becomes an embuggerance and causes real problems.

  20. Sandra Wilde said,

    October 28, 2009 @ 6:00 pm

    Could somebody explain? How did Google produce this? I'm just not getting the joke.

    [(myl) Stephen Chrisomalis explains it. Basically, elsewhere on the page where this short review appears, there was a line that read

    embuggerance, elevate, feisty, holistic,

    as part of a list of words treated in a different review. Apparently Google's software for finding authors is triggered by patterns like

    Yuan, Jiahong, Liberman, Mark,

    which it then interprets as naming "Jiahong Yuan and Mark Liberman". Applied somewhat promiscuously in this case, it yielded the author pair

    Elevate Embuggerance and Holistic Feisty

    which was appropriately abbreviated to "E. Embuggerance and H. Feisty". Unwise? In this case, yes. A common type of problem with statistically trained pattern-matching software for information extraction? Also yes.]
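    To make the failure mode concrete, here is a toy sketch of an over-eager "Last, First, Last, First," matcher. It is emphatically not Google's code, which is not public and was presumably statistically trained; the regular expression and the function names are hypothetical stand-ins chosen only to reproduce the behaviour described above.

```python
import re
from typing import List

# Hypothetical pattern: four comma-terminated words, read as
# "Last, First, Last, First," with no check that they are names at all.
PAIR_LIST = re.compile(r"(\w+), (\w+), (\w+), (\w+),")


def guess_authors(line: str) -> List[str]:
    """Return two "First Last" strings if the line fits the pattern, else []."""
    match = PAIR_LIST.search(line)
    if match is None:
        return []
    last1, first1, last2, first2 = match.groups()
    return [f"{first1} {last1}", f"{first2} {last2}"]


def abbreviate(name: str) -> str:
    """Turn "first last" into citation style, e.g. "F. Last"."""
    first, last = name.split(" ", 1)
    return f"{first[0].upper()}. {last.capitalize()}"
```

    Fed "Yuan, Jiahong, Liberman, Mark," the sketch returns the intended pair of authors. Fed "embuggerance, elevate, feisty, holistic," it just as happily returns two "authors", and `abbreviate` turns them into "E. Embuggerance" and "H. Feisty". The pattern is satisfied by the shape of the line alone, which is why a list of vocabulary items passes as a list of people.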

  21. Steven E. Landsburg said,

    November 10, 2009 @ 10:48 am

    I've posted another example at http://www.thebigquestions.com/2009/11/10/on-the-amazon/, where one of my readers pointed me here. Thanks for sharing this.

  22. Lorna’s JISC CETIS blog » When automatic metadata generation goes bad… said,

    November 24, 2009 @ 7:08 am

    […] curious incident was originally reported by Stephen Chrisomalis and subsequently picked up by Language Log. The comments on the latter post are particularly entertaining. Mark Liberman helpfully provides […]
