Latest stock market casualty: consumer dictionary companies?


A recent Associated Press wire story about the declining stock market contained an optimistic note from Phil Orlando, chief equity market strategist at Federated Investors. Orlando says the market is in decent shape, with two exceptions:

"Our view has been that the market, generally speaking, is in pretty good shape with the exception of the financial service companies and the consumer dictionary companies," he said.

The consumer dictionary companies? Are Merriam-Webster, American Heritage, et al. in trouble? Will they be needing a massive bailout from the Federal Reserve? Our lexicographical colleagues need not worry, since the AP article appears to be reflecting a different kind of dictionary trouble: the dreaded Cupertino effect.

The text above is how the article appears in many online sources, including Google News, Yahoo News, Time, Forbes, the Washington Post, the New York Times, the Boston Globe, the Houston Chronicle, the Denver Post, and the AP's own website. So it must be true, right? Not if you check the AP story in the Washington Times or the Rocky Mountain News, where the crucial passage reads slightly differently:

"Our view has been that the market, generally speaking, is in pretty good shape with the exception of the financial service companies and the consumer discretionary companies," he said.

Kudos to the copy editors at those two newspapers for catching the error that their competitors missed. This certainly appears to be yet another "Cupertino," the name we use for spellchecker-induced errors (after the habit older spellcheckers had of suggesting Cupertino as a "correction" for cooperation). Cupertinos come in two flavors: real words miscorrected into other words due to a gap in the spellchecker's dictionary, or misspelled words changed to the wrong word when the correct one is not listed as the spellchecker's first suggestion. This would seem to be an example of the latter, since discretionary is a word we'd expect to be in any good spellchecker's wordlist.

But what was the misspelling that led to a substitution of dictionary when discretionary was intended? Based on the lesson in spellchecker algorithms that Thierry Fontenelle of the Microsoft Natural Language Group provided a few months back, the misspelling would need to have a shorter "edit distance" to dictionary than to discretionary. Since dictionary is simply discretionary with three letters deleted (s, r, and e), two of those three letters would have to be missing from the typo for it to sit closer to dictionary: thus, disctionary, dicrtionary, or dicetionary. Of those, dicetionary seems like the most plausible typo, though that's still a stretch. Any other ideas?
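The edit-distance reasoning above can be checked with a short sketch. This is plain Levenshtein distance (insertions, deletions, and substitutions each cost 1), which is only an assumption about how a spellchecker might rank suggestions; real spellcheckers may also weigh transpositions, keyboard layout, or word frequency. The function name `levenshtein` and the `distances` table are illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via row-by-row dynamic programming."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


# The three candidate typos, each measured against both real words.
candidates = ["disctionary", "dicrtionary", "dicetionary"]
distances = {
    typo: (levenshtein(typo, "dictionary"), levenshtein(typo, "discretionary"))
    for typo in candidates
}

for typo, (to_dictionary, to_discretionary) in distances.items():
    print(f"{typo}: dictionary={to_dictionary}, discretionary={to_discretionary}")
```

Under this simple measure, each candidate is one edit away from dictionary and two edits away from discretionary, so the distance alone cannot single out one typo.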



11 Comments

  1. lm said,

    May 10, 2008 @ 5:46 pm

    Dicsetionary yields dictionary as the first suggestion, at least when I type it in Gmail.

  2. lm said,

    May 10, 2008 @ 5:49 pm

    …as do dircetionary and dircsetionary. That last one is my favorite.

  3. john riemann soong said,

    May 10, 2008 @ 7:14 pm

    I do wonder though — haven't print dictionaries been taking a hit these days? And do most consumers want to pay a *subscription* to a dictionary anyway?

    I never look things up in a print dictionary these days — it's just too troublesome. Why flip back and forth between pages when you can just google it?

  4. Karen said,

    May 10, 2008 @ 10:01 pm

    But Googling doesn't allow you to drift along the page, finding all the other words. Dictionaries are wonderful books.

  5. Garrett Wollman said,

    May 11, 2008 @ 12:04 am

    You inspired me to think a bit about who actually owns those dictionary companies. The OED is of course published by Oxford University Press, one of the Great Universities of England. I recall that Cambridge UP publishes dictionaries as well. Collins is part of Rupert Murdoch's News Corporation empire. _American Heritage_ is apparently owned by Malcolm S. "Steve" Forbes, Jr., but the dictionary of that name is published by Houghton Mifflin. There are two companies publishing "Webster" dictionaries: Random House is part of the (German) Bertelsmann media empire, and Merriam-Webster, formerly G. & C. Merriam, is owned by Encyclopædia Britannica, Inc. and thus (says Wikipedia) by billionaire Jacqui Safra.

    With the exception of Merriam-Webster, the remaining companies are all large publishing houses–among the very largest in their respective niches (OUP and CUP among university presses, HarperCollins and Random House among trade publishers, and Houghton Mifflin Riverdeep in education). None of them seem in any imminent danger, although dictionaries are surely a tiny part of their overall print business.

  6. Janice Huth Byer said,

    May 11, 2008 @ 4:38 am

    Consumer discretionary companies, according to a Google, are those "that depend on consumers to spend excess money", meaning they're properly called consumer discretionary income companies. Could it be that copy editors, who aren't business specialists, chose 'dictionary' over the incomprehensible reduction?

  7. Steven said,

    May 11, 2008 @ 6:29 am

    Without knowing the exact edit distance algorithm, it's impossible to know the incorrect spelling, but my money is on dicsretionary, inverting the sc of discretionary. To get discretionary, you have to invert two letters. To get dictionary, you have to delete a sequence of three.

  8. vic said,

    May 11, 2008 @ 9:28 am

    Actually "disctionary" is not only plausible, but more likely IMO than "dicetionary". I run into an autosave problem quite often where the hand is quicker than the screen. So I usually keep typing when a program suddenly decides that autosave takes priority and freezes the window. The result is usually unaffected, but, occasionally, the pause is long enough that the buffer fills and a few characters get left out. In this scenario, the missing characters must be consecutive. I actually ran all the combinations of missing single letters and flips and every one of them gave "discretionary" as either the only or the first alternative (I just used right click in Firefox). So the proximity theory seems to hold water. There is also a possibility of adding or replacing letters rather than merely omitting. But that seems a remote possibility.

  9. Oskar said,

    May 11, 2008 @ 3:47 pm

    This is a case where we don't have to speculate: calculating the edit distance (properly called Levenshtein distance) is trivial, and every computer science student who studies dynamic programming learns the algorithm (which is stunning, and really fun to program). As such, I figured that there would be an online calculator. Lo and behold, there is!

    http://gtools.org/levenshtein-calculate.php

    I get

    discretionary -> dictionary: 3

    dicrtionary -> dictionary: 2
    dicetionary -> dictionary: 1
    disctionary -> dictionary: 1

    dicrtionary -> discretionary: 2
    dicetionary -> discretionary: 2
    disctionary -> discretionary: 2

    So all three of them seem like plausible answers, with dicetionary and disctionary slightly more likely. As an aside, the Firefox spell-check only gives "dictionary" for all three of these.

    I'm almost certain that I learned a variant of the edit distance algorithm that calculated all words that had a certain edit distance to another word (so you could find all words with edit distance 1 to "dictionary", for instance). That would be really convenient to have in these Cupertino discussions.

  10. Thierry Fontenelle said,

    May 11, 2008 @ 8:12 pm

    Even when one knows the details of the edit distance algorithm, it is impossible to determine what spelling the writer used originally. We can only speculate. Here are the suggestions provided by the spell-checker in Microsoft Office 2007 for the various misspellings cited in this post and in the comments:

    disctionary: -> dictionary
    dicrtionary: -> dictionary, discretionary
    dicetionary: -> dictionary, discretionary
    dicsetionary: -> discretionary
    dircetionary: -> dictionary, discretionary
    dircsetionary: -> discretionary
    dicsretionary: -> discretionary

    It is of course up to the writer to click on the correct suggestion when a typo is squiggled by the speller.

  11. Grant Barrett said,

    May 12, 2008 @ 7:04 am

    "There are two companies publishing "Webster" dictionaries: Random House is part of the (German) Bertelsmann media empire, and Merriam-Webster, formerly G. & C. Merriam, is owned by Encyclopædia Britannica, Inc. and thus (says Wikipedia) by billionaire Jacqui Safra."

    There are at least three major ones and several other minor ones. The third major one is John Wiley & Sons, Inc., which published Webster's New World dictionaries.
