The architecture of speech


Or maybe it should be the sound pattern of architecture? Anyhow, Ariel Goldberg sends this interesting demonstration of the fact that Google Books still sometimes gets jiggy with its category choices:

Things have gotten better (though maybe less poetic?) since Geoff Nunberg pointed out ("Google Books: A Metadata Train Wreck", 8/29/2009) that

William Dwight Whitney's 1891 Century Dictionary is classified as "Family & Relationships," along with Mencken's The American Language. A French edition of Hamlet and a Japanese edition of Madame Bovary both classified as "Antiques and Collectibles." An edition of Moby Dick is classed under "Computers": a biography of Mae West classified as "Religion"; The Cat Lover's Book of Fascinating Facts falls under "Technology & Engineering." A 1975 reprint of a classic topology text is "Didactic Poetry"; the medievalist journal Speculum is classified "Health & Fitness." […]

Of the first ten hits for Tristram Shandy, four are classified as fiction, four as "Family & Relationships," one as "Biography & Autobiography," and one is not classified. Other editions of the novel are classified as "Literary Collections," "History," and "Music." The first ten hits for Leaves of Grass are variously classified as "Poetry," "Juvenile Nonfiction," "Fiction," "Literary Criticism," "Biography & Autobiography," and mystifyingly, "Counterfeits and Counterfeiting."

Various editions of Jane Eyre are classified as "History," "Governesses," "Love Stories," "Architecture," and "Antiques & Collectibles" ("Reader, I marketed him").

 



13 Comments

  1. Lars said,

    April 11, 2018 @ 2:27 pm

    It even sometimes "helpfully" fixes up titles which it thinks are journal references. Åke Edwardson's detective story "Rum nummer 10" becomes "Room, Issue 10".
    https://books.google.dk/books?id=O4INAwAAQBAJ&pg=PA332&lpg=PA332&dq=Christer+B%C3%B6rge&source=bl&ots=qlNPXdHyVN&sig=kzw6to9GN122G5WhvAGQBUd7uok&hl=en&sa=X&ved=0ahUKEwjt6paiz7DSAhUFkywKHSRVDRUQ6AEIWzAL#v=onepage&q=Christer%20B%C3%B6rge&f=false

    (hoping the link makes it through)

  2. 번하드 said,

    April 11, 2018 @ 3:04 pm

    Well, don't be too hard on Google.
    I fondly remember accidentally finding Richard Harris' "Roadmap to Korean" (btw an excellent read) in the geography section of a brick-and-mortar bookstore in Seoul.

  3. S Frankel said,

    April 11, 2018 @ 3:13 pm

    Maybe the Chomsky category was influenced by the influential architecture book "A Pattern Language" (Christopher Alexander et al.)

    "Leaves of Grass" does, actually, sound like a counterfeiting process. And all of the categorizations of Tristram Shandy are obviously correct, including "uncategorized."

  4. AntC said,

    April 11, 2018 @ 4:44 pm

    Dead trees categories/sorting can do that too: Author indexes put Mc-/Mac- prefixed names at the start of the 'M's. Like Macchiavelli, the well-known Celt.

  5. Daniel Barkalow said,

    April 11, 2018 @ 4:52 pm

    Google's probably confused by it having the same subject matter as other MIT Press books like "The Architecture of the Language Faculty" and "The View from Building 20".

  6. David Eddyshaw said,

    April 11, 2018 @ 5:42 pm

    Macchiavelli is a perfectly normal Glaswegian name.

  7. David Eddyshaw said,

    April 11, 2018 @ 5:46 pm

    McChiavelli.

  8. martin schwartz said,

    April 11, 2018 @ 7:10 pm

    Years before computers could be blamed, the UC Berkeley Library had
    Jan Gonda's The Vision of the Vedic Poets in the Optometry Library.
    Ca. 1960 the late classicist Israel Drabkin phoned the original Barnes and Noble, a wonderful huge bookstore, to ask if they had Vergil's complete works; the clerk reported that they didn't, but did have an opera by him.
    As for Macchiavelli, I long thought that Matchabelli Perfume was a sexual pun on Macchiavelli, but I learned that it was originally synthesized by Prince (!) Georges Matchabelli, a Georgian nobleman in the US.
    Martin Schwartz

  9. Y said,

    April 11, 2018 @ 9:02 pm

    A more alarming problem is inconsistent use of search criteria. If I look for a word which appeared in a book from 1850, the book might appear when the date range is set to 1800-1900, but not if it's set to "up to 1900", that sort of thing.

    I fear that with creeping neglect, Google book search might end up nonfunctional, as has already happened with the search function in Google's usenet archive.

  10. Andreas Johansson said,

    April 12, 2018 @ 7:06 am

    I once tried and failed to explain to a flesh-and-blood librarian why I was surprised to find "The Teutonic Knights: A Military History" under Orders and Societies rather than Military History.

  11. KB said,

    April 12, 2018 @ 7:33 am

    > the fact that Google Books still sometimes gets jiggy with its category choices

    Curious as to what was the intended meaning of "gets jiggy" here?

    [(myl) OED sense 1b. "Mentally agitated or disturbed; crazy."]

  12. J. Goard said,

    April 13, 2018 @ 5:25 am

    @KB:

    I was wondering that, too.

    In the sense of 'be stylish, cool', there's a definite suggestion of socially fearless goofiness — Samuel L. Jackson's coolness is not the "jiggy" kind — but I don't find that enough to use the word for just anything that's weird or sketchy.

  13. Emily said,

    April 14, 2018 @ 12:12 am

    Still more anecdotes of this type here:
    https://archive.org/stream/literaryblunders00wheauoft#page/72/mode/2up
    Relevant: "The elaborate work by Careme, Le Patissier Pittoresque (1842), which contains designs for confectioners, deceived the bookseller from its plates of pavilions, temples, etc., into supposing it to be a book on architecture, and he accordingly placed it under that heading in his catalogue."
