Coco Krumme, "Velvety Chocolate With a Silky Ruby Finish. Pair With Shellfish.", Slate 2/23/2011:
Using descriptions of 3,000 bottles, ranging from $5 to $200 in price from an online aggregator of reviews, I first derived a weight for every word, based on the frequency with which it appeared on cheap versus expensive bottles. I then looked at the combination of words used for each bottle, and calculated the probability that the wine would fall into a given price range. The result was, essentially, a Bayesian classifier for wine. In the same way that a spam filter considers the combination of words in an e-mail to predict the legitimacy of the message, the classifier estimates the price of a bottle using its descriptors.
The analysis revealed, first off, that "cheap" and "expensive" words are used differently. Cheap words are more likely to be recycled, while words correlated with expensive wines tend to be in the tail of the distribution. That is, reviewers are more likely to create new vocabulary for top-end wines. The classifier also showed that it's possible to guess the price range of a wine based on the words in the review.
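For readers who want to see the mechanics, here is a minimal sketch of the kind of word-based Bayesian classifier the quoted passage describes. It is not Krumme's code. The four toy reviews, the two price labels, and the function names `train` and `classify` are all hypothetical. Add-one smoothing is an assumption on my part, since the article doesn't say how unseen words were handled.

```python
import math
from collections import Counter

# Hypothetical toy data: (tasting note, price class). Not from the Slate study.
TOY_REVIEWS = [
    ("juicy fruity good value", "cheap"),
    ("fruity pleasing easy", "cheap"),
    ("elegant intense cassis tobacco", "expensive"),
    ("velvety elegant graphite finish", "expensive"),
]

def train(reviews):
    """Count words per price class and the number of reviews per class."""
    word_counts = {}
    class_counts = Counter()
    for text, label in reviews:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, class_counts

def classify(text, word_counts, class_counts):
    """Return the most probable class and the log-score of every class."""
    vocab = set().union(*word_counts.values())
    total_reviews = sum(class_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        n_words = sum(counts.values())
        # Log prior: share of training reviews in this class.
        score = math.log(class_counts[label] / total_reviews)
        for word in text.lower().split():
            # Add-one (Laplace) smoothing, an assumption of this sketch,
            # keeps unseen words from zeroing out the product.
            score += math.log((counts[word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get), scores

word_counts, class_counts = train(TOY_REVIEWS)
label, _ = classify("elegant tobacco", word_counts, class_counts)
print(label)
```

The per-word log ratios between classes play the role of the "weight for every word" in the quote, and summing them over a review is what turns a combination of descriptors into a guess about price range.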
Winetalk was a LL theme for a while a few years ago:
"The legal treatment of quantifiers", 1/11/2004
"Just a trace of the obligatory rubber", 4/9/2004
"Editor impresses", 5/4/2004
"Ritual verbal enthusiasm for food", 5/11/2004
"Modification as social anxiety", 5/16/2004
"More winetalk imports into coffee lingo", 5/24/2004
"Grand Cru smackdown", 6/2/2004
"More on winetalk culture", 6/2/2004
"Apologia pro risu suo", 6/2/2004
"What do wine tasting notes communicate?", 6/5/2004
And again, for a bit, last year:
But it didn't occur to me that an online compendium of wine-tasting notes with prices and ratings would be a lovely subject for an after-hours Breakfast Experiment™.