The longer it is, the higher the rating:
We're talking about the length of wine reviews, measured in words, and the numerical rating given to the associated wine. (Well, actually, the length of the reviews is measured in terms of the output of a tokenizer that sets off punctuation as well as alphanumeric strings…).
The data comes from 84,158 reviews at Wine Enthusiast magazine, and the picture above shows a density plot of the relationship between review length and the numerical rating of the associated wine. The reviews range in length from four reviews with just eight lexical tokens each:
smells vegetal , and tastes unnaturally sweet .
simple and sweet . tastes like lemonade .
flat , fruity and lacking in complexity .
unripe , with feline aromas and flavors .
… to ones like this with more than 130:
if there 's any such thing as the perfect spanish red , pesus is it . a blend of 80% tempranillo with other grapes including cabernet sauvignon , this wine sees 200% new oak , resulting in a thick , dark , tannic beauty that bubbles over with toast , cola , mint , chocolate and spice aromas . the mouth is sheer heaven ; a mile deep in terms of berry flavor and more , but faultless and smooth . shows outstanding structure and power , and should age well for 15 – 20 years . hails from two 100 – year – old vineyards and some baby vines with but 25 years of age . crazy expensive but only 150 cases were made ; drink 2013 – 2030 .
The average review length is 49 lexical tokens.
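For readers who want to reproduce the length measure, here is a minimal sketch of a tokenizer that sets off punctuation from alphanumeric strings. The regular expression and the function names are assumptions made for illustration, not the tokenizer actually used for these counts (which, for example, kept "80%" and "'s" as single tokens). The four short reviews are re-joined by hand from the tokenized versions quoted above.

```python
import re

# Assumption: one token per run of letters/digits, one token per punctuation mark.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")

def tokenize(text):
    """Lowercase the text and split it into word and punctuation tokens."""
    return TOKEN_RE.findall(text.lower())

def review_length(text):
    """Length of a review in lexical tokens."""
    return len(tokenize(text))

# The four shortest reviews, detokenized by hand for this example.
short_reviews = [
    "Smells vegetal, and tastes unnaturally sweet.",
    "Simple and sweet. Tastes like lemonade.",
    "Flat, fruity and lacking in complexity.",
    "Unripe, with feline aromas and flavors.",
]

for review in short_reviews:
    print(review_length(review), " ".join(tokenize(review)))
```

Under this rule each of the four reviews comes out at eight tokens, matching the counts given above.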
The numerical ratings range from 80 to 100, with a mean of 87.4.
You can probably guess that the long review above was associated with a better rating than the earlier four short ones. And you'd be right: the long one scored 100, while the short ones scored 81, 80, 82, and 80. Much of your judgment would be based on the content ("feline aromas and flavors" is never going to beat "the mouth is sheer heaven", regardless of the rest of the context), but some of your reaction probably also comes from the simple length of the review.
In some situations, at least, a short evaluation does not seem as positive as a longer one. This is certainly the conventional wisdom about letters recommending someone for a job or for entrance to an educational program. A brief though positive recommendation ("X is excellent") seems less likely to impress readers than an equally positive letter that goes on at length about exactly how the writer knows that X is excellent.
In the case of the Wine Enthusiast reviews, we can be more precise about the relationship. The length of the review alone explains almost a quarter of the variance in ratings (Adjusted R-squared = 0.2389). The coefficients emerging from linear regression are
rating = 82.1 + 0.11*nwords
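To make the regression concrete, here is a small sketch in plain Python. The function `fit_line` shows how an intercept, slope, R-squared, and adjusted R-squared could be computed from paired lengths and ratings; it is an illustration of ordinary least squares, not the code that produced the numbers above. The function `predicted_rating` simply plugs a length into the reported coefficients. Both function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x.

    Returns (intercept, slope, r_squared, adjusted_r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1 - ss_res / syy
    # Adjusted R-squared with a single predictor: n - 2 residual degrees of freedom.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    return a, b, r2, adj_r2

def predicted_rating(nwords, intercept=82.1, slope=0.11):
    """Rating predicted from review length, using the coefficients reported above."""
    return intercept + slope * nwords

# Shortest reviews, average review, and a review of 130 tokens.
for n in (8, 49, 130):
    print(n, round(predicted_rating(n), 2))
```

At the average length of 49 tokens the fitted line gives about 87.5, close to the mean rating of 87.4, as expected for a least-squares line (the small gap comes from rounding in the coefficients). At 130 tokens it gives about 96.4, and at eight tokens about 83.0.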
This is not as good as we can do with a simple bag-of-words model of the content: predicting ratings based on the review-by-word matrix, limited to words that occur at least 20 times overall (5,516 of them), explains 73% of the variance. But still, explaining nearly a quarter of the variance with a single, simple feature is not bad at all.
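A bag-of-words model of this kind could be sketched as follows. This assumes a plain linear least-squares fit of ratings on word counts, which the description above does not confirm; the function names and the use of NumPy are likewise assumptions. The input is a list of token lists, one per review.

```python
from collections import Counter

import numpy as np

def build_matrix(token_lists, min_count=20):
    """Review-by-word count matrix over words seen at least min_count times overall."""
    totals = Counter(tok for toks in token_lists for tok in toks)
    vocab = sorted(w for w, c in totals.items() if c >= min_count)
    index = {w: j for j, w in enumerate(vocab)}
    X = np.zeros((len(token_lists), len(vocab)))
    for i, toks in enumerate(token_lists):
        for tok in toks:
            j = index.get(tok)
            if j is not None:  # skip words below the frequency threshold
                X[i, j] += 1
    return X, vocab

def fit_bag_of_words(X, ratings):
    """Least-squares fit of ratings on word counts plus an intercept.

    Returns (weights, r_squared); weights[0] is the intercept.
    """
    y = np.asarray(ratings, dtype=float)
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    weights, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ weights
    centered = y - y.mean()
    r2 = 1 - (residuals @ residuals) / (centered @ centered)
    return weights, r2
```

A dense matrix is used here only to keep the sketch short. With 84,158 reviews and 5,516 words it would need several gigabytes, so a sparse representation would be the practical choice at that scale.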
Update: Here are a few previous studies on the length of letters of recommendation, a topic that psychologists have apparently been studying for nearly 50 years. In the large recent machine-learning literature on "sentiment analysis" as applied to product reviews, I haven't been able to find a clear evaluation of the effect of review length (though length is clearly a useful feature in detecting spam reviews). Perhaps a reader can help us out here.
Albert Mehrabian, "Communication Length as an Index of Communicator Attitude", Psychological Reports 1965:
It was hypothesized that communications about liked objects are longer than communications about disliked objects. Ss wrote letters of recommendation about liked and disliked people under conditions where the topics to be covered in the letter were minimally or partially specified. The hypothesis was confirmed for conditions of both minimal and partial topic specification.
Chris Kleinke, "Perceived Approbation in Short, Medium, and Long Letters of Recommendation", Perceptual and Motor Skills 1978:
Mehrabian (1965) and Wiens, et al. (1969) found that subjects wrote longer letters of recommendation when the letter was for someone they liked rather than disliked. These results led Wiens, et al. to suggest that communication channels may provide a useful nonreactive measure of attitudes and motivations. The present research replicated the above encoding studies with a series of decoding experiments in which subjects rated short, medium-length, and long letters of recommendation written in English, German, or with deleted text. Short letters were evaluated as being least favorable towards the job applicant and long letters were evaluated as being most favorable toward the job applicant. It was concluded that attitudes attributed to length of letter are consistent with attitudes influencing length of letter. Subjects' limited awareness of the influence of length of letter on the evaluations was related to Nisbett and Wilson's (1977) argument about the weakness of introspection.
Stephen Colarelli et al., "Letters of Recommendation: An Evolutionary Psychological Perspective", Human Relations 2002:
This article develops a theoretical framework for understanding the appeal and tone of letters of recommendation using an evolutionary psychological perspective. Several hypotheses derived from this framework are developed and tested. The authors’ theoretical argument makes two major points. First, over the course of human evolution, people developed a preference for narrative information about people, and the format of letters of recommendation is compatible with that preference. Second, because recommenders are acquaintances of applicants, the tone of letters should reflect the degree to which the relationship with the applicant favors the recommender’s interests. We hypothesized that, over and above an applicant’s objective qualifications, letters of recommendation will reflect cooperative, status and mating interests of recommenders. We used 532 letters of recommendation written for 169 applicants for faculty positions to test our hypotheses. The results indicated that the strength of the cooperative relationship between recommenders and applicants influenced the favorability and length of letters. In addition, male recommenders wrote more favorable letters for female than male applicants, suggesting that male mating interests may influence letter favorability.