Thanks to several commenters on our recent most-a-thon ("Most", 7/31/2010; "Most examples", 7/31/2010; "Most and Many", 8/1/2010), I've learned about an interesting literature on the semantics, pragmatics, and psycholinguistics of most, which I think is worth collecting in one place for those unexpectedly unobsessive readers who don't repeatedly scan and cross-classify the comments on this kind of Language Log posting sequence.
These publications provide a variety of (mostly perceptual) evidence for the view that most really does mean "more than half", while offering a greater variety of theories about the strategies that (different sorts of) people use to determine whether this is true in particular cases.
In the face of these results, it remains puzzling why so many people think that a proposition like most X's are P imposes extra requirements, for instance that P-ish X's are a supermajority, or that P is the default state for X's. There are two obvious stories to tell about this: perhaps it's simply an illusion, and these extra meanings are just conversational implicatures arising in the usual way in certain contexts; or perhaps such meanings, having started out that way, have become conventionalized by some speakers. The variationist view is intrinsically plausible — word meanings drift in this way all the time — but if super-mosters or default-mosters were as common as they seem to be, you'd think that the several experiments described below would have run into empirical difficulties with a substantial subset of subjects. Which doesn't seem to have happened…
Yesterday evening, Itamar pointed us to Martin Hackl, "On the grammar and processing of proportional quantifiers: most versus more than half", Natural Language Semantics 17:63-98, 2009.
Hackl argues that most basically means "the largest subgroup bigger than any other subgroup" (which thus retains its etymological semantics as the superlative of more, while having the same truth conditions as "more than half" when there are just two subgroups). From Hackl's conclusion:
In the case of quantification, fulfilling this obligation [to furnish the pieces that processing theories require to draw systematic distinctions that occur during real time comprehension] hinges on what type of semantic primitives one assumes for quantification in natural language. A compelling example is provided by the pair most and more than half, which are standardly treated as truth-conditionally equivalent quantifiers. I presented experimental evidence from real time verification studies that differentiates these two expressions in ways that seem to correspond to specific differences in their form – the former being a superlative and the latter a comparative expression of proportions. […]
Specifically, I offer a compositional analysis of MOST as the superlative of MANY, arguing that the proportional reading is in fact a special case of the superlative reading. Extending the analysis to FEWEST offers an explanation for a currently unexplained systematic gap in the paradigm of proportional quantifiers, namely that FEWEST cannot be used as a proportional quantifier. Such an analysis presupposes that the set of semantic primitives of quantification includes e.g. degree expressions, measure phrases, and comparative and superlative operators but not relations between sets as GQT would have it.
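As a rough illustration of the point that the superlative reading and "more than half" coincide when there are just two subgroups, here is a toy sketch in Python. It is my own rendering of the informal description above, not Hackl's compositional analysis; the function names and the dot displays are invented for the example.

```python
from collections import Counter

def more_than_half(items, target):
    """True if the items labelled `target` are more than half of all items."""
    counts = Counter(items)
    return counts[target] > len(items) / 2

def superlative_most(items, target):
    """True if the `target` subgroup is non-empty and strictly larger than
    every other subgroup (a toy version of the superlative reading)."""
    counts = Counter(items)
    others = [n for label, n in counts.items() if label != target]
    return counts[target] > 0 and all(counts[target] > n for n in others)

# Hypothetical displays: two subgroups versus three subgroups.
two_colors = ["blue"] * 6 + ["yellow"] * 4
three_colors = ["blue"] * 4 + ["yellow"] * 3 + ["red"] * 3

for dots in (two_colors, three_colors):
    print(more_than_half(dots, "blue"), superlative_most(dots, "blue"))
```

With two colors both tests agree. With three colors, blue is the largest subgroup (4 against 3 and 3) without being more than half of the ten dots, so the two definitions come apart.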
This afternoon, Rachel directed our attention to Tim Hunter, Justin Halberda, Jeffrey Lidz, & Paul Pietroski, "Beyond Truth Conditions: The Semantics of most", SALT 18, 2008. (I'm ashamed to say that I discussed this paper, at modest length, in a post more than two years ago, "Sexual pseudoscience from CNN", 6/19/2008, and then forgot about it until Rachel reminded me.) They take up Hackl's analysis and, while basically agreeing with his characterization of the meaning of most, take the argument in a different direction:
In this paper we have argued against the claim that a competent speaker’s understanding of a sentence is exhaustively characterised by a truth condition. To do so we have presented evidence of asymmetries in speakers’ willingness to use various verification procedures: in Experiment 1, an apparent bias to use algorithms approximating a cardinality comparison rather than those based on one-to-one correspondence, and in Experiment 2, an insistence on an indirect method of approximation. These asymmetries would be surprising if the only constraint on the choice of verification procedures for a sentence was the requirement that the procedure must implement the sentence’s truth condition.
Back in June of 2008, I wrote this:
There's some experimental evidence that most people mostly interpret most in a way that lends itself to an easy transition to the generic plural. One piece of the puzzle: Tim Hunter, Justin Halberda, Jeff Lidz & Paul Pietroski, "Beyond Truth Conditions: The semantics of 'most'", SALT 18. They examined people's responses to statements like "Most of the dots are yellow", for displays in which the proportions of yellow and blue dots ranged between 1:1 and 2:1. They concluded that people interpret most in terms of a comparison of cardinalities mediated by the "Approximate Number System" (ANS), as discussed in Lisa Feigenson, Stanislas Dehaene and Elizabeth Spelke, "Core systems of number", Trends in Cognitive Sciences, 8(7): 307-314, 2004. Thus "participants' success rate … decreased as the ratio of the number of yellow dots to the number of nonyellow dots approached 1, closely matching the psychophysical function independently identified for the ANS".
According to Feigenson et al., there are two core cognitive systems dealing with numbers: an approximate representation of numerical magnitude, and a precise representation of distinct individuals. Perhaps linguistic equivocation between these systems (as well as the inadequacy of either system to express even simple propositions about statistical distributions) helps to explain the general tendency to derive propositions about generic group characteristics from propositions about differences between group averages, even when these differences are small relative to within-group variation.
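To give a feel for why success rates should fall as the yellow-to-nonyellow ratio approaches 1, here is a small sketch of one commonly used Gaussian model of Approximate Number System comparisons. The model form is an assumption on my part: each numerosity n is treated as a noisy estimate with standard deviation proportional to n. The Weber fraction `W = 0.3` is purely illustrative, and nothing here is fitted to the data in the paper.

```python
import math

def ans_accuracy(n1, n2, w):
    """Predicted proportion correct when judging which of n1, n2 is larger.

    Assumption: each numerosity n is represented as a Gaussian with mean n
    and standard deviation w * n, where w is the Weber fraction.
    """
    diff = abs(n1 - n2)
    spread = w * math.sqrt(n1 ** 2 + n2 ** 2)
    return 1 - 0.5 * math.erfc(diff / (math.sqrt(2) * spread))

W = 0.3  # illustrative Weber fraction, not an estimate from any experiment

# Hypothetical displays with ratios running from 2:1 down towards 1:1.
for yellow, blue in [(20, 10), (15, 10), (12, 10), (11, 10)]:
    print(f"{yellow}:{blue} -> {ans_accuracy(yellow, blue, W):.2f}")
```

Under this model the predicted accuracy depends only on the ratio of the two numbers, not on their absolute sizes, and it drops to chance (0.5) when the two sets are equal.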
And this evening, Alexander told us about Lidz, Pietroski, Hunter & Halberda, "Interface Transparency and the Psychosemantics of 'most'", Natural Language Semantics, 2009.
Among other things, Lidz and colleagues have psychophysical data showing that people, when shown an array of blue and yellow dots for a very brief time, and asked to respond immediately, Yes or No, whether most of the dots are blue (or yellow, etc.), consistently respond as if the question is whether >50% are blue (or yellow, etc.). They do not demand a big fat majority. This sort of data is interesting because it differs both from usage data from corpora and from responses to pragmatically contextualized questions.
From the start of their abstract:
This paper proposes and defends an Interface Transparency Thesis concerning how linguistic meanings are related to the cognitive systems that are used to evaluate sentences for truth/falsity: a declarative sentence is semantically associated with a canonical procedure for determining its truth value (cf. Dummett 1973, Horty 2007); and while this procedure need not be used as a verification strategy, competent speakers are biased towards strategies that directly reflect canonical specifications of truth conditions. Evidence in favor of this hypothesis comes from a psycholinguistic experiment examining adult judgments concerning ‘Most of the dots are blue’.
This sentence is true if and only if the number of blue dots exceeds the number of nonblue dots. But this leaves many issues unsettled, e.g., how the second cardinality is specified for purposes of understanding and/or verification: via the nonblue things, given a restriction to the dots, as in ‘|Dot(x) & ~Blue(x)|’; via the blue things, given the same restriction, and subtraction from the number of dots, as in ‘|Dot(x)| – |Dot(x) & Blue(x)|’; etc. We obtained evidence in favor of the second hypothesis.
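The two ways of specifying the second cardinality in the abstract can be spelled out in a few lines of Python. This is only a sketch of the arithmetic, with invented function names and an invented display; it makes no claim about how the experiment was run or how people actually compute the answer.

```python
def count_blue(dots):
    """|Dot(x) & Blue(x)|: the number of dots that are blue."""
    return sum(1 for color in dots if color == "blue")

def nonblue_by_selection(dots):
    """|Dot(x) & ~Blue(x)|: count the nonblue dots directly."""
    return sum(1 for color in dots if color != "blue")

def nonblue_by_subtraction(dots):
    """|Dot(x)| - |Dot(x) & Blue(x)|: count all dots, then subtract the blue ones."""
    return len(dots) - count_blue(dots)

def most_are_blue(dots, count_nonblue):
    """'Most of the dots are blue': blue dots outnumber nonblue dots,
    with the nonblue count supplied by the chosen procedure."""
    return count_blue(dots) > count_nonblue(dots)

# Hypothetical display: 9 blue dots among 16 dots of three colors.
dots = ["blue"] * 9 + ["yellow"] * 4 + ["red"] * 3

print(most_are_blue(dots, nonblue_by_selection))
print(most_are_blue(dots, nonblue_by_subtraction))
```

With exact counts the two procedures always return the same truth value. They differ only in what has to be counted: the first needs the nonblue dots picked out as a set, while the second needs only the blue dots and the total.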
A bit of poking around in Google Scholar turns up a number of other interesting papers, including Halberda, Taing, and Lidz, "The Development of 'Most' Comprehension and Its Potential Dependence on Counting Ability in Preschoolers", Language Learning and Development 4(2):99-121, April 2008:
Quantifiers are a test case for an interface between psychological questions, which attempt to specify the numerical content that supports the semantics of quantifiers, and linguistic questions, which uncover the range of possible quantifier meanings allowable within the constraints of the syntax. Here we explore the development of comprehension of most in English, of particular interest as it calls on precise numerical content that, in adults, requires an understanding of large exact numerosities (e.g., 23 blue dots and 17 yellow is an instance of “most of the dots are blue”). In a sample of 100 children 2 to 5 years of age we find that (a) successful most comprehension in cases with two salient subsets is achieved at 3 years, 7 months of age, and (b) most comprehension is independent of knowledge of large exact number words; that is, knowledge of large exact number words is neither necessary, as evidenced by children who understand “most” but not “four,” nor sufficient, as evidenced by children who understand “nine” but not “most.”
Also Pietroski, Lidz, Hunter, and Halberda, "The Meaning of 'Most': Semantics, Numerosity and Psychology", Mind and Language, 24(4):554-585, 10/26/2009:
The meaning of 'most' can be described in many ways. We offer a framework for distinguishing semantic descriptions, interpreted as psychological hypotheses that go beyond claims about sentential truth conditions, and an experiment that tells against an attractive idea: 'most' is understood in terms of one-to-one correspondence. Adults evaluated 'Most of the dots are yellow', as true or false, on many trials in which yellow dots and blue dots were displayed for 200 ms. Displays manipulated the ease of using a 'one-to-one with remainder' strategy, and a strategy of using the Approximate Number System to compare (approximations of) cardinalities. Interpreting such data requires care in thinking about how meaning is related to verification. But the results suggest that 'most' is understood in terms of cardinality comparison, even when counting is impossible.
And Jon Gajewski, "Superlatives, NPIs and Most", Journal of Semantics 27(1):125-137, 2010:
The ability of English determiner most to license negative polarity items (NPIs) has long stood as a puzzle for theories that follow Ladusaw (1979) in claiming that NPIs must appear in the scope of downward entailing (DE) operators. Most licenses NPIs such as any and ever in its restrictor but is not downward, or upward, entailing with respect to its restrictor. In this paper, I argue that despite appearances to the contrary, NPIs in the restrictor of most are in the scope of a DE operator. I make crucial use of a recent proposal by Hackl (2009) to compositionally analyze determiner most as a superlative expression. When the semantics of the superlative morpheme is spelled out correctly, this derives the result that most licenses NPIs in its restrictor.