[Attention conservation notice: this post wanders a bit too far into the psycholinguistic weeds for some readers, who may prefer to turn directly to our comics pages.]
In a recent paper, Ansgar D. Endress and Marc D. Hauser document a puzzling result: Harvard undergraduates fail to recognize the regularities in "three-word sequences conforming to patterns readily learned even by honeybees, rats, and sleeping human neonates" ("Syntax-induced pattern deafness", PNAS, published online 11/17/2009).
Randy Gallistel is famous for his demonstration that rats sometimes seem smarter than Yale psychology students, but if worker bees and sleeping newborns really out-test Harvard undergrads, that would be a new low for Ivy League intellect. In this case, however, it's not really true. The insects, rodents and infants would surely also fail in the form of the task inflicted on the Harvard students, who in turn would surely succeed if tested in the same way as the other animals cited.
Despite this disappointing comeback for eastern elitism, the experimental results are nevertheless interesting, although Endress and Hauser offer an explanation that doesn't seem to me to go far enough. But first, let's rescue the honor of zip code 02138 by looking briefly at the perceptual successes of those bees, rats, and babies.
According to M Giurfa et al. ("The concepts of ‘sameness’ and ‘difference’ in an insect", Nature 410:930–933, 2001), honeybees can
… learn to solve 'delayed matching-to-sample' tasks, in which they are required to respond to a matching stimulus, and 'delayed non-matching-to-sample' tasks, in which they are required to respond to a different stimulus; they can also transfer the learned rules to new stimuli of the same or a different sensory modality. Thus, not only can bees learn specific objects and their physical parameters, but they can also master abstract inter-relationships, such as sameness and difference.
The set-up was a simple one:
Training was carried out using a Y-maze placed close to a laboratory window. Each bee entered the maze by flying through a hole in the middle of an entrance wall. At the entrance, the bee encountered the sample stimulus. The sample was one of two different stimuli, A or B, alternated in a pseudo-random sequence. The entrance led to a decision chamber, where the bee could choose one of two arms. Each arm carried either stimulus A or stimulus B as secondary stimulus. The bee was rewarded with sucrose solution only if it chose the stimulus that was identical to the sample.
After 60 training trials where the stimuli varied either in color or in (horizontal vs. vertical grating) pattern, the bees got to about 70% correct:
Bees who reached criterion on same-or-different color choice were able to generalize well to a test on same-or-different patterns, without further training; and similarly for bees trained on patterns and tested for generalization to colors. Although Giurfa et al. didn't check, I suspect that Harvard undergraduates could have done at least as well, as long as the apparatus was re-sized to fit them and they weren't required to fly in through the laboratory window.
Murphy et al. ("Rule learning by rats", Science 319:1849–1851, 2008) found that
Rattus norvegicus can learn simple rules and apply them to new situations. Rats learned that sequences of stimuli consistent with a rule (such as XYX) were different from other sequences (such as XXY or YXX). When novel stimuli were used to construct sequences that did or did not obey the previously learned rule, rats transferred their learning.
Their experiments worked as follows. Rats were trained with a pattern (either XYX, XXY, or XYY) made up of simple visual or auditory stimulus elements. For example, two tone bursts of different frequencies (e.g. A=3.2 kHz, B=9 kHz) would correspond to the XYX pattern either as ABA or BAB. If trained with XYX, therefore, the rats would get fed after hearing ABA or BAB, but not after hearing BBA, AAB, BAA, or ABB.
After acquisition, we presented them with transfer stimuli composed of two novel pure tones (C = 12.5 kHz and D = 17.5 kHz). The stimuli were counterbalanced so that the stimuli in the roles of A, B and C, D were reversed for half of the animals and were chosen to ensure that no common frequency relation was present between the pairs. If rats had simply learned something specific about the reinforced elements ABA, they should have been unable to choose CDC and DCD over CCD, DDC, CDD, and DCC. The amount of time that the rats kept their heads in the food trough during the final element of the sequence was used as a measure of learning. The results of the transfer test are presented in Fig. 1, excluding two rats that failed to learn the initial discrimination. More anticipatory behavior for food was exhibited during sequences that were consistent with the previously learned rule, even though the rats had never been presented with these particular instances and there was no food presented during the test.
The article doesn't specify how many training trials were required to achieve this level of Pavlovian conditioning, but whatever the learning curve, I expect that Harvard undergrads could do at least as well, though some other rewards would probably need to be substituted for rat chow.
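The rule the rats were rewarded for can be written down in a few lines. This is only a sketch of the abstract pattern itself, not of anything in Murphy et al.'s procedure; the function names `abstract_pattern` and `follows_rule` are made up for illustration, and tones are represented by the letters used above.

```python
def abstract_pattern(sequence):
    """Relabel elements by order of first appearance, e.g. 'BAB' -> 'XYX'."""
    labels = "XYZ"  # enough labels for a three-element sequence
    seen = {}
    relabeled = []
    for element in sequence:
        if element not in seen:
            seen[element] = labels[len(seen)]
        relabeled.append(seen[element])
    return "".join(relabeled)


def follows_rule(sequence, rule="XYX"):
    """True if the sequence instantiates the rule, whatever its elements are."""
    return abstract_pattern(sequence) == rule


# Training sequences built from tones A and B, transfer sequences from C and D.
training = ["ABA", "BAB", "BBA", "AAB", "BAA", "ABB"]
transfer = ["CDC", "DCD", "CCD", "DDC", "CDD", "DCC"]

reinforced = [s for s in training if follows_rule(s)]
transferred = [s for s in transfer if follows_rule(s)]
print(reinforced, transferred)
```

Because the check depends only on which positions match, it picks out CDC and DCD without ever having seen C or D, which is the sense of "transfer" at issue.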
As for those newborn babies, Gervain et al. ("The neonate brain detects speech structure", PNAS 105:14222–14227, 2008) found that they didn't even require training. From the abstract:
[W]e investigated the ability of newborns to learn simple repetition-based structures in two optical brain-imaging experiments. In the first experiment, 22 neonates listened to syllable sequences containing immediate repetitions (ABB; e.g., “mubaba,” “penana”), intermixed with random control sequences (ABC; e.g., “mubage,” “penaku”). We found increased responses to the repetition sequences in the temporal and left frontal areas, indicating that the newborn brain differentiated the two patterns. The repetition sequences evoked greater activation than the random sequences during the first few trials, suggesting the presence of an automatic perceptual mechanism to detect repetitions. In addition, over the subsequent trials, activation increased further in response to the repetition sequences but not in response to the random sequences, indicating that recognition of the ABB pattern was enhanced by repeated exposure. In the second experiment, in which nonadjacent repetitions (ABA; e.g., “bamuba,” “napena”) were contrasted with the same random controls, no discrimination was observed. These findings suggest that newborns are sensitive to certain input configurations in the auditory domain, a perceptual ability that might facilitate later language development.
There's no reason to suppose that the pre-attentive ability to perceive adjacent repetitions is lost later in life, even in the Ivy League. (In fact, there's some experimental evidence bearing on this very question, about which more later.)
OK, now to what Endress & Hauser's subjects really did (or failed to do). The experimental paradigm was this:
[P]articipants were told that they would listen to three-word sequences (triplets) and were instructed to memorize them …. Then 40 example triplets were played, all conforming to the same repetition pattern. Half of the participants were familiarized with AAB sequences where the first two categories were identical, and half were familiarized with ABB sequences where the last two categories were identical. Following this familiarization, participants were informed that the triplets had conformed to a common structure. The participants were then presented with pairs of new triplets made of new words, one conforming to an AAB pattern and one to an ABB pattern. Participants were asked to indicate which of the two triplets was like the familiarization triplets.
When the A's and B's were members of two semantic classes — specifically animals and clothing, e.g. bear-hawk-coat or dog-swan-shirt vs. coat-skirt-swan or hat-blouse-hawk — the kids were alright. At least, they gave the correct answer 64.25% of the time. This was significantly better than chance, though perhaps not up to the standard one would hope for at Harvard.
But when the A's and B's were members of two syntactic classes — specifically nouns and verbs, e.g. camel-pliers-furnish or window-baby-scavenge vs. annoy-guitar-napkin or carry-water-brick — the correct answer was given only 53% of the time, not significantly better than chance.
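To make the task concrete, here is a rough sketch of how a triplet's repetition pattern depends on the category assigned to each word. The lookup tables are my own labels for the example words quoted above (several of the "nouns" could of course be verbed), and `repetition_pattern` is a hypothetical helper, not anything from E & H's materials.

```python
# Hand-assigned labels for the example words; an assumption for illustration.
SEMANTIC = {
    "bear": "animal", "hawk": "animal", "dog": "animal", "swan": "animal",
    "coat": "clothing", "shirt": "clothing", "skirt": "clothing",
    "hat": "clothing", "blouse": "clothing",
}
SYNTACTIC = {
    "camel": "noun", "pliers": "noun", "window": "noun", "baby": "noun",
    "guitar": "noun", "napkin": "noun", "water": "noun", "brick": "noun",
    "furnish": "verb", "scavenge": "verb", "annoy": "verb", "carry": "verb",
}


def repetition_pattern(triplet, category_of):
    """Classify a hyphenated three-word triplet as 'AAB', 'ABB', or 'other'."""
    first, second, third = (category_of[word] for word in triplet.split("-"))
    if first == second and second != third:
        return "AAB"
    if first != second and second == third:
        return "ABB"
    return "other"


for triplet in ["bear-hawk-coat", "coat-skirt-swan"]:
    print(triplet, repetition_pattern(triplet, SEMANTIC))
for triplet in ["camel-pliers-furnish", "annoy-guitar-napkin"]:
    print(triplet, repetition_pattern(triplet, SYNTACTIC))
```

Under these labels both semantic examples come out as AAB (they differ in which category plays the A role), while the noun/verb examples split into AAB and ABB. Either way, the pattern is visible only to a listener who is tracking the relevant category.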
In a separate experiment, E & H showed that their subjects were generally able to classify the words correctly as nouns or verbs when explicitly asked to do so. And when the subjects were primed with a part-of-speech classification task, and "were informed that they would listen to triplets that conformed to an extremely simple pattern involving nouns and verbs and were instructed to find the relevant pattern", performance improved to 67.5%. This is better than performance on the semantic task, though again it would be an average of D without grade inflation.
And there was definitely a curve to grade against:
… the group performance was carried by five participants who performed at 100% correct; after removing these participants, the group performance did not differ significantly from chance [(M=56.7%, SD=13.5%), t (14)=1.92, P > 0.05]. Moreover, even when including the five successful participants, 60% of the participants reported that they had not noticed the repetition pattern—although they were explicitly informed about a pattern before starting the experiment.
I'd be inclined to take these results as more evidence (if more were needed) that most Americans today are singularly clueless about all aspects of linguistic analysis. (And I'd try to find out what school those five clueful participants went to…)
But it's not just Americans, apparently, and we can't even blame the effects entirely on the often-noticed fact that in English, any noun can be verbed (and vice versa). E & H replicated the experiment in Hungarian, where (in general) nouns are nouns and verbs are verbs and never the twain shall meet. Now the sequences become things like szamár-kules-tép or ablak-gyerek-visz vs. kever-hallgat-kendö or szeret-ellát-szamár, and the subjects were recruited from the Hungarian Academy of Science rather than the Harvard University Study Pool – and again the result was failure, at least on the group level [(M=56.8%, SD=16.5%), t (19)=1.83, P>0.05]. So I'm convinced that for most unprimed subjects, part-of-speech is not a salient feature in sequence-familiarization experiments of this type.
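For readers who want to check the arithmetic behind those "not significantly better than chance" verdicts, here is a minimal sketch of the one-sample t statistic against 50%, computed from the rounded means and standard deviations quoted above. The sample sizes (15 and 20) are inferred from the reported degrees of freedom, and the critical values are the standard two-tailed 5% values from a t table; none of this is E & H's own code.

```python
import math


def one_sample_t(mean_pct, sd_pct, n, chance_pct=50.0):
    """t statistic for a sample mean tested against chance performance."""
    standard_error = sd_pct / math.sqrt(n)
    return (mean_pct - chance_pct) / standard_error


# Primed English group without the five perfect scorers: t(14), so n = 15.
t_english = one_sample_t(56.7, 13.5, 15)
# Hungarian group: t(19), so n = 20.
t_hungarian = one_sample_t(56.8, 16.5, 20)

# Two-tailed critical values at the 5% level for 14 and 19 degrees of freedom.
CRITICAL_DF14 = 2.145
CRITICAL_DF19 = 2.093

print(round(t_english, 2), t_english > CRITICAL_DF14)
print(round(t_hungarian, 2), t_hungarian > CRITICAL_DF19)
```

The small gap between the recomputed Hungarian value and the published 1.83 is presumably due to rounding in the reported mean and standard deviation; in both cases the statistic falls short of the critical value.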
I promised to explain how Endress & Hauser don't go far enough. In my opinion, there are two failures, a general one and a specific one.
First, it seems to me that this experiment (and many others like it) is really not about syntax learning at all. Rather, these experiments are extensions, into the dimension of time, of the research pioneered by Bela Julesz on pre-attentive texture discrimination in static visual displays. See e.g. Bela Julesz, "Textons, the elements of texture perception, and their interactions", Nature 290: 91-97, 1981:
The study of pre-attentive (also called effortless or instantaneous) texture discrimination can serve as a model system with which to distinguish the role of local texture element detection from global (statistical) computation in visual perception. [...] Without using the sophisticated techniques described in this article, it is not obvious, even in the case of pre-attentive texture discrimination, whether local differences between the texture elements directly contribute to discrimination or whether these differences are sensed in a global way only through differences in the statistics of the texture.
One of the things that emerges from the earlier research that Endress & Hauser cite (and much more that they didn't) is that repetition in time is probably a sort of temporal texton for most animals. That is, adjacent elements that are identical in terms of a salient feature form a local temporal pattern that "directly contribute(s) to pre-attentive texture discrimination", rather than constituting a "[difference] sensed in a global way only through differences in the statistics of the [temporal] texture". Endress & Hauser come close to saying this, but they don't get there, and their bibliography fails to cite the texture-perception literature at all.
E & H show that (at least under the circumstances of their experiments) part-of-speech is not a feature whose repetition is tracked by human pre-attentive perception. This is interesting, but by no means a novel type of discovery. The texture-perception literature is full of contrasts among local features that "directly contribute" to texture discrimination, local features that contribute via their statistical distribution, and local features that are not accessible at all to pre-attentive texture discrimination. Here's one example of textural blindness from Julesz 1981, where local texture elements that are easily discriminated in isolation are ignored in texture perception:
One of the aims of texture-perception research has been to figure out what sorts of statistics of what sorts of local features play a role in pre-attentive texture discrimination — and the main method has been to accumulate lists of things that work and things that don't, and then to test perceptual models against those lists — for a review, see Michael Landy and Norma Graham, "Visual Perception of Texture" (in Chalupa & Werner, Eds. The Visual Neurosciences, 2004).
In the so-far-mostly-nonexistent field of temporal texture perception, and more specifically with respect to human pre-attentive texture perception in sequences of spoken words, E & H have contributed to the lists of "things that don't work (very well)", namely unprimed sequences of syntactic categories, and of "things that (sort of) work", namely sequences of semantic categories. They call this "syntax-induced pattern deafness", and they argue that the failure of their subjects to notice part-of-speech textures "give(s) credence to the proposal … that syntactic processes are just as modular and impenetrable as other perceptual processes". A more tentative way to put this would be to note that unprimed part-of-speech is not a salient feature to the perceptual system(s) responsible for pre-attentive detection of temporal textures, and that the other properties of these systems are at present mostly unknown.
And here's the second way I think that Endress and Hauser should have gone farther than they did. They should have explicitly (rather than implicitly) retracted the claims in an earlier paper by Fitch and Hauser ("Computational Constraints on Syntactic Processing in a Nonhuman Primate", Science, Vol 303, Issue 5656, 377-380, 16 January 2004), which used familiarization/discrimination experiments with patterns XYXY and XYXYXY versus XXYY and XXXYYY to argue that cotton-top tamarins could master different finite-state but not context-free grammars.
From the perspective of the paper currently under discussion, this earlier work looks rather like evidence for an asymmetry in the relative salience (to the tamarins) of gaining vs. losing local repetitions, with no implications for grammar learning at all. (See here, here, and here for further discussion. For an earlier attempt to relate such so-called "grammar learning experiments" to the literature on texture discrimination, see "Rhyme schemes, texture discrimination and monkey syntax", 2/9/2006. And for a recent contribution to the literature on animal learning of sequential acoustic patterns, see Caroline van Heijningen, Jos de Visser, Willem Zuidema, and Carel ten Cate, "Simple rules can explain discrimination of putative recursive syntactic structure by a songbird species", PNAS, published online 11/16/2009.)
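The point about local repetitions can be illustrated with a simple count of adjacent identical elements in the two families of patterns. This is just a sketch of the "temporal texture" reading suggested here, assuming alternating and blocked sequences of n pairs each; the helper names are invented, and nothing below reproduces Fitch and Hauser's stimuli or analysis.

```python
def alternating(n, x="X", y="Y"):
    """Alternating pattern of n pairs, e.g. n=2 -> 'XYXY'."""
    return (x + y) * n


def blocked(n, x="X", y="Y"):
    """Blocked pattern of n X's then n Y's, e.g. n=2 -> 'XXYY'."""
    return x * n + y * n


def adjacent_repetitions(sequence):
    """Count positions where an element is identical to the one before it."""
    return sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev == cur)


for n in (2, 3):
    for pattern in (alternating(n), blocked(n)):
        print(pattern, adjacent_repetitions(pattern))
```

The alternating sequences contain no adjacent repetitions at any length, while the blocked ones always contain some, so a listener sensitive only to that local feature could tell the two families apart without any grammar at all.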