Toshitaka N. Suzuki, David Wheatcroft & Michael Griesser, "Experimental evidence for compositional syntax in bird calls", Nature Communications 2016:
Human language can express limitless meanings from a finite set of words based on combinatorial rules (i.e., compositional syntax). Although animal vocalizations may be comprised of different basic elements (notes), it remains unknown whether compositional syntax has also evolved in animals. Here we report the first experimental evidence for compositional syntax in a wild animal species, the Japanese great tit (Parus minor). Tits have over ten different notes in their vocal repertoire and use them either solely or in combination with other notes. Experiments reveal that receivers extract different meanings from ‘ABC’ (scan for danger) and ‘D’ notes (approach the caller), and a compound meaning from ‘ABC–D’ combinations. However, receivers rarely scan and approach when note ordering is artificially reversed (‘D–ABC’). Thus, compositional syntax is not unique to human language but may have evolved independently in animals as one of the basic mechanisms of information transmission.
The article is open access, so you can read it yourself.
The key idea is that the sequence of notes ABC means "look out!" (they gloss it "scan for danger") while a sequence of D notes (e.g. DDDDDDDDD) means "come here" (they gloss it "approach the caller"):
They show this in the now-standard way — ABC calls, played over a loudspeaker, generally produce a much larger number of observed "scans" in birds that hear them, compared to background noise or D calls. ABC-D sequences produce a somewhat smaller number of scans, but still more than D calls alone:
And listening tits approach the loudspeaker more often when a sequence of D notes is played, compared to background noise or an ABC sequence. The sequence ABC-D elicits a somewhat smaller percentage of approaches, but still more than ABC or BN:
So far, this is just, as we might say, one thing after another. In a combination of note sequences, each subsequence has (a weakened form of) the effect that it has in isolation.
For the authors of this paper, the key question is, does order matter? What's the effective meaning of ABC-D vs. D-ABC?
This is not logically equivalent to the question of whether these calls have a "syntax" — there might be other reasons why order matters. We might be studying bird-call pragmatics, or bird-call discourse analysis, rather than bird-call syntax. But it's certainly a relevant question.
And the result they report is that D-ABC is less effective than ABC-D at eliciting scans:
So this is suggestive — though it would be more persuasive if they also had an explanation for why ABC-D is less communicatively effective than either ABC or D alone, since whatever the explanation — distraction from responding to one of the calls due to responding to the other? — it might also explain part of the reduced response to the reversed order.
However, their explanation in terms of the birds' order-expectation makes sense. They write that
Tits produce ‘chicka’ calls when approaching and mobbing predators, and these calls contain a number of unique call types composed of different note types, mainly A, B, C and D notes. A, B and C notes are typically produced in combination with other note types, resulting in AC, BC or ABC calls. In contrast, D notes are produced as a string of seven to ten notes (hereafter referred to as a D call) and are also used in non-predatory contexts, such as when a bird visits its nest alone and is recruiting its mate. In predatory contexts, D notes are often produced in combination with other note types and typically appear at the end of note strings, such as AC–D, BC–D or ABC–D calls. Thus, D notes are both produced alone and in combination with other notes, suggesting that they modify the meaning of ABC calls to elicit appropriate mobbing responses to different predator types.
In other words, these birds are used to hearing "AC-D, BC-D or ABC-D calls", and not to hearing D-whatever calls. This difference in order-expectation is analogous to human-language syntax or morphology, but it's also analogous to other behavioral-sequence regularities.
For the authors, the crucial point is that there's a behavioral-sequence regularity, combined with "meanings" for the sequence elements, combined with a difference in communicative effectiveness for normal vs. reversed orders.
If you've been paying attention to the graphs, you may have noticed another puzzle. What their Figure 4b shows as the approach percentage for ABC-D in experiment 2 is (closer to) the approach percentage that they reported for D alone in Figure 3b:
From measurements made on pixel positions in their Figure 3b, I estimate the approach percentage for D at 62%, and for ABC-D at 48%; similar measurements on Figure 4b yield an approach percentage for ABC-D of 66%.
Thus the difference in approach percentage between ABC-D in experiment 2 (66%) and ABC-D in experiment 1 (48%) is almost as large as the difference between responses to ABC-D in experiment 1 (48%) and responses to D-ABC in experiment 2 (24%).
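The comparison above is simple arithmetic, and can be spelled out as a short sketch. The percentages are the pixel-based estimates quoted in the text, not values reported by the authors, and the variable names are just illustrative labels.

```python
# Approach percentages as estimated in the text from pixel measurements
# on the paper's figures (estimates, not numbers reported by the authors).
approach = {
    ("exp1", "D"): 62,
    ("exp1", "ABC-D"): 48,
    ("exp2", "ABC-D"): 66,
    ("exp2", "D-ABC"): 24,
}

# Same stimulus type (ABC-D), different experiments.
between_experiments = approach[("exp2", "ABC-D")] - approach[("exp1", "ABC-D")]

# Natural order in experiment 1 vs. reversed order in experiment 2.
order_effect = approach[("exp1", "ABC-D")] - approach[("exp2", "D-ABC")]

print(f"ABC-D, exp 2 minus exp 1: {between_experiments} percentage points")
print(f"ABC-D (exp 1) minus D-ABC (exp 2): {order_effect} percentage points")
```

On these estimates the between-experiment gap for the same stimulus type is 18 percentage points, against 24 points for the order contrast.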
I had to resort to this pixel-measurement because nowhere in the paper do the authors report the actual numbers. They give various measurements of statistical significance, but not even the mean values (of scan counts and approach percentages), much less a full listing of the underlying data. This omission is lamentable, in my opinion.
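For readers who want to repeat this kind of estimate, the pixel-measurement amounts to linear interpolation between two axis ticks. Here is a minimal sketch, assuming a vertical axis with a linear scale. The function name and the pixel coordinates are hypothetical, chosen only for illustration, and are not the measurements actually taken from the paper's figures.

```python
def pixel_to_value(y_px, y_px_at_min, y_px_at_max, v_min=0.0, v_max=100.0):
    """Convert a vertical pixel position to an axis value by linear interpolation.

    y_px_at_min and y_px_at_max are the pixel rows of two known axis ticks
    (here the 0% and 100% ticks). Image rows usually count downward from the
    top, so the 0% tick typically has the larger pixel coordinate.
    """
    fraction = (y_px_at_min - y_px) / (y_px_at_min - y_px_at_max)
    return v_min + fraction * (v_max - v_min)


# Hypothetical coordinates: 0% tick at row 400, 100% tick at row 100,
# and the top of a bar at row 214.
estimate = pixel_to_value(214, y_px_at_min=400, y_px_at_max=100)
print(f"estimated value: {estimate:.0f}%")
```

The precision of such an estimate is limited by the resolution of the figure and by how exactly the tick marks and bar edges can be located, which is one more reason to want the actual numbers.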
Another lamentable omission is the set of stimuli: good practice these days would be to make the acoustic stimuli (and the raw behavioral data) available as on-line supplementary material. This is relevant (for example) because there might be coarticulatory issues that produce differential artefacts in different artificial note sequences of the sort that they used. (Their stimuli, as far as I can tell from their description, were constructed by concatenating individual notes from different calls, with short silences in between. The notes used came from the calls of 17-21 different birds, but each stimulus was apparently constructed from notes drawn from the calls of a single bird (?).)
The possibility of perceptually-relevant coarticulatory effects seems to be supported by the examples of natural calls shown in T.N. Suzuki, "Communication about predator type by a bird using discrete, graded and combinatorial variation in alarm calls", Animal Behaviour 2014. So it would be good to have the complete inventory of original (natural) calls, as well as the set of stimuli used and the detailed recipe used to combine them.
All in all, this is an interesting paper, but I feel that the editors of Nature Communications have seriously failed in their responsibility to require (or allow?) adequate documentation of the work.
[h/t Sybil Shaver]
Update — I've changed the title from "Birdsong syntax" to "Bird syntax", to remedy a possible misunderstanding.
As I understand things, researchers make a distinction between "songs", which are typically territorial or mating displays generally produced by males, and "calls", which are functional vocalizations produced by both sexes.
It's been clear for a long time that bird songs are often complex sequences of well-defined smaller units, often called "motifs" and "syllables", which occur in regular but not invariant patterns. There's an interesting literature on the nature of such patterns, and their relationship (or not) to the "grammars" of human languages.
But those motifs and syllables don't have any independent meaning, as far as anyone knows, and for that matter the songs as a whole don't mean anything besides "I'm a skillful member of my species", and "check me out" or "this is my territory".
What's special about this paper is that it discusses two different calls, with different functional meanings, which are often fluently combined in a fixed order; and the experiments show that the order matters, in that the behavioral response to re-ordered calls is much weaker than the response to calls in the natural order.