Finch linguistics


Andy Coughlan, "First evidence that birds tweet using grammar", New Scientist 6/26/2011:

They may not have verbs, nouns or past participles, but birds challenge the notion that humans alone have evolved grammatical rules.

Bengal finches have their own versions of such rules – known as syntax – says Kentaro Abe of Kyoto University, Japan. "Songbirds have a spontaneous ability to process syntactic structures in their songs," he says.

To show a sense of syntax in the animals, Abe's team played jumbled "ungrammatical" remixes of finch songs to the birds and measured the response calls.

The basic article is Kentaro Abe & Dai Watanabe, "Songbirds possess the spontaneous ability to discriminate syntactic rules", Nature Neuroscience 6/26/2011. And like the coverage in New Scientist, it's both true and misleading.

First, the idea that there's a kind of grammar in animal behavioral sequences is an old and well established one. This is trivially true if we take "grammar" in the mathematical sense of restrictions on the free monoid Σ* of sequences of elements from the set Σ. We might start, for example, with the elegant series of experiments and explanations begun by G.M. Hughes in "The Co-Ordination of Insect Movements: I. The Walking Movements of Insects", J. Exper. Biol. 1952, where Σ is the set of six possible limb protractions:

The rhythm of walking movements in Periplaneta, Blatta, Dytiscus, Hydrophilus, Carabus, Blaps and Chrysomela obeys two rules: (i) no foreleg or middle leg is protracted until the leg behind has taken up its supporting position; (ii) each leg alternates with the contralateral one of the same segment. […]

Several gaits have been observed, the most common order of protraction being R1, L2, R3, L1, R2, L3, R1, etc., but these grade into one another if the ratio p/r is altered and the two rules obeyed. […] At very slow speeds the rhythm R3, R2, R1, L3, L2, L1, R3, etc., may be present.
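To make the "restrictions on Σ*" idea concrete, here is a minimal Python sketch that treats Hughes's rule (ii) as a filter on sequences of limb protractions. It is my own illustration, not anything from Hughes's paper: the function name is hypothetical, and rule (i) is left out because it depends on timing (when a leg has reached its supporting position), which a bare sequence of symbols does not record.

```python
def obeys_alternation(sequence):
    """Check Hughes's rule (ii) only: within each body segment, the right
    and left legs must strictly alternate in their protractions.

    `sequence` is a list of leg labels such as "R1" or "L3"
    (side letter followed by segment number)."""
    for segment in "123":
        # Keep only the protractions of the two legs of this segment.
        pair = [leg for leg in sequence if leg[1] == segment]
        # Two consecutive protractions of the same leg break the rule.
        if any(a == b for a, b in zip(pair, pair[1:])):
            return False
    return True


# The two orders quoted from Hughes, repeated for a few cycles.
common_order = ["R1", "L2", "R3", "L1", "R2", "L3"] * 3
slow_order = ["R3", "R2", "R1", "L3", "L2", "L1"] * 3
# An invented sequence in which R1 protracts twice before L1 does.
invented_violation = ["R1", "R1", "L2", "R3", "L1", "R2", "L3"]

print(obeys_alternation(common_order))
print(obeys_alternation(slow_order))
print(obeys_alternation(invented_violation))
```

Both quoted orders pass the check and the invented one fails, which is all that "grammar" means in this minimal sense: a rule that sorts the sequences in Σ* into those that occur and those that don't.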

60 years later, we know quite a bit about the neurophysiology, physics, and evolution of gait syntax — for an interesting model of the neurophysiology, see e.g. R.M. Ghigliazza and P. Holmes, "A Minimal Model of a Central Pattern Generator and Motoneurons for Insect Locomotion", SIAM J. Applied Dynamical Systems 2004.

Similar motor-control "grammars" have been studied for other species and functions, for instance in the case of rodent grooming. See John C. Fentress, "Emergence of pattern in the development of mammalian movement sequences", Journal of Neurobiology, 23(10): 1529-1556, 1992; or H.C. Cromwell and K.C. Berridge, "Implementation of Action Sequences by a Neostriatal Site: A Lesion Mapping Study of Grooming Syntax", J. of Neuroscience 16(10) 1996:

The neostriatum and its connections control the sequential organization of action (“action syntax”) as well as simpler aspects of movement. This study focused on sequential organization of rodent grooming. Grooming syntax provides an opportunity to study how neural systems coordinate natural patterns of serial order. The most stereotyped of these grooming patterns, a “syntactic chain,” has a particularly stereotyped order that recurs thousands of times more often than could occur by chance. […] Our results identified a single site within the anterior dorsolateral neostriatum, slightly more than a cubic millimeter in size (1.3 × 1.0 × 1.0 mm), as crucial to grooming syntax. Damage to this site did not disrupt the ability to emit grooming actions. By contrast, damage to sites in the ventral pallidum and globus pallidus impaired grooming actions but left the sequential organization of grooming syntax intact. Neural circuits within this crucial “action syntax site” seem to implement sequential patterns of behavior as a specific function.

Or A.V. Kalueff and P. Tuohimaa, "Grooming analysis algorithm for neurobehavioural stress research", Brain Research Protocols, 13(3): 151-158, 2004, which used the relative frequency of  "ungrammatical" mouse grooming sequences as an index of stress.

The case of birdsong is different in ways that make it seem somewhat more like human language: the function of song patterns is social (mating, territoriality, etc.) rather than physical (locomotion, grooming, etc.); as a result, pattern perception is just as important as pattern production; in some species of birds, the song patterns are quite varied and complex, at least compared to gaits and grooming sequences; and in some species, the song patterns are learned rather than innate.

But the idea that birdsong sequences have grammars (in the sense of restrictions on the free monoid of birdsong "syllables"), and that these grammars matter to singers and to their audiences, is not news. Thus Evan Balaban, "Bird Song Syntax: Learned Intraspecific Variation is Meaningful", PNAS 85(10) 1988:

Song syntax, defined as orderly temporal arrangements of acoustic units within a bird song, is a conspicuous feature of the songs of many species of passerine birds. While syntactical features play a role in interspecific song recognition by males of many bird species, syntax variation within species and female responsiveness to song syntax have received little attention. This report demonstrates that differences in naturally occurring learned song syntax within a species whose syntax varies geographically are behaviorally salient to both male and female birds. The salience of culturally transmitted intraspecific differences in song syntax has implications for the process of conspecific song perception and may be involved in the regulation of genetic exchange between large populations of swamp sparrows (Melospiza georgiana).

(I'm not sure who first documented the "orderly temporal arrangements of acoustic units within a bird song" in the scientific literature, but I'm pretty sure that this goes back at least to the 1950s. Handbooks for how to recognize birds by their songs must have been published much earlier. And the conscious common-sense recognition of these "orderly temporal arrangements" must be as old as our species, if not older.)

The idea of birdsong syntax is not news even in the specific case of Bengal finches — thus  Kazuo Okanoya, "The Bengalese Finch: A Window on the Behavioral Neurobiology of Birdsong Syntax", Annals of the New York Academy of Sciences, Volume 1016 (Behavioral Neurobiology of Birdsong) pp. 724-735, June 2004:

Bengalese finches have been domesticated in Japan for 240 years. Comparing their song syntax with that of their wild ancestors, we found that the domesticated strain has highly complex, conspicuous songs with finite-state syntax, while the wild ancestor sang very stereotyped linear songs.

So anyhow, there must be something new in the Kentaro Abe & Dai Watanabe Nature Neuroscience paper, but it's surely not the idea that some non-human species exhibit grammatical patterning in their behavioral sequences, nor the idea that some non-human species are differentially responsive to "grammatical" vs. "ungrammatical" patterns.

But I'm not going to tell you (yet) what the new contribution of the Abe & Watanabe paper is. You can read the paper and work it out for yourself, or wait and read my explanation in a later post. Instead, today, I'm going to give a bit more background about birdsong.

Let's start with Finch phonology. One of the key things about (the relevant kinds of) birdsong is that songs are made up of sequences of "syllables", where for a given bird, the syllables are drawn from a finite set of more-or-less well-separated kinds of tweets.

These syllables themselves are learned, in the sense that different birds of the same species may have different repertoires of syllables. Ofer Tchernichovski has done some lovely work on this process, as discussed in "Emergence of birdsong phonology", 10/11/2003. Here's an animation showing the emergence of discrete syllable types in the development of one specific zebra finch:

As explained in N.A. Hessler and A.J. Doupe, "Singing-Related Neural Activity in a Dorsal Forebrain–Basal Ganglia Circuit of Adult Zebra Finches", J. Neuroscience 1999:

In zebra finches, undirected song contains a variable number of motifs, a stereotyped series of approximately 3–10 discrete vocal elements (syllables).

Their illustration:

A spectrogram (plot of frequency vs time, with loudness indicated by the darkness of the signal) of a zebra finch song shows the characteristic features of song. Stereotyped sequences of syllables (lower case letters), called “motifs” in zebra finches (indicated by dark bars), may be sung from one to several times in succession, preceded by a variable number of short introductory elements (i). Amplitude oscillogram of song is plotted below spectrogram.

Thus there are two levels of finch syntax — the sequencing of syllables in a motif, and the sequencing of motifs in a song bout. In the zebra finch, both levels are relatively stereotyped, though not completely rigid, and we can ask what sort of grammar (or what sort of neural architecture) this behavioral syntax implies.

Interestingly, though the syntax of zebra-finch song bouts is extremely simple — to a first approximation, it's just a variable number of motif repetitions —  it's trivial to prove that a finite-state (markovian) grammar is not adequate to describe it.

After producing a motif, our hypothetical markov process will be in a state where it must decide whether to stop or to produce another motif. By the markovian assumption ("history is bunk"),  the probability of stopping will always be the same, regardless of how many motifs have been produced in the current bout.

Let $P_{\text{stop}}$ denote this probability of stopping after a given motif. Then the probability of producing a sequence of length 1 will be $P_{\text{stop}}$, the probability of producing a sequence of length 2 will be $(1-P_{\text{stop}})P_{\text{stop}}$, and the probability of a sequence of length N will be $(1-P_{\text{stop}})^{N-1}P_{\text{stop}}$. This implies that the modal number of motif repetitions will always be 1, and that the relative frequency of higher numbers of repetitions will fall off exponentially, at a rate determined by $P_{\text{stop}}$.
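For readers who like to see the numbers, here is a small Python sketch of the Markovian stopping rule just described. The stopping probability of 0.3 is an arbitrary value chosen for illustration, not an estimate from any bird, and the function name is my own.

```python
def repetition_pmf(p_stop, max_n):
    """Probability of a bout of exactly N motifs, for N = 1..max_n, when the
    chance of stopping after each motif is the constant p_stop."""
    return [(1 - p_stop) ** (n - 1) * p_stop for n in range(1, max_n + 1)]


# Illustrative value only; any 0 < p_stop < 1 gives the same qualitative shape.
pmf = repetition_pmf(0.3, 8)
for n, p in enumerate(pmf, start=1):
    print(f"N={n}: {p:.4f}")
```

Whatever value of the stopping probability you plug in, the first entry is the largest and each later entry is the previous one multiplied by the same constant factor, one minus the stopping probability. That is the geometric signature the real data would have to show if this model were right.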

But zebra finches don't sing that way. Courtesy of Ofer Tchernichovski and Dina Lipkind, here's some data from our old friend zebra finch 109:

As you can see, the modal number of motif repetitions is actually two, not one, and an exponentially decaying distribution of repetition counts fits rather badly. It doesn't work much better to assume that there is a higher probability of continuing after the first motif, and then a constant probability of continuing after subsequent repetitions. If we extrapolate from the probability of continuing to a third repetition, we continue to over-predict longer sequences:

Of course, we could handle such data with a higher-order markov process, with a sequence of N motif repetitions stipulating appropriately different probabilities of continuing after each step. But what actually seems to be going on is that the probability of continuing itself is decaying exponentially. Here's a plot that shows the empirical probability of continuing after N motif repetitions, compared to an exponential model:
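Here is a rough Python sketch of that alternative, in which the probability of continuing after N motifs is assumed to take the form a·exp(−b·N). The functional form is my reading of "decaying exponentially", and the parameter values a = 0.95 and b = 0.3 are invented for illustration; they are not fitted to zebra finch 109 or to any other bird.

```python
import math


def continue_prob(n, a=0.95, b=0.3):
    """Assumed probability of singing another motif after n motifs.
    The form a * exp(-b * n) and the default values are illustrative only."""
    return a * math.exp(-b * n)


def repetition_pmf_decaying(max_n, a=0.95, b=0.3):
    """Probability of a bout of exactly N motifs, N = 1..max_n, when the
    chance of continuing shrinks with each motif already sung."""
    pmf = []
    reached = 1.0  # probability that the bout has reached n motifs
    for n in range(1, max_n + 1):
        c = continue_prob(n, a, b)
        pmf.append(reached * (1.0 - c))  # stop exactly here
        reached *= c                     # or carry on to motif n + 1
    return pmf


decaying_pmf = repetition_pmf_decaying(30)
modal_repetitions = decaying_pmf.index(max(decaying_pmf)) + 1
print("modal number of repetitions:", modal_repetitions)
```

With these made-up parameters the mode comes out at two repetitions rather than one, and long bouts die away faster than a constant stopping probability would predict. The only "memory" required is the count of motifs already sung, which is exactly what the memoryless model refuses to keep.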

Here are similar plots from another bird:

And the probability-of-continuing graph for a third bird:

It's easy to imagine a simple neural architecture that would generate patterns like these — for instance, we might have a circuit generating the sequence of syllables in a motif, driven by an exponentially-decaying input. But neural-circuit simplicity and grammatical-hierarchy simplicity are not the same thing. Our hypothetical exponentially-decaying input driver represents a simple "memory" that is beyond the capacity of a stochastic finite-state automaton.

We'll return to this point when we pick up the question of what was really new in the Abe & Watanabe paper. Meanwhile, in the unlikely event that you haven't already had too much birdsong blogging, here's more:

"Emergence of birdsong phonology", 10/11/2003; "Birdsong and speech: together in the genome?",  4/7/2004; "Watch out for those Wallonian finches", 5/22/2007; "Dialect variation in the terminal flourishes of Flemish chaffinches", 5/27/2007; "Finches again", 6/9/2007; "Finch phrase structure?", 10/1/2007; "Creole birdsong?", 5/9/2008; "A multi-generational bioprogram? Derek Bickerton objects", 5/10/2008; "What's in a generation or two?", 5/12/2008; "Musical protolanguage: Darwin's theory of language evolution revisited", 2/12/2009; "Bickerton on Fitch", 2/15/2009.

And some posts on (non-)tests of grammatical complexity:

"Language in humans and monkeys", 1/16/2004; "Hi Lo Hi Lo, it's off to formal language theory we go", 1/17/2004; "Cotton-top tamarins: on the road to phonology as well as syntax?", 2/9/2004; "Humans context-free, monkeys finite-state? Apparently not.", 8/31/2004; "Rhyme schemes, texture discrimination and monkey syntax", 2/9/2006; "Learnable and unlearnable patterns — of what?", 2/25/2006; "Starlings", 4/27/2006; "Separating species with bullets", 4/28/2006.


  1. Dan T. said,

    July 13, 2011 @ 9:34 am

    And there are already tweets going around like:

    New Scientist: "Birds tweet using grammar." If they can do it, what on earth's stopping all of you?

  2. Jerry Friedman said,

    July 13, 2011 @ 10:33 am

    Not to get petulant, but…

    How can a publication called the New Scientist not give the scientific name of a species that's the subject of an article? How can they not mention that the taxon in question, Lonchura striata var. domestica or just L. domestica, is not found in the wild and is of uncertain and possibly hybrid origin—especially since they compare the behavior of birds in the lab to those in the wild?

    On a more linguistic subject, Abe and Watanabe use the English name Bengalese Finch, which is considerably more popular than Bengal finch. (Its other name, used in America, is society finch.) None of the few Google Books hits on Bengal finch were to scientific studies of its song. I find it odd that the New Scientist shortened Bengalese to Bengal.

    Also almost linguistically, finch has been used for birds of several families that are not each other's closest relatives. I suppose it's naive of me to wish that the New Scientist wouldn't just say finch as if these birds were in the same family as the goldfinches, house finch, etc. that are familiar to many English speakers in Britain and North America. (Society finches and zebra finches are in the family Estrildidae, and goldfinches and the like are in the family Fringillidae.)

  3. NW said,

    July 13, 2011 @ 10:54 am

    New Scientist thinks there are places called lake Victoria and mount Everest, and that kelvin has a zero plural. It's the one publication that drives me to a pitch of prescriptivist peeving.

  4. Mark Mandel said,

    July 13, 2011 @ 1:07 pm

    Your complaint is in line with the SI's usage conventions for the kelvin — 'Because it is an absolute unit, the plural "kelvins" should be used for any quantity of temperature other than 1 kelvin (e.g. water freezes at 273.15 kelvins)' — but I lean here toward colloquial flexibility when no ambiguity results.

    How hot is it today? Well, here in Philadelphia it's 85 (Fahrenheit), or 29 Celsius. We don't usually bother with "degrees" if we're naming the scale, and I've always understood expressions like "232 Kelvin" as "232 on the Kelvin scale", which of course is a scale of temperature and therefore is commonly (though not strictly) expressed as degrees Kelvin. "kelvin" as the name of a unit of measurement, rather than the name of a scale, is the exception, and I don't see any ambiguity resulting from "232 Kelvin", or even "232 kelvin" as a temperature. The one place I'd find it seriously objectionable is when used as a measure of difference of temperature:

    The addition of 5 calories ("of heat" understood) will raise the temperature of 5 grams of water by 1 kelvin, or 1 gram of water by 5 kelvin(*s).

  5. Mark Mandel said,

    July 13, 2011 @ 1:09 pm

    The title of this column almost had me thinking "Parlez-vous finçais?" (or maybe that should be "fringillidais").

  6. Jerry Friedman said,

    July 13, 2011 @ 2:33 pm

    An early identification guide describing bird songs is John B. Grant, Our Common Birds and How to Know Them (1891), though I imagine it's not the first. As you can see on that page, he gets his descriptions from earlier writers. Whether that counts as documenting the "orderly temporal arrangements of acoustic units within a bird song" might depend on your definitions. If you're talking about describing variation (for species in which individuals give varied songs) by anything more than giving a few variations, I'm sure that's later.

    I've managed to interest only one birder in the fact that vowels in the transcriptions of whistled bird vocalizations seem to be determined by the second formant. For instance, tweet means a note that rises in pitch and tew descends. There are great differences among individual transcribers who speak English, as well as differences among speakers of different languages. Have linguists studied this? Are imitative animal names found in all languages? And how is it possible that the Spanish equivalent for bang is zas?

    @Mark Mandel: I'd have said "raise the temperature of water by 5 kelvins" is the kind of situation where the plural should be the most acceptable, not the least.

    Parlez-vous pinsonnais? Sprechen sie Finkisch?

  7. Brett said,

    July 13, 2011 @ 2:40 pm

    @NW, Mark Mandel: I find "kelvins" to sound completely wrong in any context, whatever CGPM says, and I think many other physicists would also. (I don't know how it's used in other sciences, however.) Continued use of "degrees Kelvin" is common, at least in speech, and I still consider "kelvin" to be just an abbreviation for this.

    Wikipedia suggests that the use of "kelvins" serves to emphasize the fact that the unit is absolute and can be manipulated algebraically. Such manipulation is possible, but the situation is not so simple as described in the article. It is well known to physicists (and it's important in most modern low-temperature physics) that temperature is NOT simply a measure of the "amount of mean energy available among elementary degrees of freedom of the system," and I suspect this affects our perception of the usefulness of a nonzero plural.

  8. Mark Mandel said,

    July 13, 2011 @ 5:27 pm

    @Jerry Friedman: Whoops, sorry! I meant

    The addition of 5 calories ("of heat" understood) will raise the temperature of 5 grams of water by 1 kelvin, or 1 gram of water by 5 kelvin*(s).

    Thanks for catching that.

    @Brett: I also find "kelvins" weird. I was referring to the standard, not *de*ferring to it.

    In the use for temperature difference, as Jerry pointed out, I wasn't careful in typing and fell on my asterisk. I still prefer "degrees Kelvin" to "kelvins", but I was trying, unsuccessfully, to say that if you're not going to use the word "degree" there you'd better be sure to pluralize "kelvin" lest you perpetuate the already excessive confusion between temperature and temperature difference, as in lines like "Yesterday it was 40 [°F], and today it's doubled to 80!" [That's supposed to be in strikethrough style; I hope it comes through as such.]

  9. Steve Kass said,

    July 13, 2011 @ 7:54 pm

    Here are a mathematician’s first observations on Abe and Watanabe:

    For argument’s sake, suppose these finches could not in fact learn “context-free” (*, see below) rules. Suppose they could not in fact generalize rules about multiply-embedded words from a training set of singly- and non-embedded ones.

    Suppose that they could, however, learn simple non-context-free languages by mastering finite-state transition rules or predictive probabilities (as I’m led to believe previous experiments suggest).

    Such finches might still show the sort of call-response shift indicated in figure 3, leading researchers to overreact. Why? Because ungrammatical words like AES and AES2 are not only ungrammatical in the context-free center-embedding language, they are also ungrammatical in the non-context-free language (again, see * below) of at-most-depth-one-center-embedded words.

    For both AES and AES2, the center length-5 subword on its own is ungrammatical. If the birds learned several non-context-free rules like “the sequence P-Q-* must be followed by the sequence q-p,” (“*” being like the paper’s “C”), as opposed to a rule about center embedding, they would still recognize as ungrammatical both AES and AES2.

    Among the test words, there were presumably some that violated center embedding only at the outermost level of nesting. Did the authors look at birds’ responses to these separately, or might they have considered an alternate experimental design that used a suite of novel test words that was sufficiently biased towards violations at the outer nesting levels to give more confidence in a conclusion of non-context-free grammar learning?

    It’s not entirely clear to me how deeply “center-embedded” the authors’ test strings were. Figure 3 and the Online Methods section illustrate depths 2 and 3 only, though those might be simplifications for the reader. Is the actual suite of test words available somewhere?

    If the center-embeddings in the suite of words used were deep (compared to just 2 or 3 as the maximum nesting level as in the description and figures), and/or if the novel words used for Figure 3 categories AES and AES2 contained embeddings that were much (i.e., more than 1 nesting-level) deeper than the words used for habituation, and if some attention in analysis or design were given to the fact that some ungrammatical words are ungrammatical at depths in the habituation/learning set and others only at deeper levels, then a claim of context-free language learning is more supportable. Nonetheless, this is fascinating stuff!

    From a mathematical point of view, putting a given upper bound on the word length makes any context-free language non-context-free. A bounded-length subset of a truly context-free language can always be described with a finite-state machine, though there might be a large number of rules. The smaller the bound, the simpler the describing machine. While "all center-embedded words” is context-free, “all center-embedded words with embedding depths at most three” is not. I would only be guessing, but I suppose linguists and neuroscientists have some very interesting insights and questions about what human brains are doing when they appear to master context-free rules in their language. (The fence the man the dog the cat scratched bit jumped collapsed, for example.)

    Incidentally, is Figure 2d of the paper mislabeled? I don’t think it should say “Shuffled.”

    Also is the fact that for the first experiment, increases in call counts are considered important, but in the others, decreases are considered noteworthy? And is it statistically valid to report paired t-test results for Original-vs-X for several values of X without some sort of multiple-tests correction (or was there such a correction)? The error bars of the significant results and the non-significant results overlap in some cases…

  10. Ø said,

    July 13, 2011 @ 7:58 pm

    "Kelvins" sounds wrong to me, period.

    What about decibels?

  11. Ran Ari-Gur said,

    July 14, 2011 @ 8:45 pm

    @Steve Kass: I find your terminology very confusing. To me "non-context-free" means "not context-free", i.e. "context-sensitive", but you seem to be using it to mean "regular"? To a mathematician, regular languages are actually a subset of context-free languages. If the strings of a language have a bounded length, then there are finitely many strings, and the language is therefore trivially context-free: you can just have one production rule per string in the language. (More generally: if a finite-state machine can recognize it, then it's guaranteed to be regular and context-free, since you can just have one non-terminal per state and one production rule per distinct state transition.)

  12. Steve Kass said,

    July 15, 2011 @ 12:09 am


    Thanks for the clarification. You’re right.

    For what it’s worth, my goof comes from the (sloppy) use of “context-free” in symbolic dynamics to mean context-free but not regular — “strictly context-free” one should at least say. I carelessly used “not” to mean non-strictly-context-free, but in a specific “direction” towards less complexity. (Attempting a joking excuse, my restricted use of “not” might have been understandable in a setting with more context…)

    The authors wrote: “Humans are supposed to differ from other animals in their ability to handle complex grammars such as context-free grammar, which involves the embedding of phrases into other phrases.”

    What they consider novel seems to be the possibility that the birds they studied learned to recognize a particular rule (which they call center-embedding) that is a characteristic of a context-free but non-regular language (but such a language is only non-regular if it is infinite and embeddings of arbitrary length are permitted). It’s confusing (for me, at least, since in mathematics we aren’t bound by finiteness) to figure out how best to identify and distinguish two finite languages, both regular, but one of which is arguably more complex because it’s naturally a length-limited subset of a non-regular language.

    My point, which I hope was not lost, was this: It’s not clear the experiment had the ability to distinguish between the birds learning only regular-language-type rules that led them to appear to have learned something more complex, or their learning a more complex rule of the type that could define a non-regular context-free language.

    So yes, where I said non-context-free, I meant regular, and thanks for taking the time to clear this up.

  13. nicholas said,

    July 15, 2011 @ 8:18 am

    @ Jerry Friedman

    Chapter XII in William Gardiner (1832) considers birds as source for composers and he presents an assortment in notation on page 67 (but no finches). In ch XII he writes:
    "Thus, 'The London bird-catchers prefer the song of the Kentish gold-finches and the Essex chaffinches, and the Surrey nightingales, to those of Middlesex.' These varieties may be compared to the dialects of different provinces."
    And then "If these little prisoners could add words to their song, how would they bemoan their loss of liberty!"

    The Music of Nature
    Or, an Attempt to Prove that What is Passionate and Pleasing in the Art of Singing, Speaking and Performing upon Musical Instruments, is Derived from the Sounds of the Animated World. With curious and interesting illustrations. William Gardiner. Boston 1832

    The book is on Google:

  14. Jerry Friedman said,

    July 15, 2011 @ 10:46 am

    @nicholas: Thanks, that's earlier than examples of musically notated bird song I'd found. By the way, the canary, whose song Gardiner gives, is a finch (in the original sense).

    I don't have a good sense of pitch, but I strongly suspect that hearing bird songs as consisting of notes of the Western scale is similar to hearing human utterances (or bird songs) as phonemes of one's native language.

    ObNoWordForThat: Doesn't Russian have a vocabulary to describe nightingale sounds? (I thought there were examples in Turgenev's Sportsman's Sketches, but I can't find it at Google Books.) The lack of such words in English probably means that the English don't listen to nightingales, or something.

  15. Ran Ari-Gur said,

    July 17, 2011 @ 9:12 am

    @Steve Kass: Thanks for clarifying! Yes, that makes perfect sense now. (To be honest, it's really hard to talk about these distinctions while using the terms completely correctly — I made a very similar error when I wrote "'not context-free', i.e. 'context-sensitive'", which is a total brain fart on my part, because of course that is not what "context-sensitive" means. In reality all context-free languages are "context-sensitive", and plenty of languages are neither context-free nor "context-sensitive". Just, somehow, your use of "non-context-free" confused me. I guess my instinctive use of the terms still takes "context-free" and "context-sensitive" as opposites, since that's the distinction that's most useful to me, but "regular" as a subset of "context-free"; it makes sense that your instinctive use would also take "regular" and "context-free" as opposites, if that is the distinction that is most useful to you.)
