A multi-generational bioprogram? Derek Bickerton objects


Yesterday, I described Olga Feher's demonstration that species-typical songs emerge, over several generations, in an isolated colony of zebra finches founded by birds raised in isolation ("Creole birdsong", 5/9/2008). I compared this pattern to Derek Bickerton's "bioprogram" hypothesis, first put forward in his 1981 book Roots of Language, and discussed again in his 2008 book Bastard Tongues ("A Trail-Blazing Linguist Finds Clues to Our Common Humanity in the World's Lowliest Languages"). As the Wikipedia article on the "language bioprogram hypothesis" explains, Derek's idea is that

when the linguistic exposure of children in a community consists solely of a highly unstructured pidgin[,] these children use their innate language capacity to transform the pidgin, which characteristically has high syntactic variability, into a language with a highly structured grammar.

I also mentioned some of the subsequent debate over the bioprogram theory of creolization, quoting from an encyclopedia article by John Rickford and Barbara Grimes. Some of this debate has focused on whether the process of regularization in creole languages is complete in the first generation of native learners, or takes several generations. I observed that Bickerton's general idea ought to be consistent with a multi-generational emergence of a cognitive phenotype, where the species-typical pattern results from the accumulation of learning biases over several iterations.

However, some of Bickerton's critics have seen multi-generational creolization as evidence against his hypothesis. And to my surprise, it seems that he agrees with them. In an interesting comment on my post, he wrote:

Mark, you say that "Where social learning is involved, perhaps it's normal for the phenotype to emerge over multiple generations." And you may well be right, since social learning has nothing to do with creolization. How can you "socially learn" something for which you have no model, which didn't exist until you made it?

If I understand Derek's question ("How can you 'socially learn' something for which you have no model, which didn't exist until you made it?"), it's easy to answer him by examining the behavior of some simple learning algorithms. A maximally simple example is the emergence of shared and categorical beliefs, starting from a random and gradient distribution across individuals, in a situation where beliefs are sets of binary (or otherwise quantized) random variables. This only requires individuals to exhibit their beliefs randomly to their neighbors in a social network, and to adjust their beliefs in the direction of their neighbors' behaviors, based on the sort of "linear learning" mechanisms that have been demonstrated in thousands of experiments over the past century or so. There's an informal discussion of this in some slides for a talk of mine ("Simple models for emergence of a shared vocabulary", LabPhon 8, 2002).

At the start of this process, beliefs and behaviors are highly variable and inconsistent, both within and across individuals; but at the end, beliefs and behaviors are crisply consistent and uniform. Nevertheless, there is no specific "tutor", either genetic or social. The coherent shared pattern emerges as a by-product of positive feedback in the linear belief-update process. This process matches probabilities when learning from environmental variation, but "maximizes" its estimates, driving the vector of probabilistic beliefs towards the corners of the hypercube, when individuals learn from one another's behavior.
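Here is a minimal sketch of that contrast in Python. It is not the code behind the slides; the fully connected network, the update rate, the population size, and the function names are all assumptions chosen for illustration. The same linear update rule is used twice: once against a fixed external source, and once among agents who sample their own beliefs as binary behaviors for one another.

```python
import random


def learn_from_environment(rate=0.01, true_prob=0.7, steps=5000, seed=0):
    """A linear learner tracking a fixed external source (assumed Bernoulli)."""
    rng = random.Random(seed)
    belief = 0.5
    for _ in range(steps):
        observed = 1.0 if rng.random() < true_prob else 0.0
        # Linear update: move a fraction `rate` of the way toward the observation.
        belief += rate * (observed - belief)
    return belief


def learn_from_each_other(n_agents=6, n_features=4, rate=0.3, rounds=5000, seed=0):
    """Agents apply the same linear update to one another's binary behavior."""
    rng = random.Random(seed)
    # Start from random, gradient beliefs: one probability per binary feature.
    beliefs = [[rng.random() for _ in range(n_features)] for _ in range(n_agents)]
    for _ in range(rounds):
        for listener in range(n_agents):
            # Assumption: fully connected network, so any other agent may speak.
            speaker = rng.choice([a for a in range(n_agents) if a != listener])
            for k in range(n_features):
                # The speaker exhibits a binary behavior sampled from its belief.
                behavior = 1.0 if rng.random() < beliefs[speaker][k] else 0.0
                beliefs[listener][k] += rate * (behavior - beliefs[listener][k])
    return beliefs


if __name__ == "__main__":
    print("environment:", round(learn_from_environment(), 2))
    for agent in learn_from_each_other():
        print([round(p, 2) for p in agent])
```

In the first case the belief should hover near the source probability. In the second, each feature should end up near 0 or near 1 for every agent, with the choice of corner depending only on the random seed, since nothing in the update rule favors either value.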

Like all learning algorithms, this one embodies a bias towards certain outcomes: in this case, the space of possible beliefs is defined by the set of features (i.e. random variables) to be learned. Aside from this, there is no bias in the model toward any particular set of values — the choice happens by random symmetry-breaking — but it's easy to design learning models with a bias towards more or less specific results. And it's equally easy, in general, to tune the rate at which such biases work. As a result, I can't see any reason why Derek's "bioprogram" couldn't be modeled as a set of biases — in perception, memory, or production — that would take several generations to reach a species-typical fixed point if started from an unnatural and variable initial state.
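To make the point about tunable rates concrete, here is a toy transmission chain in Python. Everything in it is hypothetical: a single numeric feature, a learner who copies the previous generation's value but shifts it a fixed fraction of the way toward a preferred value, and a tolerance that defines "species-typical". It is a sketch of the logic, not a model of any real data.

```python
def transmit(model_value, preferred, bias):
    """One generation: copy the model, shifted a fraction `bias` toward `preferred`."""
    return model_value + bias * (preferred - model_value)


def generations_to_converge(start, preferred, bias, tolerance=0.05, limit=1000):
    """Count the generations needed to come within `tolerance` of `preferred`."""
    value = start
    for generation in range(1, limit + 1):
        value = transmit(value, preferred, bias)
        if abs(value - preferred) <= tolerance:
            return generation
    return None  # never converged within `limit` generations (e.g. bias == 0)


if __name__ == "__main__":
    for bias in (1.0, 0.6, 0.5):
        print(bias, generations_to_converge(start=0.0, preferred=1.0, bias=bias))
```

With a bias of 1.0 the preferred value appears in the first generation; with 0.6 it takes four generations, and with 0.5 it takes five. The learner is the same kind of object in every case, and only one rate parameter differs.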

And Olga Feher's experiments with zebra finches seem to provide a simple real-world model of exactly this sort of phenomenon.

Isolated zebra finches, raised without an adult model, never learn a normal song. Instead, they produce something called "subsong", in which the individual syllables tend to be simpler, less consistent, and longer, often by a large factor, and the syllable sequences tend to be shorter and less consistently structured.

Olga and her collaborators showed that individual juveniles will learn from (recorded examples of) the abnormal isolate song, if they themselves are isolated and given no other models. However, their learning is biased in the direction of normal zebra finch song. And if the process is iterated, each isolated bird learning from a model provided by the previous one, the biases accumulate so as to restore the normal zebra finch song patterns over about three to four generations.

The same sort of thing happened in a somewhat more natural "colony" experiment, where the founding male and female birds were raised in isolation, and then allowed to develop without further interference over several generations.

Going back to the creole language case, Derek writes:

All the "evidence" for multiple-generation creolization is spurious. The Nicaraguan Sign Language case is sheer happenstance–it just happened that only children above 10 were recruited in the early years. We simply don't know what would have happened if much younger children had been recruited from the start. I see no logical reason why they would have needed anything beyond home sign input. Roberts's claim for a two-generation model for Hawaiian Creole is based on bizarre and counterfactual assumptions at least some of which are contradicted by her own data.

Derek might well be right about this, I don't know. But neither in the earlier post nor in this one do I try to engage the empirical questions about multi-generation creolization. Rather, I'm interested in the logic of the debate, which seems to be understood as follows: if a species-typical behavior appears in the first generation of individuals raised with no conspecific models, then it must be the result of a very specific genetic "bioprogram"; but if the behavior emerges from development over several generations, it must be the result of general cognitive abilities. Neither of these implications seems to me to be logically valid.

In particular, I don't see why a multi-generational version of Derek's "language bioprogram hypothesis" would be conceptually different, in any striking way, from a single-generation version. But then, most of the many arguments about innateness have always seemed like false dichotomies to me.



5 Comments

  1. john riemann soong said,

    May 10, 2008 @ 10:40 am

    "And if the process is iterated, each isolated bird learning from a model provided by the previous one, the biases accumulate so as to restore the normal zebra finch song patterns over about three to four generations."

    A question that occurs to me: "why does the accumulation stop?" I mean, surely the isolated zebra finches don't think, "this is the normal pattern we've been trying to achieve, let's stop innovating now."

  2. Keith Dede said,

    May 10, 2008 @ 11:19 am

    In the case of human language creolization, it seems just as likely that that "set of biases" could be socio-culturally determined by the power relationships among the speakers of different languages in a language contact environment.

  3. John Cowan said,

    May 10, 2008 @ 1:32 pm

    "[W]hy does the accumulation stop?"

    You are thinking of the innate rule as a simple unidirectional tendency, but it doesn't have to be. For example, let's say that the winning strategy for male foo birds is to sing "tweet-tweet-tweet". The more chirps, the more females, but also the more predators, and three is the correct balance point.

    Now a foo bird who was hardwired to sing "tweet-tweet-tweet" wouldn't be very adaptable, just adapted to the particular balance that exists now. So foo birds have evolved the following rule instead: if you hear less than three chirps from your conspecifics while you are young, sing one chirp more, and if you hear more than three chirps, sing one chirp less.

    Your isolated foobird will now sing "tweet", making him safe but sorry; still, if he does reproduce his offspring will sing "tweet-tweet" and their offspring will have restored the canonical value. Mutant offspring whose rule is mis-set will go too far, sing "tweet-tweet-tweet-tweet", and get eaten; but if not, their offspring will sing "tweet-tweet-tweet", thus reaching the canonical value again.

    But if the canonical value changes for any reason, it's possible for a mutation changing the value to spread even though the bulk of the population is still trying to conserve the old value. That's why flexibility beats rigidity.

  4. Bob Ladd said,

    May 10, 2008 @ 2:02 pm

    I completely agree with Mark that the one-generation vs multi-generation dichotomy is spurious. With Dan Dediu and Anna Kinsella, I've just published a commentary in the new online journal Biolinguistics on the multi-generation manifestation of language-related biases (http://www.lel.ed.ac.uk/~bob/Ladd-Dediu-Kinsella.pdf). Very specific language-related cognitive biases or Bickerton-style bioprograms, assuming they exist, are clearly distinct from "general cognitive abilities", even if they take a few generations to manifest themselves in a consistent way in a community's language. After all, all language acquisition involves SOME social input, and we know that languages change over time. The ongoing work by Wendy Sandler, Mark Aronoff and their colleagues on a Bedouin sign language that is now in its fourth or fifth generation strongly supports the idea that the unfolding of some very general properties of language doesn't happen in one generation. That doesn't make it any less remarkable that languages all converge on similar sets of "design features".

  5. KeaponLaffin said,

    May 29, 2008 @ 1:10 pm

    -Rather, I'm interested in the logic of the debate, which seems to be understood as follows: if a species-typical behavior appears in the first generation of individuals raised with no conspecific models, then it must be the result of a very specific genetic "bioprogram"; but if the behavior emerges from development over several generations, it must be the result of general cognitive abilities. Neither of these implications seems to me to be logically valid.-

    Well if you put it that way, then I'll agree. Both views are too simplistic.
    I don't know anything about the debate specifically, but..here it goes:
    Bioprogram seems like a fancy word for instinct. Now, if someone is arguing that a specific species of bird is genetically programmed to sing a specific song. That can be tested, but I find it unlikely. Now having an instinct to sing a TYPE of song…as in this species of bird only sings Jazz and another only does Opera..That I can find reasonable.

    -general cognitive abilities-?? Again..what does that mean?
    And again, this can be tested. Is a bird 'smarter' because it can learn complex vocalizations? And what type of complex vocalizations? As one commenter pointed out in the related post, some birds only have 2 'words'. I love you and Go away. But what if one bird's 'I Love You' is more musically complex than a bird from another species, and that other species uses tools? I'm no neuroscientist but I know in humans speech and singing ability are not related to 'general tool use abilities' or 'general problem solving abilities'.

    It must be both, and neither in their specifics. Birds are born with the instinct to sing, their ability to sing is related to how species-wise they are good at singing. If noone teaches them how to sing, they'll figure it out to the best of their singing ability, none of this 'general cognitive abilities' nonsense. This can all be tested, look at their different brain structures. Heck, get a few used to the sound of an fMRI, put em in a plastic cage and have em sing. Put a bird species that uses few 'words' but is very complex and creative in it's singing in there. Then another species that sounds like crap but has many short simple 'words'. Then toss in something like a mockingbird. I think they learn new songs even as adults. Contrast and compare, differential analysis,behavioral studies, whatever, do good science and figure it out. Humans have the technology to do something besides guesswork.
    It's just the old Nature VS Nurture argument. It's both, the 'VS' part is nonsense. Too simplistic.
    Again, I have not researched this, but based on Mark's quote I put in the beginning it seems..noone is being specific. It would be a fascinating area of study. How much language/singing ability is related to -general cognitive abilities-..specifics though. Are all birds that use tools better linguists? While the artist types tend to not be? Or is it the other way around? Or more likely both.
