Elephant imitates Korean

Stoeger et al., "An Asian Elephant Imitates Human Speech", Current Biology (2012):

Vocal imitation has convergently evolved in many species, allowing learning and cultural transmission of complex, conspecific sounds, as in birdsong. Scattered instances also exist of vocal imitation across species, including mockingbirds imitating other species or parrots and mynahs producing human speech. Here, we document a male Asian elephant (Elephas maximus) that imitates human speech, matching Korean formants and fundamental frequency in such detail that Korean native speakers can readily understand and transcribe the imitations. To create these very accurate imitations of speech formant frequencies, this elephant (named Koshik) places his trunk inside his mouth, modulating the shape of the vocal tract during controlled phonation. This represents a wholly novel method of vocal production and formant control in this or any other species. One hypothesized role for vocal imitation is to facilitate vocal recognition by heightening the similarity between related or socially affiliated individuals. The social circumstances under which Koshik’s speech imitations developed suggest that one function of vocal learning might be to cement social bonds and, in unusual cases, social bonds across species.

Here's Figure 1, whose legend reads:

Spectral Comparison of the Speech Utterance “nuo”: Spectrograms exemplifying the speech utterance “nuo” of the trainer (A and D) compared to the elephant’s (Koshik) imitation (B and E) and a 40-year-old male Korean native speaker (C and F) with no experience of Koshik’s Korean output (recorded via a head set and thus with higher recording quality than the other two sound samples). (A–C) represent narrow band spectrograms of “nuo” and (D–F) give wide-band spectrograms of each “nuo” utterance, respectively. The fundamental frequency (fund. freq.) and the first and the second formant (F1 and F2) are indicated.

Some audio examples — in each case, the trainer says a word and then Koshik imitates it:

annyong ("hello")

anja ("sit down")

nuo ("lie down")

choah ("good")

The other word in his vocabulary of imitation is aniya "no".

The fidelity of Koshik's reproductions is not as good as this method of presentation may make you think: when you know what a sound is supposed to be, your expectations make an attempt to imitate it sound more accurate, an effect noted by Solzhenitsyn in his description of (poor-quality) vocoder testing in The First Circle. The paper's own listening test, in which the listeners were not told the intended words, makes the point:

Koshik’s speech sound repertoire was said by his trainers to comprise six Korean words. We tested this hypothesis by analyzing transcriptions made by 16 Korean native speakers on 47 recordings of Koshik’s utterances (see Table S1 available online). The subjects were not informed about the supposed spelling or meaning of the imitations. This analysis largely confirmed the trainers’ claims, indicating that Koshik’s speech imitations correspond to the following five words: “annyong” (“hello,” Audio S1), “anja” (“sit down,” Audio S2), “aniya” (“no”), “nuo” (“lie down,” Audio S3), and “choah” (“good,” Audio S4). Agreement was high for vowels and relatively poor for consonants: vowel transcription similarity was 67% overall, whereas consonant agreement only reached 21% (Table S1). For example, “choah” utterances (according to trainers) were mainly transcribed as “boah” (“look,” 38%) or “moa” (“collect,” 23%), but neither of these utterances was used toward Koshik. As a result, transcriptions provided exact spelling matches (in Korean) for only one sound (“annyong,” “hello,” for which the majority of respondents [56%] agreed) and three additional imitations for which considerable agreement could be documented (“aniya”: 44%; “nuo”: 31%; “anja”: 15%). These results show that Koshik accurately imitates vowels, determined by formant frequency matching, but that consonant fidelity is relatively poor.
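
To make the agreement figures above concrete, here is a minimal sketch of one way such a number could be computed: the share of listeners who gave the most common transcription for a recording. This is an assumption about the general idea, not the authors' actual scoring procedure (which is in their Table S1), and the responses below are invented for illustration.

```python
from collections import Counter


def modal_agreement(transcriptions):
    """Return (most common transcription, share of listeners who gave it)."""
    if not transcriptions:
        raise ValueError("need at least one transcription")
    counts = Counter(transcriptions)
    label, n = counts.most_common(1)[0]
    return label, n / len(transcriptions)


# Hypothetical responses from 16 listeners to a single recording.
# These are made-up values, NOT data from the paper.
responses = ["annyong"] * 9 + ["annyeong"] * 4 + ["anyo"] * 3

label, share = modal_agreement(responses)
print(f"{label}: {share:.0%}")
```

With 9 of 16 invented listeners agreeing, the share comes out a little over half; an exact-spelling criterion like this is strict, which is one reason consonant confusions drag the numbers down so quickly.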

Here's a video showing Koshik producing several repetitions of "choah", illustrating the trunk-in-mouth technique of formant manipulation:

This case suggests that elephants must be added to the species known to be capable in principle of vocal learning; and the authors speculate that the application of this ability to the imitation of human speech has social and emotional roots:

Although elephants living under human care may be heavily exposed to speech from birth on, they do not imitate speech on a regular basis. Thus, early intensive speech exposure does not seem adequate to initiate speech imitation in elephants (although it might be a required precondition), as long as they are embedded within an elephant social environment. Koshik was captive-born in 1990 and translocated to Everland in 1993, where two female Asian elephants accompanied him until he was five years old. From 1995 to 2002, Koshik was the only elephant in Everland. He was trained to physically obey several commands and was exposed to human speech intensively by his trainers, veterinarians, guides, and tourists. In August 2004, his trainers first noticed that Koshik imitated speech. We cannot be certain whether Koshik started to produce speech sounds at 14 years of age (near the onset of Koshik’s sexual maturity; his first musth period occurred in March 2005) or whether earlier imitations went unrecognized by his trainers. However, the determining factors for speech imitation in Koshik may be social deprivation from conspecifics during an important period of bonding and development when humans were the only social contact available (this hypothesis may also hold for other known examples of speech imitation in mammals, Hoover the seal and the beluga Logosi, and also most talking birds).

Some mass media uptake for this story: "Loneliness 'forced elephant to speak Korean'", ABC News 11/1/2012; Rebecca Morelle, "Elephant mimics Korean with help of his trunk", BBC News 11/1/2012.

Past LL posts about elephant vocal learning or speech imitation, noted by various commenters: "Elephant talk", 4/3/2005; "Batyr", 10/11/2008.

[ht Shermin de Silva]



14 Comments

  1. Avinor said,

    November 2, 2012 @ 8:07 am

    Now just waiting for "Elephant learns hangeul"!

    [(myl) Indeed. Not sure why this obvious intellectual niche is apparently still empty.]

  2. Ray Girvan said,

    November 2, 2012 @ 8:23 am

    For those who missed it: see, previously at LL, Batyr.

  3. Boudica said,

    November 2, 2012 @ 8:28 am

    I guess the movie should have been Planet of the Elephants. I bow before our future overlords!

  4. Dan Lufkin said,

    November 2, 2012 @ 8:28 am

    There used to be a harbor seal named Andre (q.G.) who migrated up and down coastal Maine. Andre could imitate the local vowels and intonation pattern to perfection. He sounded like a lobsterman having an argument just out of earshot.

    We had a collie x shepherd who, on request, could imitate a turkey (wabba-wabba), cow (mooooo), say "Oreo" (oo-ee-ohhh), uh-oh, etc. Once she got the basic idea of imitation, she'd work until she got it right.

  5. Ray Girvan said,

    November 2, 2012 @ 8:45 am

    @ Dan Lufkin > Andre

    That sounds like Hoover ("Oyoyoyoy, come over here, hellooo, hahahaha, oo-arrrr!") – sound files here.

  6. zythophile said,

    November 2, 2012 @ 9:07 am

    Larry Niven and Jerry Pournelle, of course, anticipated our future pachydermous overlords in Footfall, though they failed to say, IIRC, if the Fithp spoke by using their trunks down their mouths.

  7. Theodore said,

    November 2, 2012 @ 9:54 am

    "Beluga Logosi"? Did he speak Hungarian?

  8. Mark Etherton said,

    November 2, 2012 @ 10:38 am

    Craig Brown comments on this and a similar story about a talking whale here: http://www.dailymail.co.uk/debate/article-2225983/Talking-Beluga-Really-clever-whales-dont-spout-English.html

  9. KWillets said,

    November 2, 2012 @ 10:54 am

    "consonant fidelity is relatively poor"

    They obviously need some kind of trunk-in-mouth Hangeul to replace the usual consonant formations.

  10. Bob Kennedy said,

    November 2, 2012 @ 12:04 pm

    I'm reminded of this LL post (from 2005) about Mlaika, the Kenyan elephant who would imitate trucks.

  11. Jerry Friedman said,

    November 2, 2012 @ 1:44 pm

    Apparently elephants can write in Thai if someone holds their ear.

  12. J. Goard said,

    November 4, 2012 @ 9:31 am

    These results show that Koshik accurately imitates vowels, determined by formant frequency matching, but that consonant fidelity is relatively poor.

    Korean has vowel harmony, at least for the familiar speech level endings added to short verbs in all but the first of these examples. When Korean speakers judge the elephant's production in the fourth example as various -o-a verbs, it's not nearly as significant as it might seem, since in contemporary Korean only two vowels ([o] and [a]) take the [a] form of the suffix, and moreover, there are only two forms that the suffix can be ([a] and [ʌ]). All the elephant had to do, with respect to a native Korean speaker, is to make its drawn-out final vowel closer to [a] than [ʌ], and its first vowel just different enough from [a] (in the direction of [o]) that it wouldn't sound exactly the same.

  13. Jongseong said,

    November 5, 2012 @ 8:03 pm

    @J. Goard: Korean vowel harmony is largely a historical remnant and is no longer productive, restricted to a small number of cases, the most important of which concerns verbs with monosyllabic 'o' or 'a' stems. My native speaker intuition is that they are learnt lexically, much as English speakers learn irregular verbs. We would not hear one vowel and automatically fill in the other according to this 'rule'; in any case, vowel harmony does not apply in general so you would already have to have decided that you were listening to a specific kind of verb form, not something like "uwa" 우와 (a common exclamation) or "chowon" 초원 (grassland).

    In any case, listening to the recordings, each of the vowels are quite recognizable on their own. /ʌ/ is a bit weird in the third example, but then /ʌ/ has quite a varied realization in Korean anyway. I hear a clear /o/ in the fourth example, not just a vowel "just different enough from /a/ (in the direction of /o/)".

    And just because all the ad hoc transcriptions in the article are bothering me:
    annyeong 안녕 /an.njʌŋ/
    anja 앉아 /an.dʑa/
    nuwo 누워 /nu.wʌ/
    joa 좋아 /tɕo.a/
    aniya 아니야 /a.ni.ja/

  14. Saffron Eaglepillow III said,

    November 7, 2012 @ 8:36 pm

    I have to say that, having spent a lot of time listening to Korean speakers over the last couple of decades, the most uncanny thing for me about this elephant is that, while the words don't necessarily sound to me like what they're supposed to be (annyeong, particularly), something about the tonality of the sounds is really Korean-sounding to my native English ear.
