Speech to speech translation of unwritten languages: Hokkien


Everybody's talking about it.

"Meta has developed an AI translator for a primarily-spoken language

It only translates between Hokkien and English for now, but offers potential for thousands of languages without official written systems."

By Amanda Yeo, Mashable (October 20, 2022)

If true, this technology could be an enormous boon for illiterates everywhere.  It also has important theoretical and linguistic implications.

Meta has developed an artificially intelligent translation system that can convert an oral language — Hokkien — to spoken English. It's another step closer to finally making Star Trek's Universal Translator a reality.

Translation AI is typically trained on text, with researchers feeding their systems reams of written words for them to learn from. However, there are over 3,000 languages that are primarily spoken and have no widely used written system, making them difficult to incorporate into such training.

Hokkien is one such language. Used by over 45 million people across mainland China, Taiwan, Malaysia, Singapore, and the Philippines, Hokkien is an oral language without an official, standard written system.

As such, Hokkien speakers who need to write down information tend to do so phonetically, resulting in significant variation depending on the writer. There is very little recorded data translating Hokkien to English either, and professional human translators are scarce.

To work around this, Meta used written Mandarin as an intermediary between English and Hokkien when training its AI.

Nota bene: The Meta AI translator is not trained directly from spoken English to spoken Hokkien and vice versa; rather, training proceeds thus:

spoken English < > written Mandarin < > spoken Hokkien

"Our team first translated English or Hokkien speech to Mandarin text, and then translated it to Hokkien or English — both with human annotators and automatically," Meta researcher Juan Pino said. "They then added the paired sentences to the data used to train the AI model."
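As a rough illustration of the pivot step Pino describes, here is a minimal Python sketch. It is not Meta's code: the names `TrainingPair` and `build_pairs`, the file name, and the toy lookup tables are all hypothetical stand-ins for the human annotators and automatic systems mentioned in the quote.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TrainingPair:
    source_audio: str  # identifier of a recorded utterance (hypothetical file name)
    pivot_text: str    # written Mandarin used as the intermediary
    target_text: str   # text in the other language, produced from the pivot


def build_pairs(
    utterances: List[str],
    speech_to_mandarin: Callable[[str], str],
    mandarin_to_target: Callable[[str], str],
) -> List[TrainingPair]:
    """Route each utterance through Mandarin text to obtain a paired sentence."""
    pairs = []
    for audio in utterances:
        pivot = speech_to_mandarin(audio)   # step 1: speech -> Mandarin text
        target = mandarin_to_target(pivot)  # step 2: Mandarin text -> other language
        pairs.append(TrainingPair(audio, pivot, target))
    return pairs


# Toy lookup tables (assumptions) standing in for annotators or automatic systems.
HOKKIEN_TO_MANDARIN = {"hokkien_0001.wav": "你好"}
MANDARIN_TO_ENGLISH = {"你好": "Hello"}

pairs = build_pairs(
    ["hokkien_0001.wav"],
    HOKKIEN_TO_MANDARIN.__getitem__,
    MANDARIN_TO_ENGLISH.__getitem__,
)
print(pairs[0])
```

The sketch keeps the Mandarin text alongside each pair, so any distortion introduced by the intermediary can be inspected later.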

Of course, filtering a sentence through multiple languages can sometimes distort its meaning — as anyone who has ever played around with Google Translate knows. Meta also worked with Hokkien speakers to check translations, and is releasing its models, data, and research as open-source information for other researchers to utilise.

Despite the limitations, the Meta speech-to-speech AI translator represents a big step forward in extending the range of translation among spoken languages. Still, as an ardent advocate of oral languages, I find it ironic that, at this stage, this marvel is being enabled through the intermediary of written language.


[h.t. Jean DeBernardi, Gene Hill, and Grace Wu]



  1. Philip Taylor said,

    October 21, 2022 @ 3:31 pm

    Victor [finds] "it ironic that, at this stage, it is through the intermediary of written language that this marvel is being enabled". Perhaps naïvely, I do not. Unless and until we routinely interact with computer and AI systems solely through oral/aural means, the use of written language as intermediary is, it seems to me, the most natural approach. Perhaps my mind is too closed, perhaps I am too familiar with interacting with a computer through keyboard and monitor to conceive of a purely oral/aural interface, but even if I had such an interface at my disposal I would still express myself using words (and computer notation), and the idea that these might exist within the computer only as digitised sounds and not as (the computer's standard internal representation of) text is an idea I find almost impossible to grasp.

  2. Jonathan Smith said,

    October 21, 2022 @ 6:12 pm

    Wait, riry? Facebook (or Google, or anyone) applies existing tech in an obvious and productive manner? Color me surprised… the below is my Tâi-lô/English transcription of the relevant portion; edits welcome…

    Peng-Jen: Lí-hó, Mark. Lí kám tsai-iánn lán ê thuân-tuī tá-tsō tē-it-ê tsi-tshî kháu-gí gí-giân ê huan-i̍k hē-thóng?

    [Hi, Mark. Do you know that our team created the first translation system to support a spoken language?]

    Mark: Yeah, this is great. Hokkien is spoken by millions of people, but since there’s no standard writing system, that makes it pretty challenging to build a translation system like this.

    [Tse si̍t-tsāi sī tsin tsán! Ū sóo-pah-bān láng kóng Hok-kiàn-uē, tān-sī pīng bô tsit-ê piau-tsún ê su-siá hē-thóng. Tse hōo tá-tsō huan-i̍k hē-thóng piàn kah tsiânn khùn-lân.]

    Peng-Jen: Bô-m̄-tio̍h. Tī guá sè-hàn ê sî-tsūn tī ha̍k-hāu bô leh kà Hok-kiàn-uē. I sī thàu-kuè kháu-gí tsit-tāi tsit-tāi thuân–lo̍h-lâi-ê.

    [That’s right. Hokkien was not taught in schools when I was a kid. And it is passed down orally from generation to generation.]

    Mark: Do you speak it with your kids?

    [Lí kám-ū tuì lí-ê gín-á kóng?]

    Peng-Jen: Ū a! Koh-ū guá-ê pē-bú.

    [Yes. And my parents.]

    Hard to believe this is actually a "live" demo, but impressive even if an idealization. Even the tone sandhi is good if not perfect :D The caveat is of course "but why make any use of Mandarin at all"… oh, and of course, why fudge with "Hokkien" since this is the language of (much of) Taiwan (& Amoy) specifically… but all is forgiven if this is at all as advertised.

  3. Jerry Packard said,

    October 22, 2022 @ 3:29 pm

    Wow, good work.
