Meet Habibi – the Chinese AI uniting 20 Arabic dialects in a Middle East first

Lead author says there are many differences between Arabic dialects and Modern Standard Arabic, which is used in official circumstances

Zhao Ziwen, SCMP, 28 Feb 2026

The paper that presents this new model is called “Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis”. It was published last month on arXiv, an open-access repository that is not peer-reviewed. I will be interested to hear what Language Log readers think of its prospects.

Led by Shanghai Jiao Tong University’s X-LANCE Lab – one of China’s top audiovisual and language processing research entities – the model is named Habibi, meaning “my dear” in Arabic.

In presenting their findings, the research team spearheaded by Chen Yushen described the project in a paper as “the first open-source framework for unified-dialectal Arabic speech synthesis”.

They introduce a concept that is new to me: "zero-shot".

Habibi has the “zero-shot” ability, meaning the model can easily clone a voice by using just a short reference audio clip, without prior explicit or extensive training. This allows applications in highly efficient and on-the-fly scenarios.

According to Wikipedia,

Zero-shot learning (ZSL) is a problem setup in deep learning where, at test time, a learner observes samples from classes which were not observed during training, and needs to predict the class that they belong to. The name is a play on words based on the earlier concept of one-shot learning, in which classification can be learned from only one, or a few, examples.

Zero-shot methods generally work by associating observed and non-observed classes through some form of auxiliary information, which encodes observable distinguishing properties of objects. For example, given a set of images of animals to be classified, along with auxiliary textual descriptions of what animals look like, an artificial intelligence model which has been trained to recognize horses, but has never been given a zebra, can still recognize a zebra when it also knows that zebras look like striped horses. This problem is widely studied in computer vision, natural language processing, and machine perception.

A zebra can be identified as looking like a striped horse, even if you've never seen a zebra before
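
For readers who like to see the idea in miniature, here is a toy sketch of the zebra example in Python. Everything in it is an assumption made for illustration: the three attributes, the class descriptions, and the made-up detector scores are hypothetical, and it shows only the classification sense of "zero-shot" from the Wikipedia passage, not how Habibi clones voices.

```python
# Toy zero-shot classification using auxiliary attribute descriptions.
# Hypothetical attributes, in order: (horse_shaped, striped, big_cat).
CLASS_DESCRIPTIONS = {
    "horse": (1.0, 0.0, 0.0),  # seen during training
    "tiger": (0.0, 1.0, 1.0),  # seen during training
    "zebra": (1.0, 1.0, 0.0),  # never seen, known only from its description
}


def classify(observed, descriptions=CLASS_DESCRIPTIONS):
    """Return the label whose description is closest to the observed attributes."""

    def squared_distance(label):
        return sum((o - d) ** 2 for o, d in zip(observed, descriptions[label]))

    return min(descriptions, key=squared_distance)


# Made-up scores that an attribute detector might assign to a zebra photo.
zebra_photo = (0.9, 0.8, 0.1)
print(classify(zebra_photo))
```

The detector only has to learn "horse-shaped" and "striped" from animals it has actually seen; the description of the zebra supplies the rest.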

In case you're interested, "Habibi" itself is an Arabic word worth learning in one of Arabic's 20-plus topolects: Syrian, Egyptian, Jordanian, Levantine…. Because of its wide range of meanings, nuances, and usages, be careful about how, when, and to whom you use it.

