2021-10-25

New article in EnterpriseAI (October 21, 2021):

"Language Model Training Gets Another Player: Inspur AI Research Unveils Yuan 1.0", by Todd R. Weiss

From Pranav Mulgund:

This article introduces an interesting new advance in an artificial intelligence (AI) model for Chinese. As you probably know, Chinese has long been regarded as one of the hardest languages for AI to crack. Baidu and Google have both been trying for a long time, but have had a lot of difficulty given the complexity of the language. But the company Inspur just came out with a model called Yuan 1.0 that shows significant advances over previous companies' AIs.

Notable quotes from this article:

"…Yuan 1.0 scored almost 20 percent better on Chinese language benchmarks and took home the top spot in six categories, such as noun-pronoun relationships, natural language inference, and idiom reading comprehension…."

"In the process of creating Yuan 1.0, Inspur built the most comprehensive Chinese language corpus in the world, more than twice the size of the largest existing Chinese corpus, and used all 5TB of it to train this new model."

"There is a lot of interest in big models, but we should expect a series of similar announcements for a while, approaching 1 trillion parameters," said [Karl] Freund. "But soon, it will take a different hardware and software approach, something like Nvidia Grace or Cerebras MemoryX, to scale to 100 trillion parameters for brain-scale AI."

Ultimately, though, one must ask if there is a market for these innovations…. "We think so, but it is just emerging," said Freund. "The models to date are error-prone and can promote bias and misinformation. So, the use of these models remains a bit scary."

The refractory nature of Chinese poses a unique challenge to developers training language models. On the other hand, some of the experience gained in attempting to cope with Chinese provides information and data that are valuable for improving models for other, less complex languages. However, one thing that worries me about all of the models they're talking about is that they are BIG. We have seen that problem of scale all along in the computerization and digitization of Chinese, from Unicode to the big models for AI discussed in this article. In the past, when I raised these issues, pollyannaish people always said to me, "Don't worry, memory is cheap." But when the computing resources required for these huge projects become truly colossal, surely cost must enter as a significant factor, which makes one wonder whether such models are practical for actual use and financially feasible for anything other than experimental, theoretical purposes.


[h.t. Bill Benzon]
