OpenAI's Chinese problem
We have expressed concern over the quality of training and source materials for Chinese AI and LLMs. Less than a week ago, we examined "AI based on Xi Jinping Thought" (5/21/24), which may be considered an attempt to "purify" what goes into Chinese AI. It turns out that there really is a problem, and it is not just affecting China's own AI efforts, but is infecting ours as well.
OpenAI’s latest blunder shows the challenges facing Chinese AI models:
Finding high-quality data sets is tricky because of the way China’s internet functions.
By Zeyi Yang, MIT Technology Review (May 22, 2024)
As we shall soon see, pursuing this topic takes us into very sensitive, disquieting territory concerning the nature of China's internet. It will be difficult for us to avoid assessing the quality of China's knowledge base and information resources overall.
Here are the opening paragraphs of the MIT Technology Review article by Zeyi Yang: