Linguistic Science and Technology in China

I just spent a few days in China, mainly to attend an "International Workshop on Language Resource Construction: Theory, Methodology and Applications". This was the second event in a three-year program funded by a small grant from the "Penn China Research & Engagement Fund". That program's goals include "To develop new, or strengthen existing, institutional and faculty-to-faculty relationships with Chinese partners", and our proposal focused on "linguistic diversity in China, with specific emphasis on the documentation of variation in standard, regional and minority languages".

After last year's workshop at the Penn Wharton China Center, some Chinese colleagues (Zhifang Sui and Weidong Zhan from the Key Laboratory of Computational Linguistics and the Center for Chinese Linguistics at Peking University) suggested that we join them in co-sponsoring a two-day workshop this fall, with the first day at PKU and the second day at the PWCC. Here's the group photo from the first day (11/5/2017):

The growing strength of Chinese research in the various areas of linguistic science and technology has been clear for some time, and the presentations and discussions at this workshop made it clear that this work is poised for a further major increase in quantity and quality.

That trend is obviously connected to what Will Knight called "China's AI Awakening" (Technology Review, 10/10/2017):

The country is now embarking on an unprecedented effort to master artificial intelligence. Its government is planning to pour hundreds of billions of yuan (tens of billions of dollars) into the technology in coming years, and companies are investing heavily in nurturing and developing AI talent. […]

China’s AI push includes an extraordinary commitment from the government, which recently announced a sweeping vision for AI ascendancy. The plan calls for homegrown AI to match that developed in the West within three years, for China’s researchers to be making “major breakthroughs” by 2025, and for Chinese AI to be the envy of the world by 2030.

Some other media coverage: "China Is Using America’s Own Plan to Dominate the Future of Artificial Intelligence", Foreign Policy 9/8/2017; "Will The Future Of Artificial Intelligence Look Chinese?", Forbes 11/6/2017; "How China plans to beat the U.S. at technology", CNN 11/8/2017; "How China’s AI experts can beat Google and Microsoft at their own game by 2030", South China Morning Post 8/27/2017; "China's Artificial Intelligence Revolution", The Diplomat 7/27/2017; etc. …

The Chinese government's plan is well worth reading, and Google Translate does a good job of making it accessible to those who can't read Chinese. Overall this plan strikes me as serious and well thought out, but I see a potential tension between one aspect of the plan and the current reality. One of the plan's four "basic principles" is "Open Source"; in the automatic translation:

Advocate the concept of open source sharing, and promote the creation, sharing and sharing of all innovations in production, learning and research. Follow the law of coordinated development between economic construction and national defense and promote the two-way transformation and application of military and civilian scientific and technological achievements so as to jointly build up and share resources for military and civilian innovation so as to create a new pattern of deep integration of military and civilian development featuring all factors, fields and benefits. Actively participate in global R & D and management of artificial intelligence and optimize the allocation of innovative resources on a global scale.

This is very much like the approach followed in the U.S. over the past half century or so. But it's increasingly difficult for Chinese researchers to "Actively participate in global R & D and management of artificial intelligence and optimize the allocation of innovative resources on a global scale", given the increasingly restrictive nature of the "Great Firewall".

The Chinese plan, and the international reaction, are somewhat reminiscent of Japanese efforts in the 1980s, including the so-called "Fifth Generation Computer Project", which featured the AI technology of the time as the (software-side) foundation of its approach. Japan's apparent economic and cultural ascendency at that time was promoted by Ezra Vogel's 1979 book "Japan as Number One: Lessons for America" — but it did not end at all well, as discussed e.g. in "'Fifth Generation' Became Japan's Lost Generation", NYT 6/5/1992, "Japan as number one: Land of the setting sun", The Economist 11/12/2009, and "Looking back at 'Japan as No. 1': A rising star no longer, nation has suffered numerous setbacks since 1979", Japan Times 11/11/2010.

Though I'm no sociologist or economist, my impression is that the intellectual and economic foundations of the Chinese plan are more solid and less likely to fail.

[I should note that the "Belt and Road Initiative" is another factor promoting speech and language research in China — see "North America on the Belt and Road?", 7/16/2017.]



11 Comments »

  1. Coby Lubliner said,

    November 12, 2017 @ 9:10 am

    I have a question that might be better answered by Victor Mair: Why do Chinese academics write their name "first name first" on their English web pages? This is not how Chinese personalities are normally referred to in English media. (Jinping Xi, anyone?)

  2. Rodger C said,

    November 12, 2017 @ 12:32 pm

    I can see the Onion article: "Chinese AI initiative revealed to be guy in room."

    [(myl) For those who missed the classical allusion, the point of reference is John Searle's Chinese Room Argument. I'm not sure that The Onion, much as I appreciate it, is the right venue here — maybe Existential Comics?]

  3. Bruce said,

    November 12, 2017 @ 1:13 pm

    These days, people decide for themselves what they want their international name to be.

    For a Chinese name it might be a Cantonese or Fukien reading of their name, and the family name might come first or last, but in the final analysis it’s current practice to defer to the person themselves.

    For a western context, family name last is more convenient and is not likely to lead to confusion.

    In short, for Western purposes Chinese people may adopt a mild variant of their name, and it may well facilitate communication with English speakers, but the decision to do so is theirs, as it should be.

  4. Victor Mair said,

    November 12, 2017 @ 3:27 pm

    Near the end of the o.p., Mark notes:

    =====

    …it's increasingly difficult for Chinese researchers to "Actively participate in global R & D and management of artificial intelligence and optimize the allocation of innovative resources on a global scale", given the increasingly restrictive nature of the "Great Firewall".

    =====

    I couldn't agree more, and this holds true for all areas and fields of science and for intellectual pursuits of all kinds. To be frank, I find that I can no longer function as a fully active scholar in China because so much of the internet is cut off there.

    Furthermore, all the blocking and filtering and censoring, etc. causes Chinese internet speed to be among the slowest in the world, far inferior to that in South Korea, Sweden, Norway, Japan, the Netherlands, Denmark, the UK, Germany, Finland, France, Singapore, Switzerland, Latvia, Greece, the US, and many other countries. Not what you'd want for a country that aims to lead the world in science, technology, manufacturing, commerce, and all other areas of human endeavor.

    Speaking from personal experience — and I've written about this many times — connecting to websites and downloading in China are INCREDIBLY AGGRAVATING. Slow as molasses in January. In the first place, more than half the world's internet (the most valuable parts) is simply not available in China. If you are lucky enough — through sheer grit — to connect to even a not very esoteric site, it may take you ten or a hundred or a thousand times longer to download material than it would in the United States. Furthermore, often after you've waited for hours to download something, you'll abruptly lose your connection and have to start all over again.

    During 2011-2012, I taught at China's two best universities — Tsinghua and Peking University. I can still painfully remember many a time having to wait a whole night to download a single page or two. And don't talk to me about VPNs as a solution to the problem; China can and does block VPNs at will, and starting in February VPNs will be illegal. They already are illegal, but starting in February, the prohibition against VPNs will be strictly enforced (except for the high muckety-mucks of the Party). Even more obviously and outrageously, why should citizens and guests of a civilized country have to be subject to the additional costs and trouble of installing and using a VPN? WHAT IS AN INTERNET FOR, AFTER ALL? It's for the exchange of information, but the CCP is mortally afraid of the free flow of information.

    So far as I'm concerned, the internet in China is nearly dysfunctional, and I'm not the only person who says that. Many long-term émigrés have left China for this very reason, while numerous promising Chinese scholars prefer to work abroad (either permanently or on visits as often as possible) so as to have greater access to the internet.

    This is a very serious problem for scholars (not to mention common citizens) in China, and the situation is only getting worse under the increasingly draconian policies of Xi Jinping.

  5. Bob Ladd said,

    November 12, 2017 @ 5:10 pm

    @ Coby Lubliner:

    Hungarian academics do this too. The normal Hungarian order of names is family name first, but on academic articles written in English they use the Western European order. Come to think of it, this applies in most English-language news reports as well – the current prime minister is usually referred to as Viktor Orbán, though in Hungary he is Orbán Viktor. So I suppose the next question is why English language news reports anglicise the order of Hungarian names but not Chinese ones.

  6. John Rohsenow said,

    November 13, 2017 @ 2:23 am

    Toshiro Mifune's Japanese name is really Mifune Toshiro, etc.

  7. Keith said,

    November 13, 2017 @ 3:38 am

    Advocate the concept of open source sharing, and promote the creation, sharing and sharing of all innovations in production, learning and research. Follow the law of coordinated development between economic construction and national defense and promote the two-way transformation and application of military and civilian scientific and technological achievements so as to jointly build up and share resources for military and civilian innovation so as to create a new pattern of deep integration of military and civilian development featuring all factors, fields and benefits. Actively participate in global R & D and management of artificial intelligence and optimize the allocation of innovative resources on a global scale.

    All that talk of military and national defence technology and research makes me think that this intense sharing will only happen within China, and probably only between certain carefully selected populations.

    [(myl) I don't think this is accurate. In fact current Chinese actions as well as statements are generally consistent with the view that "actively participate in global R & D" really means just what it says, and that there is a genuine commitment to open data and open source software (within commercial and national security limits similar to those that apply in the U.S. and elsewhere). After all, the past 30+ years of technological development in the U.S., especially in AI and networking areas, were crucially supported by the Defense Advanced Research Projects Agency, which welcomed the participation of researchers from all over the world.

    At the same time, the restrictions imposed by the "Great Firewall" definitely make such active participation more difficult.]

  8. Victor Mair said,

    November 13, 2017 @ 9:00 am

    @Keith

    Good eye!

    The real intent and modus operandi of their research, as you deftly point out, are sandwiched between the feel-good catchphrases "open source sharing" and "global scale". As directed and supported by the government, research in the PRC operates within a closed system and is largely cut off from outside networks.

    Even more disturbing, the heavy hand of Chinese censorship is now aggressively extending throughout the world. As you may know, China is becoming increasingly aggressive in the way it censors things both domestically and globally. Two recent cases are the blacklisting of over a thousand articles in journals published by Cambridge University Press and even more from Springer Nature. They tried to do the same thing to many other presses and journals, but there was a hue and cry from scholars, so many Western presses are refusing to do business with China.

    "A Tale of Two Publishers: Is censorship the new normal?"

    by Kevin Carrico

    https://cpianalysis.org/2017/11/08/a-tale-of-two-publishers-is-censorship-the-new-normal/amp/

    Just yesterday, a sensational case erupted:

    "SMH: Australian Publisher pulls book on communist influence over fears of China action"

    http://www.smh.com.au/national/free-speech-fears-after-book-critical-of-china-is-pulled-from-publication-20171112-gzjiyr.html

    "Australian publisher Allen & Unwin has ditched a book on Chinese Communist Party influence in Australian politics and academia, citing fear of legal action from the Chinese government or its proxies.

    "The publisher’s chief executive, Robert Gorman, says it will abandon publication of a completed manuscript by Clive Hamilton, a professor of public ethics at NSW’s Charles Sturt University, called Silent Invasion : How China Is Turning Australia into a Puppet State.

    "'We have no doubt that Silent Invasion is an extremely significant book,' Mr Gorman wrote in a confidential email to Dr Hamilton on November 8." …

    VHM: Some other publisher should pick this book up immediately and publish it for Clive Hamilton.

    "Author vows book exposing Chinese influence will go ahead after publisher pulls out: Allen & Unwin had cancelled plans to print Clive Hamilton’s Silent Invasion over fear of legal action by the Chinese government"

    by Melissa Davey

    Sunday 12 November 2017 20.13 EST

    https://www.theguardian.com/australia-news/2017/nov/13/author-vows-book-exposing-chinese-influence-will-go-ahead-after-publisher-pulls-out

    "Prominent Charles Sturt University author and ethicist Professor Clive Hamilton says his book exposing the Chinese Communist party’s activities in Australia will still be published, despite Allen & Unwin cancelling plans to print it at the 11th hour…."

  9. J. Marshall Unger said,

    November 13, 2017 @ 4:03 pm

    Although “[t]he Chinese plan, and the international reaction, are somewhat reminiscent of Japanese efforts in the 1980s, including the so-called Fifth Generation Computer Project,” and although “the intellectual and economic foundations of the Chinese plan” may be “more solid and less likely to fail,” I do not think the comparison is very helpful. As I explained in my 1987 book _The Fifth Generation Fallacy_, the kind of AI that the Japanese were pushing in the 1980s, which Hubert Dreyfus later called “good old-fashioned AI” (GOFAI), was tightly tied to the idea that all cognition was reducible to propositional logic. It rejected the need for encyclopedic databases, the importance of human-computer interaction, and work on connectionist or neural net models that did without explicit rules of reasoning.

    Since the days of 5G, major advances have been made in both computer science and empirical studies of cognition, and I doubt there are many in the field willing to defend GOFAI as it was preached at the time by researchers like Feigenbaum and Minsky. For instance, though many thought “machine translation” was just around the proverbial corner thirty years ago, most users of translations today understand that far better results can be obtained when skilled, culturally aware translators use computer applications as tools to increase their throughput and improve its quality, and that such apps seldom need anything like the GOFAI envisioned in decades past.

    I think we have learned that the integrity and verifiability of data and unhindered electronic communications are more important to the efficiencies gained and productivity achieved by means of computers than how faithfully they have been programmed in keeping with reductionist dogma about the nature of intelligent cognition.

  10. Tiberiu Weisz said,

    November 13, 2017 @ 11:04 pm

    I have worked in China and with Chinese for 40 years, and I am not surprised to see the pressure China exercises on businesses, companies, publishers and governments to align their reporting to the dictates of the Chinese Communist Party. What I am surprised at is that governments cave in to these demands and fall in line not to offend the Chinese. I hope that the Australian author Clive Hamilton will publish his book and call the Chinese bluff. If China decides to sue him in Australian Court, it will be a rare opportunity for the many Sinologists to expose the Chinese deceptions.

  11. Victor Mair said,

    November 16, 2017 @ 11:07 pm

    Internet freedom

    China dead last out of 65 countries listed on Freedom House annual report || Worst in the world for the third year in a row

    "China cyber watchdog rejects censorship critics, says internet must be 'orderly'"

    Reuters – November 16, 2017

    https://www.reuters.com/article/us-china-cyber/china-cyber-watchdog-rejects-censorship-critics-says-internet-must-be-orderly-idUSKBN1DG1VJ

    "…On Tuesday, U.S. NGO Freedom House released an annual report ranking China last [out of 65 countries] in terms of internet freedom for the third year in a row, criticizing censorship activity targeting ethnic minorities, media and regular citizens…."
