Multilingualism in Philadelphia's Chinatown

Sign spotted by Diana Shuheng Zhang on December 7, 2019:



Communicative interpolations (not "disfluencies")

In the past few days, I've encountered some nice examples of the communicative interpretation of what I've suggested we ought to call "interpolations" rather than "disfluencies".



"National Language" in the Xinjiang Uyghur Autonomous Region

Many people have been asking me about the use of the term Guóyǔ 国语 ("National Language") for "Mandarin" in Xinjiang today.  Here's an inquiry from Peter Moody:

I have encountered what seems to be an anomaly in contemporary Chinese usage, and have been assured that you are among those most capable of addressing it.

I was reading an analysis by a Darren Byler, a "Xinjiang Scholar," of a 2017 classified directive from Zhu Hailun, Gauleiter of Xinjiang, on how properly to run the concentration camps in that territory (https://supchina.com/2019/12/04/a-xinjiang-scholars-close-reading-of-the-china-cables/). (I have not looked either at the full English translation of these directives, or the Chinese text, although both are available. I figured the analysis would give the gist of them.)



Amazing new Japanese words

These come from the following nippon.com article:

"Pay It Forward: The Top New Japanese Words for 2019" (12/13/19)

I'll list the words first, then explain which one is my favorite.

A prefatory note:  nearly half of the words on these lists are based wholly or partly on borrowings from English, though they are assimilated into Japanese in such a manner that they are unrecognizable to monolingual English speakers.



Seeding Mars



D for Dog, L for Love

When confirming reservations on the phone with clerical folks in certain Southeast Asian countries, Paul Midler noticed that they often used variations of the NATO phonetic alphabet. "D for Dog" and "L for Love" seemed to be a couple of consistent additions. Passing through a travel agency in Thailand, he saw this:



Quantum Supremacy

For the past couple of months, the phrase "Quantum Supremacy" has been on my to-blog list, based on points and counterpoints like "Google scientists say they've achieved 'quantum supremacy' breakthrough over classical computers" (WaPo 10/23/2019) and "IBM Says Google's Quantum Leap Was a Quantum Flop" (Wired 10/21/2019). My interest, at least on the LLOG dimension, was not in the argument about how difficult a particular problem is for classical computers, but rather in the use of the word supremacy.

Now I can take this one off the stack, because a recent SMBC does a better job than I would have:




Please stoop

Photograph from Paul M in Taipei:



be;eza

Sign on the front of a fashion store (shoes and handbags) in Taipei:



Exotic letter in Taipei

Paul M. sent in this photograph of the front of a fashion shop on Yongkang Street, Da'an District, Taipei City, Taiwan:



Cat chat and tax talk

Photograph of a campaign billboard in Taiwan showing President Tsai Ing-wen, who is up for reelection on January 11, with one of her two beloved cats:


(Source: anonymous colleague)



AI is brittle

Following up "Shelties On Alki Story Forest" (11/26/2019) and "The right boot of the warner of the baron" (12/6/2019), here's some recent testimony from engineers at Google about the brittleness of contemporary speech-to-text systems: Arun Narayanan et al., "Recognizing Long-Form Speech Using Streaming End-To-End Models", arXiv 10/24/2019.

The goal of that paper is to document some methods for making things better. But I want to underline the fact that considerable headroom remains, even with the massive amounts of training material and computational resources available to a company like Google.

Modern AI (almost) works because of machine learning techniques that find patterns in training data, rather than relying on human programming of explicit rules. A weakness of this approach has always been that generalization to material that differs in any way from the training set can be unpredictably poor. (Though of course rule- or constraint-based approaches to AI generally never got off the ground at all.) "End-to-end" techniques, which eliminate human-defined layers like words, so that speech-to-text systems learn to map directly between sound waveforms and letter strings, are especially brittle.



The impact of phonetic inputting on Chinese languages

The vast majority of people, both inside and outside of China, input characters on cell phones, computers, and other electronic devices via Hanyu Pinyin or another phonetic script.  Naturally, this has had a huge impact on users' command of the characters, since they are no longer writing the characters directly through neuro-muscular coordination and effort.  Instead, their electronic devices write the characters for them by converting the Pinyin or other phonetic input into the desired characters, resulting in the widely lamented phenomenon of "character amnesia", which we have touched upon in dozens of LL posts.

There has in recent years been a lot of stuff and nonsense bandied about concerning how Chinese character inputting led to the development of predictive typing, whereas the actuality is that the extreme cumbersomeness of the Chinese writing system necessitated the development of one kind of predictive typing (other predictive algorithms were already in use long before) to rescue the characters from hasty extinction.
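
For readers who have never used a phonetic input method, the conversion step described above can be sketched roughly as follows. This is only a toy illustration in Python: the table, the weights, and the function names are all made up for the example, and a real input method relies on a large lexicon and on context rather than a three-entry dictionary.

```python
# Toy pinyin-to-character lookup. The table and its weights are invented
# for illustration only; they are not real frequency data.
CANDIDATES = {
    "ma": [("吗", 5), ("妈", 4), ("马", 3), ("骂", 1)],
    "shi": [("是", 9), ("时", 6), ("事", 5), ("十", 4)],
    "zhong": [("中", 8), ("种", 4), ("重", 3)],
}


def candidates(syllable):
    """Return characters for an exact pinyin syllable, highest weight first."""
    ranked = sorted(CANDIDATES.get(syllable, []), key=lambda pair: -pair[1])
    return [char for char, _ in ranked]


def predict(prefix):
    """Return characters for every syllable that starts with the typed prefix."""
    matches = []
    for syllable, entries in CANDIDATES.items():
        if syllable.startswith(prefix):
            matches.extend(entries)
    matches.sort(key=lambda pair: -pair[1])
    return [char for char, _ in matches]


print(candidates("ma"))  # the user picks one character from this list
print(predict("zh"))     # suggestions offered before the syllable is complete
```

The point of the sketch is simply that the user supplies sounds and chooses from a list, while the device supplies the written form of the character.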
