Archive for Language and computers

Annals of Artificial Stupidity

Katie Deighton, "What Can’t the Internet Handle in 2022? Apostrophes", WSJ 9/29/2022:

Sybren Stüvel is an Amsterdam-based software developer with a fairly uncommon name and a surprisingly common predicament.

As he completes the tasks of daily life, computers refuse to accept his name as valid or mangle it entirely. A credit card provider rejected his moniker, a Vancouver hotel hit bumps locating his reservation—as he stood there exhausted from a nine-hour plane trip—and an airline wouldn’t let him check into a flight. “You can imagine my stress level,” he said.

While buying insurance, he said, “They asked me to confirm that my last name is indeed Stüvel.”
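
One way such rejections can arise is sketched below, contrasting an ASCII-only name pattern with a more permissive Unicode-aware check. The function names, the pattern, and the set of allowed punctuation are all assumptions made for illustration; none of them is taken from the systems described in the article.

```python
import re
import unicodedata

# Hypothetical pattern of the kind that rejects many real names:
# only unaccented ASCII letters, spaces and hyphens are allowed.
ASCII_ONLY = re.compile(r"[A-Za-z -]+")

# Punctuation assumed to be acceptable inside a name
# (space, hyphen, straight and curly apostrophes, period).
NAME_PUNCTUATION = set(" -'’.")


def naive_is_valid(name: str) -> bool:
    """Accept a name only if every character is plain ASCII."""
    return ASCII_ONLY.fullmatch(name) is not None


def permissive_is_valid(name: str) -> bool:
    """Accept letters and combining marks from any script, plus name punctuation."""
    name = unicodedata.normalize("NFC", name.strip())
    if not name:
        return False
    return all(
        ch in NAME_PUNCTUATION or unicodedata.category(ch)[0] in ("L", "M")
        for ch in name
    )


for candidate in ["Stüvel", "O'Brien", "Smith"]:
    print(candidate, naive_is_valid(candidate), permissive_is_valid(candidate))
```

The permissive version is still only a sketch: real names can contain characters outside these categories, so any fixed allow-list risks excluding someone.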

Zero-COVID: null with a difference

In Chinese, it is called "qīng líng 清零" (lit., "clear zero").  Because the concept never made sense to me as a practical means for coping with the pandemic coronavirus called SARS-CoV-2, I wrote a post trying to understand what the Chinese authorities mean by it:  see "Dynamic zero" (5/19/22).  In that post, I discussed the problem from many different angles, including:

  1. "zero moment point" in robotics
  2. "zero-sum game" in mathematics
  3. "zero dynamics" in mathematics

If "Zero-COVID" genuinely interests / concerns you, I recommend that you spend some time on the "Dynamic zero" post.  Here I will cite only this brief passage from it:

…before it was rushed into use for the current "zero [Covid control]" policy, "qīng líng 清零" started out in literary texts as an adjective implying "lonely; lonesome; solitary; desolate".  More recently, it was employed in computing as a verb denoting "to reset; to clear the memory".  From there, it was adapted by Chinese epidemiologists in the sense of "to reduce to zero; to zero out".  That may be their goal, but it is not happening, despite their fiercest efforts at FTTIS ("Find, Test, Trace, Isolate and Support").

Not to mention mass prescription of mRNA and other medicines, plus masks.

Information Management and Library Science

Just out today, this is one of the longest book reviews I have ever written:

Jack W. Chen, Anatoly Detwyler, Xiao Liu, Christopher M. B. Nugent, and Bruce Rusk, eds., Literary Information in China:  A History (New York:  Columbia University Press, 2021).

Reviewed by Victor H. Mair

MCLC Resource Center Publication (Copyright September, 2022)

I am calling it to your attention because the book under review, which I will refer to here as LIIC, signals a sea change in:

1. Sinology
2. Information technology
3. Academic attitudes toward the study of language and literature

Biblical Hebrew Computing

Three years ago, we visited a proposal for "Classical Chinese computing" (12/19/19).  The post began thus:

Several colleagues called this article to my attention:

"Programming Language for the ancient Chinese"

Here's the introduction:

文言, or wenyan, is an esoteric programming language that closely follows the grammar and tone of classical Chinese literature. Moreover, the alphabet of wenyan contains only traditional Chinese characters and 「」 quotes, so it is guaranteed to be readable by ancient Chinese people. You too can try it out on the online editor, download a compiler, or view the source code.

The home page then goes through "Syntax", "Compilation", and "Get" (Source Code; Online Editor; Reference).
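
The alphabet claim in the introduction quoted above can be approximated with a small character check. This is only a sketch under stated assumptions: the function name is hypothetical, "Chinese character" is taken to mean a code point in the basic CJK Unified Ideographs block, and no attempt is made to tell traditional forms from simplified ones. It is not part of the wenyan project.

```python
# Assumption for this sketch: "Chinese character" means a code point in the
# CJK Unified Ideographs block (U+4E00 to U+9FFF). This does not distinguish
# traditional from simplified forms, and it ignores the extension blocks.
CORNER_QUOTES = {"「", "」"}


def uses_only_hanzi_and_corner_quotes(source: str) -> bool:
    """Return True if every non-whitespace character is a CJK ideograph or 「」."""
    return all(
        ch in CORNER_QUOTES or "\u4e00" <= ch <= "\u9fff"
        for ch in source
        if not ch.isspace()
    )


print(uses_only_hanzi_and_corner_quotes("文言「甲」"))
print(uses_only_hanzi_and_corner_quotes("print('hello')"))
```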

Infinitely malleable electronic brain — software and hardware

When I was a little boy, among the gifts from my parents that I treasured most were science kits that allowed me to construct my own instrumentation and use it for various experiments and observations, e.g., microscopes, radios and other electronic circuitry, chemistry sets, ingenious language games, and so on.  (This was in the late 40s and 50s in rural Stark County, northeast Ohio, mind you, when I was between the ages of about 5 and 15.)  But my favorite of all was a box full of materials for computer construction.  It consisted of a peg board, switches, wires, screws, small nuts and bolts, metal bands and clips, batteries, little light bulbs, etc.  Please remember that this was long before personal computers were invented.

Electronic brain

On Facebook, this conversation thread followed from a post by Bill Benzon, commenting on his recent blog post, "Once more around the merry-go-round: Is the brain a computer?"

Language is not script and script is not language, part 2

[This is a guest post by Paul Shore.]

    The 2022 book Kingdom of Characters by Yale professor Jing Tsu is currently #51,777 in Amazon's sales ranking.  (The label "Best Seller" on the Amazon search-results listing for it incorporates the amusing mouseover qualification "in [the subject of] Unicode Encoding Standard".)  I haven't read the book yet:  the Arlington, Virginia library system's four copies have a wait list, and so I have a used copy coming to me in the mail.  What I have experienced, though, is a fifty-minute National Public Radio program from their podcast / broadcast series Throughline, entitled "The Characters That Built China", that's a partial summary of the material in the book, a summary that was made with major cooperation from Jing Tsu herself, with numerous recorded remarks by her alternating with remarks by the two hosts:  https://www.npr.org/podcasts/510333/throughline (scroll down to the May 26th episode).  Based on what's conveyed in this podcast / broadcast episode, I think many people on Language Log and elsewhere who care about fostering a proper understanding of human language among the general public might agree that that ranking of 51,777 is still several million too high.  But while the influence of the book's ill-informed, misleading statements about language was until a few days ago mostly confined to those individuals who'd taken the trouble to get hold of a copy of the book or had taken the trouble to listen to the Throughline episode as a podcast (it was presumably released as such on its official date of May 26th), with the recent broadcasting of the episode on NPR proper those nocive ideas have now been splashed out over the national airwaves.  And since NPR listeners typically have their ears "open like a greedy shark, to catch the tunings of a voice [supposedly] divine" (Keats), this program seems likely to inflict an unusually high amount of damage on public knowledge of linguistics. 

Robot philosopher-calligrapher

I was aware of this article more than four years ago when it first appeared, but didn't post on it then because I didn't think many people would be interested in it:

"Forget Marx and Mao. Chinese City Honors Once-Banned Confucian", Ian Johnson, NYT (10/18/17)


(Credit: Lam Yik Fei for The New York Times)

Now that we're on a Chinese calligraphy and philosophy roll and have a number of robot calligraphy posts under our belt (see "Selected readings" below), writing a post about a robotic philosopher-calligrapher is not so outlandish after all.

Is Korean diverging into two languages?, part 2

To make sense of the story that follows, one must understand that the Korean word "agassi 아가씨" used to refer to a young lady from the upper class, but now in North Korea means “slave of feudal society” and has a very negative connotation there.

"Hidden meaning of Korean term 'agassi' leads to murder", by Choi Jae-hee, The Korea Herald (5/3/22)

Because the linguistic psychology that lies behind the tragic crime recounted in this article is intricate and subtle, it is necessary to recount it at some length:

An error in a mobile translation application recently prompted a 35-year-old Chinese man in Jeongeup, North Jeolla Province, to murder a Korean resident.

Character amnesia yet again: game (almost) over

Last week, I witnessed a palpable, powerful, poignant demonstration of tíbǐwàngzì 提筆忘字 ("forgetting how to write sinographs; character amnesia").  This happened in a colloquium where, during the discussion period, someone mentioned the standard eight-volume Historical Atlas of China (1982-1988) edited by the renowned geographer Tan Qixiang (1911-1992).  A member of the gathering requested that the name be written on the whiteboard in sinographs.  A colleague — a tenured professor of medieval Chinese history — popped up and said they could write the name in characters.

Already a little bit wobbly on the semantophore / radical on the left side of the first character (the surname), with a little bit of kibitzing from colleagues, the volunteer managed to produce the requisite semantophore after several false starts and erasures.  After that great achievement (producing the semantophore amid much embarrassment), they turned to the phonophore on the right side but were getting nowhere fast, even with suggestions from colleagues who were looking on.

Finally, someone looked up the name on their phone and presto digito*, the correct writing emerged:  譚其驤 / 谭其骧 (the group — scholars all — collectively preferred the traditional form over the simplified one).

-----

[*VHM:  I remember hearing this expression when I was young, but it barely exists on the internet, and I can't find it in dictionaries either.]

Why is Facebook's Chinese translation still so terrible?

[This is a guest post by Jenny Chu]

Has Language Log been following up on the great sorrow that is Facebook's (Chinese) translation feature? The last reference I found was this one.

It came up today when I was reading this somewhat viral post on Facebook.

I switched on the auto-translate option to help me understand. The results were not just astonishingly bad, but had a surprisingly medical bent.

 
今天這個主權政府作承諾的時候大辭炎炎,七情上面,結果又是如何? --> "Today, when the private government is working, the weather is colon inflammation, above the sentiment, what is the result?"

The paradox of hard and easy

If you're interested in one-way functions and Kolmogorov complexity, you'll probably want to read this mind-crunching article:

"Researchers Identify ‘Master Problem’ Underlying All Cryptography", by Erica Klarreich, Quanta Magazine (April 6, 2022)

The existence of secure cryptography depends on one of the oldest questions in computational complexity.

To ease our way, here are brief descriptions of the two key terms:

In computer science, a one-way function is a function that is easy to compute on every input, but hard to invert given the image of a random input. Here, "easy" and "hard" are to be understood in the sense of computational complexity theory, specifically the theory of polynomial time problems. Not being one-to-one is not considered sufficient for a function to be called one-way….

(source)
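
The asymmetry in that definition can be hinted at with a toy example. The sketch below uses modular exponentiation, which is widely conjectured (not proven) to be hard to invert at cryptographic sizes; no function is currently known for certain to be one-way. The modulus, base, and function names are illustrative assumptions, and the numbers are far too small for real use. Since "easy" and "hard" are asymptotic notions, a fixed-size demo can only suggest the gap, not establish it.

```python
# Illustrative parameters (assumptions for this sketch, far too small for real use).
P = 1_000_003  # modulus
G = 2          # base


def forward(x: int) -> int:
    """Easy direction: modular exponentiation, fast even for huge exponents."""
    return pow(G, x, P)


def invert_by_search(y: int) -> int:
    """Brute-force inversion: step through exponents until one maps to y."""
    value = 1  # G**0 mod P
    for x in range(P - 1):
        if value == y:
            return x
        value = (value * G) % P
    raise ValueError("no exponent found")


secret = 123_456
image = forward(secret)             # a single fast call
recovered = invert_by_search(image)  # up to about a million multiplications
print(forward(recovered) == image)
```

The search returns the smallest exponent that maps to the given image, which need not equal the original input; inverting a one-way function only requires finding some valid preimage.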

In algorithmic information theory (a subfield of computer science and mathematics), the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program (in a predetermined programming language) that produces the object as output. It is a measure of the computational resources needed to specify the object, and is also known as algorithmic complexity, Solomonoff–Kolmogorov–Chaitin complexity, program-size complexity, descriptive complexity, or algorithmic entropy. It is named after Andrey Kolmogorov, who first published on the subject in 1963.

(source)
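
Kolmogorov complexity itself is not computable, but the size of a compressed file gives a rough upper-bound proxy for it. The sketch below uses Python's `zlib` as a stand-in for "a short program that produces the object"; the helper name and the sample data are assumptions chosen for illustration, and a general-purpose compressor can badly overestimate the true complexity.

```python
import os
import zlib


def compressed_length(data: bytes) -> int:
    """Length in bytes of the zlib-compressed form of data (maximum compression)."""
    return len(zlib.compress(data, 9))


regular = b"ab" * 5000            # 10,000 bytes with an obvious short description
random_like = os.urandom(10_000)  # 10,000 bytes that almost certainly have none

print(compressed_length(regular))
print(compressed_length(random_like))
```

The repetitive input shrinks to a small fraction of its length, while the random bytes barely compress at all, which mirrors the intuition that a regular object has a short description and a random one does not.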

The weirdness of typing errors

In this age of typing on computers and other digital devices, when we daily input thousands upon thousands of words, we are often amazed at the number and types of mistakes we make.  Many of them are simple and straightforward, as when our fingers stumblingly hit the wrong keys by sheer accident.  People who type on phones often warn their correspondents that their messages are likely to contain such errors by including a notice like this at the bottom:

Please forgive spelling / grammatical errors; typed on glass // sent from my phone.
