Archive for March, 2022

Pinyin in subtitles


What3words again

A friend's note:

https://what3words.com/

is an app that assigns a three-word combination to every 3-meter square in the world.

My dad's living room is at acid.tribe.dwell …. ;-)
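As a rough illustration of how three words can address a very large number of squares, here is a toy sketch. It is not the what3words algorithm, which is not described in the note above; the word list, the function names, and the idea of numbering grid cells with a single integer are all assumptions made for the example. The only point is that a list of n words yields n × n × n distinct triples, so a hypothetical list of 40,000 words would give 64 trillion of them.

```python
# Toy illustration only: NOT the what3words algorithm.
# The word list and the integer cell numbering are hypothetical.

def index_to_triple(index, words):
    """Map a cell number to a dotted three-word address (base-n digits)."""
    n = len(words)
    if not 0 <= index < n ** 3:
        raise ValueError("index out of range for this word list")
    first, rest = divmod(index, n * n)
    second, third = divmod(rest, n)
    return ".".join((words[first], words[second], words[third]))


def triple_to_index(triple, words):
    """Inverse of index_to_triple: recover the cell number from the address."""
    n = len(words)
    positions = {word: i for i, word in enumerate(words)}
    first, second, third = (positions[w] for w in triple.split("."))
    return (first * n + second) * n + third


# A hypothetical four-word list gives 4 ** 3 = 64 possible addresses.
WORDS = ["acid", "tribe", "dwell", "tea"]

print(index_to_triple(6, WORDS))                   # acid.tribe.dwell
print(triple_to_index("acid.tribe.dwell", WORDS))  # 6
```

A real system would presumably also scramble the assignment so that neighboring squares get unrelated words, but that is a guess and is not modeled here.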


Sentence length and syntactic complexity

[This is a guest post by Don Keyser, in response to "Trends" (3/27/22).]

I do hope Sir Walter Scott is part of the study, as an outlier perhaps.  I still have nightmares going back to English class in an era when one still was obliged to diagram the sentences to establish to the satisfaction of the teacher that one truly and fully grasped the structure and meaning.  Sir Walter Scott's Ivanhoe was the acid test.  I'm not sure blackboards of the era were sufficiently large, or chalk sufficiently sturdy, to get through the diagram of a single sentence in Ivanhoe and other works.

I just checked online and found that there are free versions of Ivanhoe in ebook and .pdf format.

Some examples of all too typical sentences from that work:

On the other hand, such and so multiplied were the means of vexation and oppression possessed by the great Barons, that they never wanted the pretext, and seldom the will, to harass and pursue, even to the very edge of destruction, any of their less powerful neighbours, who attempted to separate themselves from their authority, and to trust for their protection, during the dangers of the times, to their own inoffensive conduct, and to the laws of the land.


Metered spellings


Old Ukrainian windmills and Old Sinitic reconstructions

VHM somewhere in Ukraine, probably late summer 2002:


Trends

About six weeks from now, I'm scheduled to give a (virtual) talk with the (provisional) title "Historical trends in English sentence length and syntactic complexity". The (provisional) abstract:

It's easy to perceive clear historical trends in the length of sentences and the depth of clausal embedding in published English text. And those perceptions can easily be verified quantitatively. Or can they? Perhaps the title should be "Historical trends in English punctuation practices", or "Historical trends in English conjunctions and discourse markers." The answer depends on several prior questions: What is a sentence? What is the boundary between syntactic structure and discourse structure? How is message structure encoded in speech (spontaneous or rehearsed) versus in text? This presentation will survey the issues, look at some data, and suggest some answers — or at least some fruitful directions for future work.
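To make the "What is a sentence?" question concrete, here is a minimal sketch. It is not the analysis planned for the talk; the function name, the sample text, and the two sets of terminators are assumptions chosen for illustration. It shows how the measured mean sentence length of the same text changes when semicolons and colons are counted as sentence boundaries, which is one way a trend in punctuation practice could look like a trend in sentence length.

```python
import re


def sentence_lengths(text, terminators=".!?"):
    """Split text on runs of the given terminator characters and
    return the number of whitespace-separated words in each piece."""
    pattern = "[" + re.escape(terminators) + "]+"
    chunks = re.split(pattern, text)
    return [len(chunk.split()) for chunk in chunks if chunk.strip()]


def mean(values):
    return sum(values) / len(values)


# Hypothetical sample text, not drawn from any corpus.
sample = "It rained; we stayed in. The talk went well: nobody left early."

strict = sentence_lengths(sample)                     # boundaries: . ! ?
loose = sentence_lengths(sample, terminators=".!?;:")  # also ; and :

print(strict, mean(strict))  # [5, 7] 6.0
print(loose, mean(loose))    # [2, 3, 4, 3] 3.0
```

The words are identical in both runs; only the definition of a boundary differs, and the mean halves.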

So I've started the "look at some data" part, so far mostly by extending some of the many relevant earlier LLOG Breakfast Experiment™ explorations, such as "Inaugural embedding", 9/9/2005, or "Real trends in word and sentence length", 10/31/2011, or "More Flesch-Kincaid grade-level nonsense", 10/23/2015.

In most cases, the extensions just provide more data to support the ideas in the earlier posts. But sometimes, further investigation turns up some twists.


Vicious smears, part 2

The CCP's favorite word for characterizing opinions with which they disagree seems to be "smear", which I wrote about here:  "Vicious smears" (9/10/20).

Recently, for whatever reason, a plentiful new crop of "smearisms" has appeared in official Chinese media; for examples, see here, here, here, here, and here (all from Global Times, the CCP's major ideological mouthpiece, whose Chinese and English versions have since 2009 been under the editorship of the formidable firebrand, Hu Xijin; in recent months Hu has repeatedly said that he would be stepping down as editor-in-chief of GT, but, judging from his still frequent interventions, he evidently continues to wield enormous power in the propaganda apparatus).


From Rusyn / Ruthenian and Ukrainian, and on to Russian

[This is a guest post by Don Keyser, responding to Grant Newsham's "Rusyn" (3/22/22)]

This one brought back memories.

In 1959, my high school in Towson, just to the north of Baltimore, rose to the challenge posed by Sputnik and launched a Russian-language program. I had studied Latin for three years, and when invited to "enlist" (as a patriotic duty) in study of the enemy's language, I was delighted to abandon Latin … for my country, and otherwise. So I took two years of Russian in high school, and went on to study Russian language and Russian/Soviet area studies through undergrad and M.A. work. I only "defected" to Chinese/Japanese in PhD studies and thereafter in the U.S. government.

Anyway … my very first Russian language teacher was named Josef Glus. He had been teaching wood shop*, of all things, to kids not expected to go on to university. But he spoke Russian, and was tapped to teach the maiden course in that language offered by the high school. He was Ruthenian. I had to look up Ruthenia — in the days before a few taps of the fingers on a computer yielded up a map, the history, and so on.

[*VHM: For the concept of "shop" in the high school curriculum, see "The weirdness of typing errors" (3/14/22)]


Remarkable Ukrainian-Scottish speaker

Robert Shackleton sent in a link to this BBC Ukrainecast episode from 14 March, with the comment:

Very distressing to listen to the interview, but also an interesting example of a native Slavic language speaker who has near-perfect Ayrshire speech.

The referenced interview starts at 6:20 in the BBC podcast — I've reproduced it below for convenience, and for protection against future bit rot:


The semantics, grammar, and pragmatics of "drink tea" in the PRC

Tea is a Very Big Thing with me.  I am intensely interested in all manifestations and transformations of this celestial ichor.  For some references, see the "Selected readings" below.

All the tea in China is on my mind this morning as a result of reading this article:

"Defying China’s Censors to Urge Beijing to Denounce Russia’s War", by Chris Buckley (March 18, 2022)

In the midst of an account of numerous individuals who had signed a petition against Russia's war on Ukraine, I came upon this sentence:

“Every single one was taken for tea,” Mr. Lu said in a telephone interview, using a common euphemism referring to being questioned by the police.


Strange tales and labiovelar transcriptions

East Asians have been addicted to strange stories for millennia.  Many of these fall under the rubric of guài 怪 ("strange"), e.g., zhìguài 志怪 ("records of anomalies"), the name of one of the earliest genres of strange stories in China.

One of the strangest aspects of East Asian strange tales is that perhaps the most famous collection of all was written by a Westerner, Lafcadio Hearn (1850-1904).


Tortured phrases: Degrading the flag to clamor proportion

Guillaume Cabanac, Cyril Labbé & Alexander Magazinov, "'Bosom peril' is not 'breast cancer': How weird computer-generated phrases help researchers find scientific publishing fraud", Bulletin of the Atomic Scientists, 1/13/2022:

In 2020, despite the COVID pandemic, scientists authored 6 million peer-reviewed publications, a 10 percent increase compared to 2019. At first glance this big number seems like a good thing, a positive indicator of science advancing and knowledge spreading. Among these millions of papers, however, are thousands of fabricated articles, many from academics who feel compelled by a publish-or-perish mentality to produce, even if it means cheating. […]

We have been able to spot fraudulent research thanks in large part to one key tell that an article has been artificially manipulated: The nonsensical “tortured phrases” that fraudsters use in place of standard terms to avoid anti-plagiarism software. Our computer system, which we named the Problematic Paper Screener, searches through published science and seeks out tortured phrases in order to find suspect work. While this method works, as AI technology improves, spotting these fakes will likely become harder, raising the risk that more fake science makes it into journals.

As of January 2022, we’ve found tortured phrases in 3,191 peer-reviewed articles published (and counting), including in reputable flagship publications.
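The basic idea of searching text for tortured phrases can be sketched in a few lines. This is not the Problematic Paper Screener itself, whose implementation is not described in the excerpt; the lookup table, the function name, and the matching rules are assumptions for illustration. The "bosom peril" pairing comes from the article's title, and the "flag to clamor" pairing with "signal to noise" is my reading of the phrase in this post's title.

```python
import re

# Hypothetical two-entry table: tortured phrase -> expected standard term.
TORTURED = {
    "bosom peril": "breast cancer",
    "flag to clamor": "signal to noise",
}


def find_tortured_phrases(text, table=TORTURED):
    """Return (offset, matched text, standard term) for each tortured
    phrase found, ignoring case and allowing hyphens or any whitespace
    between the words of a phrase."""
    hits = []
    for phrase, standard in table.items():
        words = (re.escape(word) for word in phrase.split())
        pattern = r"\b" + r"[\s-]+".join(words) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.start(), match.group(0), standard))
    return sorted(hits)


sample = "Degrading the flag to clamor proportion in bosom peril screening."
for offset, found, standard in find_tortured_phrases(sample):
    print(f"{found!r} at {offset}: probably {standard!r}")
```

A fixed list like this only catches phrases someone has already noticed, which fits the authors' worry that better paraphrasing tools will make such fakes harder to spot.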

See also (by the same authors) "Tortured phrases: A dubious writing style emerging in science. Evidence of critical issues affecting established journals", arXiv.org 7/12/2021.


Rusyn

[This is a guest post by Grant Newsham]

My mother was Rusyn. (Carpatho-Rusyn, Ruthenian, Lemko [in Poland]).  Originating in a small village, Volica, up in today's northeast Slovakia — though she grew up in coal country near Pittsburgh.  Her first language was Rusyn — but I don't think she really knew exactly what language it was until much later in life.  They had no real sense of nationhood.  She said she spoke 'Russian' — but referred to it as just 'Kitchen Russian' — or some inferior form of Russian.  I think it did kind of bother her – thinking that she was a hillbilly of sorts and speaking uneducated Russian.

However, the language is basically Ukrainian (with some differences), so close that the Ukrainians don't regard it, or the Rusyns, as distinct entities.  After the communists were overthrown, the Slovak government allowed Rusyn nationality (and has set up some Rusyn-language schools [a cousin teaches at one]) and you'll see signs in Rusyn, but the Ukrainians still do not.  My grandfather was very clear that they were not Ukrainians.
