Archive for December, 2019

Living fossils: Taiwan tea and salmon

Two articles in Chinese (here and here) recently brought news of an indigenous type of tea and referred to it as a rare type of salmon.  Trying to figure that out led to two linguistic puzzles:

1. Making sense of the unusual name for the salmon:  yīnghuā gōu wěn guī 櫻花鉤吻鮭 (lit., "cherry-hook-kiss / mouth-salmon"; i.e., the Formosan landlocked salmon).

2. Understanding how, even metaphorically, a kind of tea would be referred to as a type of salmon.

Read the rest of this entry »

Comments (4)

Throes?

"Dave Barry's Year in Review 2019"

… which begins with the federal government once again in the throes (whatever a “throe” is) of a partial shutdown, which threatens to seriously disrupt the lives of all Americans who receive paychecks from the federal government. 

Consulting the OED on throe (entry updated 2017), we learn that its orthographic history is interesting:

Of uncertain origin. Perhaps a variant or alteration of another lexical item. […]
The range of forms attested for this word is difficult to account for. […]
The current standard spelling throe […] is a 16th-cent. alteration of throw, throwe […] (compare with similar alteration the current forms of roe (earlier row, rowe), hoe (earlier how, howe), etc.), perhaps motivated by a desire to differentiate this word from throw.

Read the rest of this entry »

Comments (9)

New Years party themes

Today's xkcd:

The mouseover title: "'Off-by-one errors' isn't the easiest theme to build a party around, but I've seen worse."

Read the rest of this entry »

Comments (10)

An 8th-century Chinese epitaph written by a Japanese courtier

Here's news of a remarkable discovery:

"Ancient Chinese epitaph penned by Japanese found in China", THE ASAHI SHIMBUN (December 26, 2019 at 19:00 JST).

The article includes a photograph of a rubbing of the last line of the epitaph with the following kanji:

日本國朝臣備書

I can read that easily as Sino-Japanese "Nihonkoku chōshin Bi sho", which would mean "written by the Japanese courtier [Ki]bi".  The article says that the last line of the epitaph reads "Nihonkoku Ason Bi Sho", so it would appear that I am reading "朝臣" incorrectly as "chōshin" instead of as "ason".

Read the rest of this entry »

Comments (5)

Meanest pun of the year

From "Who's Bill This Time", Wait Wait…Don't Tell Me! 12/21/2019:

Peter Sagal: Mayor- Mayor Pete has been getting some heat.
I don't know if you saw this.
He attended a big fundraiser in Napa
at a winery with a, quote, "wine cave."
And everybody was so mad that he did this.
But why would you be mad about a wine cave?
It celebrates the two things Democrats are known for, whining and caving.

 

Comments (5)

Sweethoney dessert

Maidhc Mac Roibin sent in this photograph of the front of a dessert shop in Cupertino from Fintano's flickr site:

201908-PSP-R4-33 Sweethoney Dessert, SJ CA

Read the rest of this entry »

Comments (11)

Standardized Project Gutenberg Corpus

Martin Gerlach and Francesc Font-Clos, "A standardized Project Gutenberg corpus for statistical analysis of natural language and quantitative linguistics", arXiv 12/19/2018:

The use of Project Gutenberg (PG) as a text corpus has been extremely popular in statistical analysis of language for more than 25 years. However, in contrast to other major linguistic datasets of similar importance, no consensual full version of PG exists to date. In fact, most PG studies so far either consider only a small number of manually selected books, leading to potential biased subsets, or employ vastly different pre-processing strategies (often specified in insufficient details), raising concerns regarding the reproducibility of published results. In order to address these shortcomings, here we present the Standardized Project Gutenberg Corpus (SPGC), an open science approach to a curated version of the complete PG data containing more than 50,000 books and more than 3×10⁹ word-tokens. Using different sources of annotated metadata, we not only provide a broad characterization of the content of PG, but also show different examples highlighting the potential of SPGC for investigating language variability across time, subjects, and authors. We publish our methodology in detail, the code to download and process the data, as well as the obtained corpus itself on 3 different levels of granularity (raw text, timeseries of word tokens, and counts of words). In this way, we provide a reproducible, pre-processed, full-size version of Project Gutenberg as a new scientific resource for corpus linguistics, natural language processing, and information retrieval.

Read the rest of this entry »

Comments (1)

Robot calligraphy

People's Daily video posted on illegal Twitter:

 

Read the rest of this entry »

Comments (17)

Beneath modern Melbourne lie(s) clues

Bob Ladd sent in a screenshot from the Guardian, with the message:

I think this suggests that, except with auxiliary verbs, subject-verb inversion is not really something that is fully a part of English speakers' competence any more. The agreement discrepancy of "clues" and "lies" would be instantly detectable in most other contexts, but not when it's required by residual English verb-second constraints.

He notes that the screenshot came "from first thing this morning UTC, but it was still up and uncorrected at mid-afternoon UTC". And he suggests that things would be very different with a copula or auxiliary verb, e.g. "Beneath modern Melbourne is two of the richest hoards of pirate gold ever found".

Read the rest of this entry »

Comments (8)

The semiotics of an East Asian hand gesture

These days people make all sorts of public and private hand gestures to convey a wide variety of information.  Innocent though they may seem, for various reasons many of them become controversial (e.g., the sign for "OK", which has recently been classified by some organizations as a symbol of hate).

These students at a university in China are making the sign of bǐxīn 比心.  According to their professor, it means "love you" or "give you my heart".

Read the rest of this entry »

Comments (7)

So

When I was skimming the transcript of the 12/19 Democratic presidential debate for "Warren vocal stereotypes", I noticed that several of the candidates started some of their answers to questions with "so". Among the dozen examples:

WOODRUFF: Senator Warren, why do you think — why do you think more Americans don't agree that this is the right thing to do? And what more can you say?
WARREN: So I see this as a constitutional moment.

WOODRUFF: Brief answers — brief responses from Mr. Steyer and Mr. Buttigieg.
STEYER: So let me say that I agree with Senator Warren in much of what she says.

WOODRUFF: Welcome back to the PBS NewsHour Democratic debate with Politico. And now it's time for closing statements. Each have 60 seconds, beginning with Mr. Steyer. […] Mayor Buttigieg?
BUTTIGIEG: So the nominee is going to have to do two things: defeat Donald Trump and unite the country as president.

Read the rest of this entry »

Comments (27)

Semiotic lesson of the week


Read the rest of this entry »

Comments (5)

Badge of honor: Language Log is blocked in China

Two days ago, I received this message from a colleague in China:

Not sure if this should be a badge of honor or a disappointment, but a few days ago Language Log got blocked in China.  (Source — GreatFire.org:  Language Log is 100% censored)

This caps off a miserable year where we also lost Wikipedia (all languages), The Guardian, Al Jazeera, Hackernews, Imgur….

[VHM:  Of course, Google, Facebook, Twitter, YouTube, and many other invaluable websites had already been off-limits to Chinese citizens for years.  The internet in China is severely decimated by the CCP government.]

Read the rest of this entry »

Comments (6)