Chinese characters formed from letters of the alphabet

Tim Cousins sent in this photograph of a sign in a local mall in Dalian, northeast China.


Read the rest of this entry »

Comments (15)


Geoffrey Leech, 1936-2014

Geoffrey Leech, one of the giants of corpus-based computational linguistics, passed away yesterday. With the death of Chuck Fillmore in February, the field has lost two of its pillars this year.

Read the rest of this entry »

Comments (4)


Lorem China

Brian Krebs, "Lorem Ipsum: Of Good & Evil, Google & China", Krebs on Security 8/14/2014:

Imagine discovering a secret language spoken only online by a knowledgeable and learned few. Over a period of weeks, as you begin to tease out the meaning of this curious tongue and ponder its purpose, the language appears to shift in subtle but fantastic ways, remaking itself daily before your eyes. And just when you are poised to share your findings with the rest of the world, the entire thing vanishes.

Read the rest of this entry »

Comments (31)


Cantonese and Mandarin interwoven

Tom Mazanec noticed this ad for China Mobile by the baggage claim at the Guangzhou (Canton) Baiyun Airport a few nights ago:


Read the rest of this entry »

Comments (6)


ER and ERM in the spoken BNC

From John Coleman:

Inspired by your recent Language Log pieces, I tried an analysis of "er" vs "erm" in the Spoken BNC. These are the two main transcriptions for filled pauses labelled as "UNC" in the CLAWS-5 tagset and also "UNC" in the richer set of POS labels used in BNC. I.e. they are distinguished from items labelled as ITJ / INTERJ, in which the few tokens of "uh" and "um" are classified. These "uh"s are almost all in "uh huh" meaning "yes", and many of the "um"s and "mm"s are also in contexts where the "yes" sense is clear. So I disregarded the ITJs and restricted the analysis to UNC "er" and "erm", which are far more numerous in any case. As these are mostly nonrhotic dialects one can interpret "erm" as just schwa + nasality, with no implication of rhoticity; ditto for "er".
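
The restriction described above (keep only "er" and "erm" tagged UNC, ignore anything tagged ITJ) can be sketched roughly as follows. This is only an illustration: the function name, the (word, tag) pair format, and the sample tokens are assumptions made up for the example, not the actual BNC file format or John Coleman's own code.

```python
from collections import Counter


def count_filled_pauses(tagged_tokens, forms=("er", "erm"), tag="UNC"):
    """Count tokens whose lowercased form is in `forms` and whose tag is `tag`.

    `tagged_tokens` is an iterable of (word, pos_tag) pairs; this input
    format is a hypothetical simplification of a tagged corpus.
    """
    counts = Counter()
    for word, pos in tagged_tokens:
        form = word.lower()
        if pos == tag and form in forms:
            counts[form] += 1
    return counts


# Invented toy data, not real BNC material.
sample = [
    ("er", "UNC"), ("I", "PNP"), ("erm", "UNC"), ("think", "VVB"),
    ("um", "ITJ"), ("uh", "ITJ"), ("huh", "ITJ"), ("Erm", "UNC"),
]

counts = count_filled_pauses(sample)
print(counts["er"], counts["erm"])
```

The ITJ items are skipped by the tag test, so "um" and "uh" never enter the counts, which mirrors the decision to disregard the ITJs.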

Read the rest of this entry »

Comments (25)


Biscriptal juxtaposition in Chinese

We have often seen how the Roman alphabet is creeping into Chinese writing, both for expressing English words and morphemes that have been borrowed into Chinese and, increasingly, for writing Mandarin and other varieties of Chinese in Pinyin (spelling).  Here are just a few earlier Language Log posts dealing with this phenomenon:

"A New Morpheme in Mandarin" (4/26/11)

"Zhao C: a Man Who Lost His Name" (2/27/09)

"Creeping Romanization in Chinese" (8/30/12)

Now an even more intricate application of alphabetic usage is developing in internet writing, namely, the juxtaposition and intertwining of simultaneous phrases with contrasting meaning.  Here are a couple of examples:

Read the rest of this entry »

Comments (5)


Filled pauses in Glasgow

In previous posts about filled pauses, we've seen a consistent and large sex difference: women use (what's transcribed as) "um" somewhat more than men do, and men use (what's transcribed as) "uh" a lot more than women do.  This pattern has been found in two large conversational telephone speech corpora involving a mix of ages and American regions, in a collection of undergraduate speed-dating transcripts, in a collection of undergraduate "tell me about your weekend" interviews, and in a collection of several hundred sociolinguistic interviews collected over a period of four decades in Philadelphia.

There are apparently also effects of age, region, time period, years of education, autism diagnosis, and so on. Today I'll add one more geographical data point – young adults from the Glasgow area – and one more variable – friends vs. strangers.

Read the rest of this entry »

Comments (16)


Reading the Quran

The following photograph appears in this BBC article: "Why is Sanskrit so controversial?"

It is accompanied by this caption: "Muslims in India choose to learn Arabic".

Read the rest of this entry »

Comments (30)


Burial Man: new hero?

Label on a display at the Nagoya City Museum:


Read the rest of this entry »

Comments (6)


Japanese English trifecta: At the ¥100 Shop

Nathan Hopson reports that he "had a delightfully giggly trip to the ¥100 Shop today."

Among the gems were these three:

1. Pair Bloom (broom), a mini-broom and dustpan set
2. Crash Cashew Nuts (crushed)
3. Q-ban, my favorite. This was actually a whole product line. The shared distinguishing feature of all is their suction cup (吸盤 or きゅうばん [kyūban]). I guess the only surprise is that they're not called Cubans.

Read the rest of this entry »

Comments


The Latinometer

From David Frauenfelder:

Here’s an item from the land of language: the "Latinometer".

Have you seen it? You enter text into the query box, it analyzes how Latinate your English vocabulary is, and then tells you whether you sound “concrete,” educated, pretentious, or mendacious. The more Latin-derived terms in your text, the more likely you are to be a liar.

Your most recent Language Log post scored 53% on the Latinometer, pretentious, and dangerously close to the “You are probably lying” zone.   I still don’t know if the author, a Latin professor, is trying to be ironic.

Somebody needs to do a LL post on this. I find it utterly ridiculous, and I’m a Latin teacher. Or maybe I find it ridiculous because I’m a Latin teacher. I wonder what a linguist would say?

Read the rest of this entry »

Comments (39)


UM / UH: Life-cycle effects vs. language change

In English-language conversations, older people tend to use UH more often and UM less often. And at every age, men tend to use UH more than women, and women tend to use UM more than men.  These effects are large and robust – they've been documented in at least five independent datasets, from both North America and Great Britain – for details, see the links at the end of this post.

The cited patterns are consistent with two quite different classes of explanation:

  • There might be a language change in progress, with older people reflecting the patterns of an earlier time and younger people showing the language of the future, while women are leading the change, as they often do.
  • There might be stable gender and life-cycle effects, so that the UM and UH sex and age associations looked the same a few decades in the past, and will look the same a few decades in the future.

And there's an independent question about the functions of the classes of vocalizations that we transcribe as UM and UH:

  • Perhaps UM and UH are simply alternative expressions of the same compositional or communicative function – say, two different (classes of) ways of stalling for time in the process of speaking – or alternatively
  • perhaps UM and UH have partly or entirely different functions, and it's differences in the frequency of these functions that are associated with age, sex, and so on.

In neither case are the alternatives mutually exclusive — the truth might be some mixture of the two.

Yesterday, Joe Fruehwald looked at UM and UH usage in a dataset with enough time depth that we can tell the difference between a change in progress and a stable life-cycle effect. And he found that the truth seems to be a bit of both.

Read the rest of this entry »

Comments (2)


Chineasy2

I was hoping that, after writing "Chineasy? Not", I wouldn't have to concern myself with this pedagogical bugaboo again.  Wishful thinking!  For reasons that escape me, the Chineasy juggernaut continues to rumble forward.

Read the rest of this entry »

Comments (16)


THE

Email yesterday from Bill Benzon:

Here's a blog post about a little bit of linguistic detail in a VERY interesting book: Matthew Jockers, Macroanalysis: Digital Methods & Literary History.

Do you have any thoughts on that detail?

The post in question is "Reading Macroanalysis 4: On the matter of 'the'", New Savanna 8/13/2014, and the "detail" in question is a cited difference in the frequency of the word the between a collection of 19th-century British novels and a comparable collection of 19th-century American novels:

Chapter 7, “Nationality” is pretty straightforward. I don’t have much to say about it except for a puzzle that Jockers presents at the beginning. He points out that, because British and American writers have different practices concerning the word the, that word is about 5 percent of the word tokens in his corpus of 19th Century British novels, while it is about 6 percent of the tokens in the American novels.
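
The percentages in that passage are relative token frequencies: occurrences of "the" divided by the total number of word tokens. A minimal sketch of the calculation is below. The regex tokenizer and the function name are assumptions for illustration only; they are not Jockers' method, and a different tokenization would shift the figures somewhat.

```python
import re


def relative_frequency(text, word="the"):
    """Return the share of word tokens in `text` that equal `word`.

    Tokens are runs of letters and apostrophes in the lowercased text,
    a deliberately crude tokenizer chosen for this sketch.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return tokens.count(word) / len(tokens)


# Invented toy sentence: 2 of its 8 tokens are "the".
toy = "The dog saw the cat near a river"
print(f"{relative_frequency(toy):.2%}")
```

Matching is on whole tokens, so words such as "there" or "other" do not count as instances of "the".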

Read the rest of this entry »

Comments (11)


Sanskrit resurgent

When I was studying Buddhism at the University of Washington (Seattle) in 1967-68, there were about ten students in my first-year Sanskrit course for Buddhologists and Indologists.  What intrigued me greatly was that there was another beginning Sanskrit course being offered at the same time.  It had many more students than the class I was in and was offered by the Linguistics Department.  The rationale for encouraging (I can't remember if it was actually required) linguistics students to take Sanskrit was that the foundations of the scientific study of language had been laid by Panini, Patanjali, and other ancient Sanskrit grammarians around two and a half millennia ago, so that it would be good to have at least a basic understanding of the roots of the tradition.

Read the rest of this entry »

Comments (33)