Archive for Language on the internets

Since the beginning of history

I have mentioned chinaSMACK before on Language Log, but have never featured it so directly as in this post.  The reason is that this time there's an interesting language aspect to one of their articles that is hard to pass up.

chinaSMACK specializes in translating trenchant, amazing stories from the vast amount of traffic that flows through China's microblogs and on the internet more generally.  Sometimes they are so bizarre and surreal that my initial reaction upon reading them — after being shocked senseless or laughing myself silly — is to dismiss them as Onionesque.  But that is usually impossible because they are so well documented.  In the present case, there is an initial news report and five stunning photographs.  Because the photographs are so gross and graphic, just downright disgusting, I won't show them directly on Language Log (especially not during the holidays), but readers can go to the link and see them with their own eyes.

"Excrement Tanker Explodes, Covering Everyone in Human Waste"  (12/28/14)


A record-setting pangrammatic window

A few months ago, I posted here (and on Slate's Lexicon Valley blog) about PangramTweets, a bot created by Jesse Sheidlower that combs Twitter for tweets that include all 26 letters of the alphabet. I mentioned that it would be interesting to see if PangramTweets turns up any particularly short "pangrammatic windows," i.e., pangrammatic strings in naturally occurring text. At the time, the shortest known example was 42 letters long, in a passage from Piers Anthony's Cube Route:

"We are all from Xanth," Cube said quickly. "Just visiting Phaze. We just want to find the dragon."

My post inspired Malcolm Rowe, a software engineer at Google, to set about finding short pangrammatic windows in an automated fashion, first on the Project Gutenberg corpus and then on the megacorpus of web pages indexed by Google. (Let's hear it for Google's 20 percent time!) On his blog, Malcolm now reports on his findings, including the discovery of a 36-letter pangrammatic window that appeared in a review of the movie Magnolia on PopMatters:

Further, fractal geometries are replicated on a human level in the production of certain “types” of subjectivity: for example, aging kid quiz show whiz Donnie Smith (William H. Macy) and up and coming kid quiz show whiz Stanley Spector (Jeremy Blackman) are connected (or, perhaps, being cloned) in ways they couldn’t possibly imagine.
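A search like this can be sketched as a sliding window over the letters of a text. The following is only an illustration, not Malcolm's actual method: the function name `shortest_pangrammatic_window` is hypothetical, and it assumes that case, spaces, and punctuation are ignored and that only the 26 letters a-z count toward the window length.

```python
import string
from collections import Counter


def shortest_pangrammatic_window(text):
    """Return the shortest run of letters that contains all 26 letters,
    or None if the text is not pangrammatic at all.

    Assumption: only a-z count; case, spaces and punctuation are ignored.
    """
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    counts = Counter()   # letter counts inside the current window
    distinct = 0         # how many different letters the window holds
    best = None          # (start, end) of the shortest window so far
    left = 0
    for right, ch in enumerate(letters):
        counts[ch] += 1
        if counts[ch] == 1:
            distinct += 1
        # While the window is pangrammatic, record it and shrink from the left.
        while distinct == 26:
            if best is None or right - left + 1 < best[1] - best[0]:
                best = (left, right + 1)
            counts[letters[left]] -= 1
            if counts[letters[left]] == 0:
                distinct -= 1
            left += 1
    return None if best is None else "".join(letters[best[0]:best[1]])
```

Run on the Cube Route passage quoted above, this should return a 42-letter window, matching the count given in the post. The scan is linear in the length of the text, which matters when the input is a corpus rather than a sentence.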


Emotional contagion

As usual, xkcd nails it:

Mouseover title: "I mean, it's not like we could just demand to see the code that's governing our lives. What right do we have to poke around in Facebook's private affairs like that?"


Is the Urdu script on the verge of dying?

Hindi-Urdu, also referred to as Hindustani, is the classic case of digraphia, so much so that there has been a long-standing controversy over whether Hindi and Urdu are one language or two.  Their colloquial spoken forms are nearly identical, but when written down, the one in the Devanāgarī script, the other in the Nastaʿlīq script, they have a very different look and "feel".


Banned in Beijing

Everyone knows that the Chinese government goes to extraordinary lengths to police the internet (see: "Blocked on Weibo").

And most sentient beings are aware of the awesome fame of the Grass-Mud Horse, the notorious Franco-Croatian Squid, and the mysterious River Crab.  You can find all of them in "Grass-Mud Horse Lexicon Classics".

Sometimes, the censors begin to look pretty ridiculous, as when they outlawed the word "jasmine" in 2011, particularly since it refers not just to the Jasmine Revolution, but also to a favorite flower, tea, and folk song.

mòlì 茉莉 ("jasmine")

mòlì chá 茉莉茶 ("jasmine tea") OR mòlìhuā chá 茉莉花茶 ("jasmine tea") OR xiāngpiàn 香片 ("scented [usually with jasmine] tea")

mòlìhuā 茉莉花 ("jasmine flower", name of a popular folk song; presidents Jiang Zemin and Hu Jintao were both excessively fond of this song, and there are videos of them singing it, so it becomes especially awkward to try to forbid citizens to use the word mòlì 茉莉 ["jasmine"])


Bonfire beneficiaries

Subeditor Humphrey Evans points out to me that the grammar of phishing spam emails is getting worse and worse, rather than better. He recently saw one that contained this text:

The sum of (6.5M Euros only will be transfer into your account after the processing of all relevant legal documents with your name as the bonfire beneficiary, the transfer will be made by Draft or telegraphic Transfer (T/T), conformable in 3 working days as soon as you apply to the bank director.

That "bonfire beneficiary" bit is an eyebrow-raiser, isn't it? It seems to be an error for the Latin phrase bona fide "good faith".


PangramTweets

The Twitter API, beyond its great utility for corpus linguistics (see "On the front lines of Twitter linguistics," "The he's and she's of Twitter"), has made possible a lot of fun automated text-mining projects. One fertile area is algorithmic found poetry: there have been Twitter bots designed to find accidental haikus, and even more impressively, a bot named @Pentametron that finds rhyming tweets in iambic pentameter and fashions sonnets out of them.

And then there is found wordplay, which is its own kind of found poetry. I'm a big fan of @Anagramatron, which discovers paired tweets that form serendipitous anagrams of each other. (Example: "Last time I do anything" ⇔ "That's it. I'm dying alone.") Now, courtesy of Jesse Sheidlower, comes @PangramTweets, in which each tweet contains every letter of the alphabet at least once.
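The two tests these bots rely on are easy to sketch. What follows is a minimal illustration, not the actual code behind @Anagramatron or @PangramTweets: the helper names are hypothetical, and the sketch assumes that case, spaces, and punctuation are ignored and that only the letters a-z are compared.

```python
import string
from collections import Counter


def letters_only(text):
    """Lowercase the text and keep only a-z (an assumption of this sketch)."""
    return [ch for ch in text.lower() if ch in string.ascii_lowercase]


def is_pangram(text):
    """True if every letter of the alphabet appears at least once."""
    return set(letters_only(text)) == set(string.ascii_lowercase)


def are_anagrams(a, b):
    """True if both texts use exactly the same letters the same number of times."""
    return Counter(letters_only(a)) == Counter(letters_only(b))
```

For instance, `are_anagrams("Last time I do anything", "That's it. I'm dying alone.")` holds for the pair quoted above. A real bot would presumably also filter out trivial matches, such as two tweets with the same words in a different order.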


Cantonese poetry recitation

A recent issue (1/7/14) of the South China Morning Post (SCMP) carried an article by a staff reporter entitled "Hong Kong student's poem recital goes viral in the mainland". The article features this amazing video of a Hong Kong high school student reciting a couple of Classical Chinese poems:



"People mountain, people sea" and "let's play"

Stephan Stiller says that my post on "Good good study; day day up" reminds him of "people mountain, people sea" (rénshānrénhǎi 人山人海), i.e., "crowded; packed; a sea of people".  This is another fairly complex Chinglishism that has entered the vocabulary of many English speakers who know no Chinese.  It was popularized by a Hong Kong music production company that took this expression as its name, and there was also a Hong Kong film that used this expression as its title.


Tyrant's bling

Arguably the hottest term on the Chinese internet these days is tǔháo 土豪 ("[local] tyrant / despot"), but transformed to mean "bling", and with a sharply satirical edge.  How did tǔháo 土豪 ("[local] tyrant / despot") morph into "bling"?  The story is told in "#BBCtrending: Tuhao and the rise of Chinese bling".


The English language's Twitter feed

I have a piece on Fresh Air today, behind the curve as usual, on the discussion that followed Oxford Dictionaries Online's inclusion of twerk, which Ben Zimmer covered in a post a couple of weeks ago ("Getting worked up over 'twerk'"). Actually I don't care much about twerk, whose coolness and credentials Ben defended definitively. But I think it's worth looking at the whole list of new words that appeared on the ODO blog post announcing the quarterly update, headed "Buzzworthy words added to Oxford Dictionaries Online – squee!":

apols, A/W (“autumn/winter”), babymoon, balayage (“a technique for highlighting hair”), bitcoin, blondie (small cake), buzzworthy, BYOD (“bring your own device”), cake pop, chandelier earring, child’s pose (yoga), click and collect, dad dancing, dappy, derp, digital detox, double denim, emoji, fauxhawk, FIL (“father-in-law”), flatform (shoe), FOMO (“Fear Of Missing Out”), food baby (“a protruding stomach caused by eating a large quantity of food”), geek chic, girl crush, grats, guac, hackerspace, Internet of things, jorts, LDR, me time, michelada (“drink made with beer, lime juice…”), MOOC, Nordic noir, omnishambles, pear cider[see comment below], phablet, pixie cut, prep (v. “prepare”), selfie, space tourism, squee, srsly, street food, TL;DR, trolly dash (UK supermarket promotion), twerk, unlike (v.), vom (“vomit”)

I’ve bolded the ones that seem to me to have a chance of being still current by the end of the decade, including a few that have been around for quite a while. Some of this is pure guesswork (if you have inside knowledge about bitcoin, let me know) and others may scrape by, but it's a fair bet that the vast majority are not going to survive your hamster.


Grass-Mud Horse Lexicon Classics

The China Digital Times (CDT) Grass-Mud Horse Lexicon is the premier place to go for Chinese netizen language designed to avoid the censors and to poke fun at the political system.

Over the years, CDT has accumulated 273 entries in its Grass-Mud Horse Lexicon.  From these, the CDT editors have selected 71 essential items for inclusion in The Grass-Mud Horse Lexicon: Classic Netizen Language, which has just been published.

Here's the Kindle edition on Amazon.


Subversion at the spam factory?

So this is new, at least for me — the latest batch of a few thousand spam comments (adding to the pile of 5,095,703 caught so far) pretends to come from people using negatively-evaluated pseudonyms in Spanish, like caca, ladrones, or indecentes:
