Sayable but not writable

The distinguished Chinese linguist, Y. R. Chao, developed the concept of "sayable Chinese" and wrote a series of books illustrating what he meant by it. Basically, what Chao intended by "sayable Chinese" were texts that could be understood when read aloud. This may sound like a somewhat ludicrous proposition for most languages, where what is written on the page may be easily understood when read aloud slowly and clearly. For Chinese, this is not the case, especially when texts are riddled with Classical terms, sentences, and whole passages that are divorced from spoken language. But even parts of supposedly pure Mandarin texts may not be intelligible to someone who hears them read aloud, since the semantic carrying capacity of the morphosyllabic characters is greater than that of their sounds alone. That is why, when people tell others their names or when someone is giving a lecture or reading a text, auditors will frequently ask the speaker to write down the intended characters for terms that cannot be understood merely by hearing. Thus, there are many instances where things are writable in Chinese but not sayable.



The English language's Twitter feed

I have a piece on Fresh Air today, behind the curve as usual, on the discussion that followed Oxford Dictionaries Online's inclusion of twerk, which Ben Zimmer covered in a post a couple of weeks ago ("Getting worked up over 'twerk'"). Actually I don't care much about twerk, whose coolness and credentials Ben defended definitively. But I think it's worth looking at the whole list of new words that appeared on the ODO blog post announcing the quarterly update, headed "Buzzworthy words added to Oxford Dictionaries Online – squee!":

apols, A/W (“autumn/winter”), babymoon, balayage (“a technique for highlighting hair”), bitcoin, blondie (small cake), buzzworthy, BYOD (“bring your own device”), cake pop, chandelier earring, child’s pose (yoga), click and collect, dad dancing, dappy, derp, digital detox, double denim, emoji, fauxhawk, FIL (“father-in-law”), flatform (shoe), FOMO (“Fear Of Missing Out”), food baby (“a protruding stomach caused by eating a large quantity of food”), geek chic, girl crush, grats, guac, hackerspace, Internet of things, jorts, LDR, me time, michelada (“drink made with beer, lime juice…”), MOOC, Nordic noir, omnishambles, pear cider[see comment below], phablet, pixie cut, prep (v. “prepare”), selfie, space tourism, squee, srsly, street food, TL;DR, trolly dash (UK supermarket promotion), twerk, unlike (v.), vom (“vomit”)

I’ve bolded the ones that seem to me to have a chance of being still current by the end of the decade, including a few that have been around for quite a while. Some of this is pure guesswork (if you have inside knowledge about bitcoin, let me know) and others may scrape by, but it's a fair bet that the vast majority are not going to survive your hamster.



Computational linguistics and literary scholarship

Email from Dan Garrette:

I am a Computer Science PhD student at UT-Austin working with Jason Baldridge, but I've recently been collaborating with my colleague Hannah Alpert-Abrams in the Comparative Literature department here at UT.  We've been talking about the intersection of NLP and literary study and we are interested in looking at ways in which researchers can collaborate to do work that is valid scholarship in both fields.

There has been a flurry of writing recently about the relationship between the sciences and the humanities (see: Ted Underwood, Steven Pinker, Ross Douthat's response to Steven Pinker, etc.), and a particularly interesting paper at ACL (David Bamman, Brendan O’Connor, & Noah A. Smith, "Learning Latent Personas of Film Characters") that attempts to use modern NLP techniques to answer questions in literary theory. Unfortunately, much of this discussion has failed to actually understand or recognize the scholarship that is really happening in the humanities, and, instead, seems to assume that people in the sciences are able to simply walk in and provide answers for another field.

We would like to see truly interdisciplinary work that combines contemporary ideas from both fields, and we see the ACL paper as the perfect point of entry for a public conversation about this kind of work. Because Language Log attracts readers from many different disciplines, and because computational linguistics has played an important part in the developing field of 'digital humanities,' we thought it might be a good forum for this conversation.

We have written a short response to the ACL paper which we think might make an interesting Language Log post, and Jason suggested I send it to you to see if you were interested.  We'd be very interested to hear your thoughts and the thoughts of the greater Language Log readership. Perhaps it could even spark a conversation.



Ivan Sag 1949-2013

My friend and Stanford colleague Ivan Sag died on Tuesday, after three years of enduring cancer, with uncommon grace. Back in April, Stanford hosted a workshop on Structure and Evidence in Linguistics in Ivan's honor; the workshop website has not only the program, but also a set of tributes to Ivan and his 40 years in linguistics.



Reassuring parables

The most recent xkcd:

Mouseover title:

'At least humans are better at quietly amusing ourselves, oblivious to our pending obsolescence' thought the human, as a nearby Dell Inspiron contentedly displayed the same bouncing geometric shape screensaver it had been running for years.



Big ear holes

Poster from the Singapore Crime Prevention Council:



Are Sanskrit and Chinese "congenial languages"?

At an international conference on "Sinologists as Translators in the 17th-19th Centuries:  Archives and Context" organized by the Department of the Languages and Cultures of China and Inner Asia of the School of Oriental and African Studies (SOAS) and the Research Centre for Translation Studies of the Chinese University of Hong Kong (CUHK), held at SOAS from June 19-21, 2013, Wolfgang Behr (Zürich University) delivered a paper entitled "Kingsmill's Shijing Translations into Sanskrit and the Very Idea of 'Congenial Languages'".



Rot and Rot (a really, really rude sex joke)



Proportion of adjectives and adverbs: Some facts

Adam Okulicz-Kozaryn, "Cluttered writing: adjectives and adverbs in academia", Scientometrics 2013:

[H]ow do we produce readable and clean scientific writing? One of the good elements of style is to avoid adverbs and adjectives (Zinsser 2006). Adjectives and adverbs sprinkle paper with unnecessary clutter. This clutter does not convey information but distracts and has no point especially in academic writing, say, as opposed to literary prose or poetry.

If you've seen my earlier discussion of this paper ("'Clutter' in (writing about) science writing", 8/30/2013), you'll recall that Dr. O-K goes on to count adjectives and adverbs in some word lists from samples of scientific writing. He asserts that "social science" writing uses about 15% more adjectives and adverbs than "natural science" writing — although he doesn't tell us enough about his methods to dispel concerns about several likely sources of artifact — and he concludes by asking "Is there a reason that a social scientist cannot write as clearly as a natural scientist?"

In the interests of science of all kinds, I decided to devote this morning's Breakfast Experiment™ to the relations between text quality and the proportion of adjectives and adverbs. I wrote a python script using NLTK to calculate the proportions of various parts of speech in a document; and then I tried this script out on samples of various sorts of writing. Here's some of what I found.
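The script itself isn't reproduced in this excerpt. As a rough sketch of the kind of calculation described, and not the actual Breakfast Experiment code, the proportions could be computed along the following lines. The function names, the hand-tagged sample, and the choice to ignore punctuation tokens are all assumptions made for illustration; the tags follow the Penn Treebank convention, in which adjectives are JJ, JJR, JJS and adverbs are RB, RBR, RBS.

```python
from collections import Counter

# Penn Treebank tag prefixes. Grouping tags this way is an assumption
# of this sketch, not something stated in the post.
ADJECTIVE_PREFIX = "JJ"  # JJ, JJR, JJS
ADVERB_PREFIX = "RB"     # RB, RBR, RBS


def pos_proportions(tagged_tokens):
    """Return each tag's share of the word tokens in (word, tag) pairs.

    Tokens whose tag does not start with a letter (Penn Treebank
    punctuation tags such as "." and ",") are left out of the total.
    """
    tags = [tag for _, tag in tagged_tokens if tag[:1].isalpha()]
    total = len(tags)
    if total == 0:
        return {}
    return {tag: n / total for tag, n in Counter(tags).items()}


def adj_adv_proportion(tagged_tokens):
    """Return the combined share of adjectives and adverbs."""
    props = pos_proportions(tagged_tokens)
    wanted = (ADJECTIVE_PREFIX, ADVERB_PREFIX)
    return sum(p for tag, p in props.items() if tag.startswith(wanted))


# A hand-tagged toy sentence (hypothetical data). With NLTK installed
# and its tokenizer and tagger resources downloaded, pairs of the same
# shape could come from nltk.pos_tag(nltk.word_tokenize(text)).
sample = [
    ("The", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
    ("jumps", "VBZ"), ("very", "RB"), ("quietly", "RB"), (".", "."),
]

print(f"adjectives + adverbs: {adj_adv_proportion(sample):.3f}")
```

Keeping the tagging step separate from the counting step means the same counting code can be run on output from any tagger that emits Penn Treebank tags, which makes it easier to check whether a difference between samples comes from the writing or from the tagger.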



The ultimate earworms

From Lev Michael at Greater Blogazonia:

I was briefly excited by the title of a recent Language Log post, Earworms and White Bears, thinking it might have something to say about, well, worms that people put in their ears. However, we immediately learn that the earworms in question are simply catchy tunes that get caught in people’s minds.



Keith Chen animated

Jason Merchant sent me a link to this animation of Keith Chen's ideas about tense marking and future-orientation in financial and health behaviors:



English and Mandarin juxtaposed



Pee straight
