Obituary: Petr Sgall (1926-2019)

Professor Emeritus Petr Sgall, professor of Indo-European, Czech studies, and general linguistics at Charles University in Prague, passed away on May 28, 2019, in Prague, the day after his 93rd birthday.

Over a lifetime of distinguished work in theoretical, mathematical and computational linguistics, he did more than any other single person to keep the Prague School linguistic tradition alive and dynamically flourishing. He was the founder of mathematical and computational linguistics in the Czech Republic, and the principal developer of the Praguian theory of Functional Generative Description as a framework for the formal description of language, which has been applied primarily to Czech, but also to English and in typological studies of a range of languages.

Read the rest of this entry »

Comments (5)


Tiananmen protest slogan grammar puzzle

Activists gathered at Tiananmen Square on May 14th, 1989:

Source:  "China’s Great Firewall threatens to erase memories of Tiananmen:  VPN crackdown and sophisticated censorship make it harder to access outside information", by Karen Chiu, abacus (6/3/19)

Read the rest of this entry »

Comments (15)


Qualifying fluency

The current xkcd:

Mouseover title: "[20 minutes later] '… hi.'"

Read the rest of this entry »

Comments (9)


Chinese language jokes

These are jokes circulating on the Chinese internet.  Not all of them have to do with Chinese languages in the narrowest sense.

Mandarin

Guānhuà 官話 (lit., "officials' talk", "Mandarin")

Read the rest of this entry »

Comments (13)


Cat names from GPT-2

Janelle Shane, "Once again, a neural net tries to name cats", 6/3/2019:

Last year I trained a neural net to generate new names for kittens, by giving it a list of over 8,000 existing cat names to imitate. Starting from scratch, with zero knowledge of English or any context for the words and letter combinations it was trying out, it tried to predict what letters might be found in cat names, and in which order. Its names ranged from the strange to the completely nonsensical to the highly unfortunate (Retchion, Hurler, and Trickles were some of its suggestions). Without knowledge of English beyond its list of cat names, it didn’t know what letter combinations to avoid.
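The letter-by-letter prediction Shane describes can be illustrated, in miniature, with a character bigram model: train on a handful of names, then sample each next letter from what followed the current letter in training. This is a hypothetical toy sketch of the general idea, not Shane's actual network, and the example names are invented:

```python
import random
from collections import defaultdict

# A few made-up training names (stand-ins for Shane's list of 8,000).
NAMES = ["whiskers", "mittens", "shadow", "tiger", "smokey", "oreo"]

def train(names):
    """Record which characters follow which, with ^ and $ as start/end marks."""
    followers = defaultdict(list)
    for name in names:
        padded = "^" + name + "$"
        for a, b in zip(padded, padded[1:]):
            followers[a].append(b)
    return followers

def generate(model, rng, max_len=10):
    """Sample a name one letter at a time, stopping at the end mark."""
    out, ch = [], "^"
    while len(out) < max_len:
        ch = rng.choice(model[ch])
        if ch == "$":
            break
        out.append(ch)
    return "".join(out)

model = train(NAMES)
rng = random.Random(0)
print([generate(model, rng) for _ in range(3)])
```

With so little context (one preceding letter), the output is exactly the kind of plausible-but-strange string Shane reports; GPT-2's advantage is conditioning on far more context than a single character.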

So I decided to revisit the cat-naming problem, this time using a neural net that had a lot more context. GPT-2, trained by OpenAI on a huge chunk of the internet, knows which words and letter combinations tend to be used together on the English-language internet. It also has (mostly) figured out which words and letter combinations to avoid, at least in some contexts (though it does tend to suddenly switch contexts, and then, yikes).

Read the whole thing — with pictures! Apparently the Morris Animal Refuge is using this algorithm to name the animals it offers for adoption.

Read the rest of this entry »

Comments (2)


Rating and judging non-native English

From Martijn Wieling:

We have created a questionnaire about rating English accents and judging English audio samples from non-native speakers of English. We'd like to get as many native English speakers as possible to provide their judgements about the audio samples and I was hoping you'd be willing to link the questionnaire.

Note that the survey link randomly redirects people to one of two questionnaires. One is about deciding which English word you hear (pronounced by a Dutch speaker), the other about rating the nativelikeness of English accents, similar to the questionnaire that you recruited subjects for in 2012 ("Rating American English Accents").

So all you native English speakers, please volunteer — the task just takes a couple of minutes: http://www.martijnwieling.nl/survey


Comments (14)


Annals of anthropomorphism

Wired newsletter 6/3/2019:

Read the rest of this entry »

Comments (25)


The traits of a troll

Troll watch

WATCHWORD:  When one goes fishing for trolls, the trolls are almost certain to bite.

We've recently had a succession of posts on trolls (see "Readings" below).  We all know that there are lots of trolls lurking all over the internet, and they are up to no good.  They cause much mischief and disrupt otherwise interesting, productive discussions.  They are especially destructive when they are the first to jump in after a post goes up and reflexively say something nasty and negative, without really having read the post and thought about what it's trying to communicate.  Yet it is clear that different people have different ideas about what exactly a troll is.  So let us see if we can reach some consensus on what constitutes trollishness in today's world.

Read the rest of this entry »

Comments (18)


Self-aware autoreply

Comments off


One law to rule them all?

Power-law distributions seem to be everywhere, and not just in word-counts and whale whistles. Most people know that Vilfredo Pareto found them in the distribution of wealth, two or three decades before Udny Yule showed that stochastic processes like those in evolution lead to such distributions, and George Kingsley Zipf found his eponymous law in word frequencies. Since then, power-law distributions have been found all over the place — Wikipedia lists

… the sizes of craters on the moon and of solar flares, the foraging pattern of various species, the sizes of activity patterns of neuronal populations, the frequencies of words in most languages, frequencies of family names, the species richness in clades of organisms, the sizes of power outages, criminal charges per convict, volcanic eruptions, human judgements of stimulus intensity …

My personal favorite is the noise made when you crumple something up, as discussed by Eric Kramer and Alexander Lobkovsky, "Universal Power Law in the Noise from a Crumpled Elastic Sheet" (1995), referenced in "Zipf and the general theory of wrinkling" (11/15/2003).

Contradicting the Central Limit Theorem's implications for what is "normal", power-law distributions seem to be everywhere you look.
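What makes these distributions easy to spot is that a power law is a straight line on log-log axes: under Zipf's law, the r-th most frequent word has frequency proportional to 1/r^s with s near 1, so plotting log frequency against log rank gives slope −s. A minimal sketch with synthetic, idealized Zipfian frequencies (the data here are generated, not drawn from any corpus):

```python
import numpy as np

def zipf_frequencies(n_ranks, s=1.0, scale=10000.0):
    """Idealized Zipfian rank-frequency data: freq(r) = scale / r**s."""
    ranks = np.arange(1, n_ranks + 1)
    return ranks, scale / ranks ** s

def fitted_exponent(ranks, freqs):
    """Estimate the power-law exponent as the slope in log-log space."""
    slope, _intercept = np.polyfit(np.log(ranks), np.log(freqs), 1)
    return slope

ranks, freqs = zipf_frequencies(1000, s=1.0)
print(round(fitted_exponent(ranks, freqs), 6))  # → -1.0
```

On real rank-frequency data the fit is rarely this clean, which is one reason claims of power laws "everywhere" deserve the skepticism hinted at below.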

Or maybe not?

Read the rest of this entry »

Comments (4)


An anecdote on the limitations of the Chinese writing system

[This is a guest post by Ari-Joonas Pitkänen]

I’m a frequent reader of Language Log, and I’ve been particularly interested in the debate about the usefulness / limitations of the Chinese script in modern society. As the 30th anniversary of the Tiananmen Square massacre approaches, I remembered an anecdote about the limitations of Chinese characters presented in Louisa Lim’s book The People’s Republic of Amnesia. It describes the way jailed activists communicated in prison after the crackdown on Tiananmen in 1989:

Read the rest of this entry »

Comments (9)


Variant pronunciations of "posthumous"

Nick Kaldis asks about the pronunciation of "posthumous":

On NPR this morning, and once a few weeks ago, both announcers pronounced it "pōst-hyooməs"; I can't recall ever hearing this pronunciation before.

Read the rest of this entry »

Comments (33)


Moby Zipf

Comments (4)