- Website: http://ling.upenn.edu/~myl
Posts by Mark Liberman:
From Steve Kass:
My brother is traveling in Portugal and posted this on Instagram. That’s all I know.
I'm in Portorož, Slovenia, for LREC2016; and so far the most interesting linguistic aspect of the place is the sometimes-surprising mixture of languages on signs. For example:
The longer explanation on the side of the van is in Slovenian: Restavriranje, brušenje, čiščenje in impregnacije naravnega kamna = "Restoration, grinding, cleaning and impregnation of natural stone". But the short version is in English: STONE SERVICE.
Today's Pearls Before Swine explores the consequences of flapping and voicing in American English:
According to the 2016 Texas Republican Party platform (or more exactly, the "Report of the Permanent Committee on Platform and Resolutions as Amended and Adopted by the 2016 State Convention of the Republican Party of Texas"),
Homosexuality is a chosen behavior […] that has been ordained by God in the Bible, recognized by our nations founders, and shared by the majority of Texans.
Restoring the elided material:
Homosexuality is a chosen behavior that is contrary to the fundamental unchanging truths that has been ordained by God in the Bible, recognized by our nations founders, and shared by the majority of Texans.
Barbara Phillips Long sent in a link to Cari Romm, "Why You Sometimes Mix Up Your Friend’s Name With Your Dog’s Name", New York Magazine 5/19/2016:
Every so often, my mother, in a mental search for my name, will run through what seems like the entire family tree — she’ll say the names of my brother, her sisters, her parents, our family dog, in rapid succession before finally landing on Cari. Most of these names, it may be worth noting, sound nothing alike; also, the dog has been dead for six years.
Romm's article was occasioned by Samantha Deffler et al., "All my children: The roles of semantic category and phonetic similarity in the misnaming of familiar individuals", Memory & Cognition April 2016:
Despite knowing a familiar individual (such as a daughter) well, anecdotal evidence suggests that naming errors can occur among very familiar individuals. Here, we investigate the conditions surrounding these types of errors, or misnamings, in which a person (the misnamer) incorrectly calls a familiar individual (the misnamed) by someone else’s name (the named).
Every year since 2005, an ad hoc group of speech technology researchers has held a "Blizzard Challenge", under the aegis of the Speech Synthesis Special Interest Group (SYNSIG) of the International Speech Communication Association.
The general idea is simple: Competitors take a released speech database, build a synthetic voice from the data and synthesize a prescribed set of test sentences. The sentences from each synthesizer are then evaluated through listening tests.
Anyhow, if you have an hour to donate towards making speech synthesis better, sign up and be a listener!
The Political TV Ad Archive is a project of the Internet Archive. This site provides a searchable, viewable, and shareable online archive of 2016 political TV ads, married with fact-checking and reporting citizens can trust. Political TV ad spending is expected to be in the billions. Yet the same local stations that air the ads provide very little solid reporting on politics. Even fewer correct political misinformation. In partnership with trusted journalistic organizations, the new Political TV Ad Archive provides a free service for journalists, civic organizations, academics and the general public to track these ads in context. The project is open source and available on github: this site and the Duplitron.
For an introduction to the Political TV Ad Archive and how to use it, check out this video.
As of March 23, 2016, the Political TV Ad Archive is wrapping up the first phase of the project, where we tracked 20 markets in nine key primary states. The project will continue to track ads playing in the New York, Philadelphia, and San Francisco television market areas. Project staff are gathering lessons learned, which will inform planning and fundraising for the second phase of the project: tracking political ads in key 2016 general election battleground states.
For several reasons I recently downloaded snapshots of Wikipedia in various languages, and I'd like to share with you some discoveries, starting with article length in the English Wikipedia.
Is this the future of English pronouns? Ada Palmer's Too Like the Lightning takes place in a world where he/she is as quaintly obsolete as thee/thou. From the book's opening:
You will criticize me, reader, for writing in a style six hundred years removed from the events I describe, but you came to me for explanation of those days of transformation which left your world the world it is, and since it was the philosophy of the Eighteenth Century, heavy with optimism and ambition, whose abrupt revival birthed the recent revolution, so it is only in the language of the Enlightenment, rich with opinion and sentiment, that those days can be described. You must forgive me my ‘thee’s and ‘thou’s and ‘he’s and ‘she’s, my lack of modern words and modern objectivity. It will be hard at first, but whether you are my contemporary still awed by the new order, or an historian gazing back at my Twenty-Fifth Century as remotely as I gaze back on the Eighteenth, you will find yourself more fluent in the language of the past than you imagined; we all are.
It's hard to tell with just four speakers to go on, but it looks as if there could be some kind of correlation between the ADV:ADJ ratio and the V:N ratio (as might be expected given that adjectives canonically modify nouns and adverbs canonically modify verbs). Of course, there are all sorts of other factors that could come into this, but to the extent that speakers are choosing between alternatives like "caused prices to increase dramatically" and "caused a dramatic increase in prices," I'd expect some sort of connection between these two ratios.
So since I have a relatively efficient POS tagging script, and an ad hoc collection of texts lying around, I thought I'd devote this morning's Breakfast Experiment™ to checking the idea out.
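For readers who want to try something similar, here is a minimal sketch of the ratio calculation, not the script mentioned above. It assumes the text has already been tagged with Penn Treebank-style tags (the post does not say which tagset is used), and the names `PREFIX_TO_CLASS`, `class_counts`, `ratio`, and `pos_ratios` are hypothetical. The two hand-tagged example phrases are the alternatives quoted above.

```python
from collections import Counter

# Assumption: Penn Treebank-style tags, collapsed to four coarse classes
# by their first two characters (JJ* = adjective, RB* = adverb,
# VB* = verb, NN* = noun). All other tags are ignored.
PREFIX_TO_CLASS = {"JJ": "ADJ", "RB": "ADV", "VB": "V", "NN": "N"}


def class_counts(tagged_tokens):
    """Count coarse word classes in a sequence of (word, tag) pairs."""
    counts = Counter()
    for _word, tag in tagged_tokens:
        coarse = PREFIX_TO_CLASS.get(tag[:2])
        if coarse is not None:
            counts[coarse] += 1
    return counts


def ratio(numerator, denominator):
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def pos_ratios(tagged_tokens):
    """Compute the ADV:ADJ and V:N ratios for one tagged text."""
    c = class_counts(tagged_tokens)
    return {
        "ADV:ADJ": ratio(c["ADV"], c["ADJ"]),
        "V:N": ratio(c["V"], c["N"]),
    }


# Hand-tagged versions of the two alternatives (tags assigned by hand
# for illustration, not produced by any tagger).
verbal = [("caused", "VBD"), ("prices", "NNS"), ("to", "TO"),
          ("increase", "VB"), ("dramatically", "RB")]
nominal = [("caused", "VBD"), ("a", "DT"), ("dramatic", "JJ"),
           ("increase", "NN"), ("in", "IN"), ("prices", "NNS")]

print(pos_ratios(verbal))
print(pos_ratios(nominal))
```

On these toy inputs the verbal phrasing has two verbs to one noun and the nominal phrasing has one verb to two nouns, which is the direction of the connection suggested above. A zero denominator yields `None` rather than an error, since a very short text may contain no adjectives or no nouns at all.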
…is "Tardy Mark", at least according to one roll of the dice by The Daily Show's Trump Nickname Generator:
A puzzling note arrived in my inbox a few days ago:
I came across an article you wrote about the use of adverbs and adjectives. To count the use of adverbs and adjectives you actually wrote a program. Is this something you would be willing to share or give me some advice on how to create myself? I am looking for a tool that our marketing team can use to keep the puffery to a minimum.
It was puzzling because the cited article was "Stop Hating on Adjectives and Adverbs", Slate 9/10/2013. And as the title suggests, my attitude towards eliminating adjectives and adverbs was a skeptical one:
Calculating the relative percentages of adjectives and adverbs in texts tells us nothing useful about their readability, clarity, or efficiency.