Linguistic dominance in House of Cards

You may have seen "The Ascent: Political Destiny and the Makings of a First Couple", now featured on the e-front-cover of The Atlantic magazine:

If you click on the link, the top left of the resulting page bears a little tag telling you that you're reading "sponsored content" — and if you mouseover that tag, you'll learn that

This content was created by Atlantic Re:think, The Atlantic's creative marketing group, and made possible by our Sponsor. It does not necessarily reflect the views of The Atlantic's editorial staff.

One piece of that "The Ascent" page, down at the bottom under the heading "Frank and Claire: Patterns of Power", presents a bit of computational psycholinguistics:

We can tell a lot about ourselves by the words we use. But not the big words. The small ones: you, we, I, me, can’t, don’t, won’t. In fact, if we pan back far enough, we can see broader traits, like dominance and submissiveness. Which is exactly what we did by analyzing all of Frank and Claire Underwood’s private dialogue throughout House of Cards Seasons 2-3, using a special language-processing software. The results were fascinating.

This post gives a bit of the background of that segment, including my own small role in its genesis. The main point is to prepare the ground for a discussion of the ideas involved, which I think are interesting and important; but maybe a description of the process will also be interesting.

About a month ago, Sam Rosen, the vice president of marketing at The Atlantic magazine, called the Penn linguistics department with an unusual request. He wondered whether some linguist at Penn could do a linguistic analysis of interactions between U.S. presidents and their spouses, in connection with a promotional piece for the upcoming third season of the Netflix series House of Cards. The department's administrator put the question to the faculty email list, and I responded.

After corresponding with Sam, I had the wit to involve Jamie Pennebaker. Here's what I wrote to him:

Are you still traveling?  

Either way, I hope it was/is entertaining and free of complications.

I have a complication to propose, though I hope it will be an entertaining one.  A few days ago, Sam Rosen from The Atlantic contacted me about providing a linguistic angle on a feature they are planning jointly with Netflix about the dramatic series House of Cards. The original idea was to look at interactions among real-world U.S. presidential couples, as revealed in things like 60 Minutes interviews. I was concerned that there might not be a lot to work with, in terms of direct interactions between the couples, and suggested that we look instead at the scripts and performances in the show itself.  They bought the idea, and even seem enthusiastic about it.

So Sam is sending me all the scripts in .pdf form, and is asking about getting Word files or whatever the pdfs came from, and also is sending me DVDs of the show. It's on me to turn this into a dataset indicating who said what when to whom.  From the beginning, I thought of you as a someone with interesting things to say about the language of  interpersonal interactions. So can I recruit you to participate in this (somewhat vaguely defined) enterprise?

The good news (and also the bad news) is that it'll be over soon — they want to "go live" on March 3.

As I understand it, all they want from me (or us) is an article about (some of) the ways that personality, relationships, sex and power are reflected in conversation, with examples focused on material from House of Cards (or possibly on real-world U.S. presidential couples, if we can find some material). This would be published as an article in The Atlantic, in the context of a larger enterprise of some kind. (Sam, to whom I've copied this note, can tell us more about that larger context, whether this is for print or web or both, whether we can include audio or video clips, etc.)

So what do you say? Will you (and/or maybe one of your students) have a few hours over the next month to devote to this?

Here's Jamie's response:

Mark, you bastard.  

I'm supposed to be taking life easy in Paris.  But I'm a sucker for this kind of project.  Sign me up. I have an insane schedule over the next few weeks but can get much of this done on the road or with the help of my grad student Ryan Boyd.  I feel certain that we can get a tremendous amount of information from both the presidential couples and House of Cards (a series which I love — both the U.S. and Brit versions).  

By the way, I would love to get some kind of interactions among presidents and their spouses. Historically, there are letters between the spouses. But I agree that most programs like 60 minutes would be too scripted.  In fact, in House of Cards, it would be interesting to compare scenes where the two of them are "on stage" versus those where it is just the two talking.

So Ryan Boyd came aboard — and wound up doing most of the work.

In the end, we worked with the scripts for seasons 2 and 3, because the available scripts for season 1 were in a digital format that was harder to hack into analyzable form. Sam got me the DVDs for the shows, and .mp3 copies of the sound tracks; my plan was to use "forced alignment" to line the scripts up with the audio, and add information about speaking rate, pitch range, etc. to the mix. I naively thought that this plan had a chance of success, despite the fact that the month in question involved travel to France, California, and Washington DC, along with normal teaching and other local activities; but an unexpected tooth infection and subsequent root canal made things harder.

And Jamie and Ryan came through with a great story to tell — based on Ewa Kacewicz, James W. Pennebaker, Matthew Davis, Moongee Jeon, and Arthur C. Graesser, "Pronoun use reflects standings in social hierarchies", Journal of Language and Social Psychology 2013 — which allowed me to give up my plans gracefully.

There were several other dimensions of analysis — public vs. private language, Frank and Claire's language with one another vs. their language with others — but in the end, the most compelling story, as Jamie explained it in email, was this:

[I]t's not based on the nature of the couples "power" status.  It's the nature of their connections with each other.  I think a better way to think about this is if the two are working as a team with shared goals.  Seasons 1 and 2 showed a couple where the two relied on each other to get ahead.  I like to think that normal high functioning couples are somewhat similar.  That is, the success of one spouse is celebrated by the other because that success makes both of their lives better. This goes back to the basketball team.

The Season 3 pattern is something we see in any environment where the success of one person undermines the success of the other.  In couples, think of George and Martha in Who's Afraid of Virginia Woolf?  Assuming that the American season 3 is similar to the British version, the tables are turned where the members of the couple now realize that their own success will depend on the other's failure.  This is a phenomenon that occurs in more dysfunctional relationships.

And the quantitative signature of this change was a textual measure of "dominance" that Jamie explained this way:

How do we see dominance in language?  We have been involved in a series of studies on language associated with social hierarchy.  Overall, a group of function words have consistently emerged in interactions among couples and groups that predict power, dominance, and leadership.  

Words associated with higher status and power include 1st person plural pronouns (e.g., we, us, our) and 2nd person pronouns (you, your). Words associated with lower status include 1st person singular pronouns (I, me, my) and negations (e.g., no, not, can't).

Ryan extracted word counts by speaker by episode (in seasons 2 and 3 of the U.S. show) for Frank and Claire, and applied this simple formula:

Dominance = 20 + (you + we) – (i + ipron + negate)
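
To make the arithmetic concrete, here is a minimal Python sketch of how a formula like this might be applied to one speaker's lines. Several things in it are assumptions rather than a description of what Ryan actually ran: the tiny word lists are hypothetical stand-ins for whatever dictionaries were used, "ipron" is taken to mean impersonal pronouns (it, this, that), and each category is scored as a percentage of all the speaker's words. The function names category_rates and dominance are made up for this illustration.

```python
import re
from collections import Counter

# Hypothetical stand-in word lists (assumption): the real analysis
# presumably used much larger dictionaries not reproduced in the post.
CATEGORIES = {
    "you": {"you", "your", "yours"},
    "we": {"we", "us", "our", "ours"},
    "i": {"i", "me", "my", "mine"},
    "ipron": {"it", "this", "that"},  # assumed: impersonal pronouns
    "negate": {"no", "not", "never", "can't", "don't", "won't"},
}


def category_rates(text):
    """Return each category's share of all tokens, as a percentage."""
    # Normalize curly apostrophes so "can’t" matches "can't".
    normalized = text.lower().replace("’", "'")
    tokens = re.findall(r"[a-z']+", normalized)
    total = len(tokens)
    counts = Counter()
    for token in tokens:
        for name, words in CATEGORIES.items():
            if token in words:
                counts[name] += 1
    return {
        name: (100.0 * counts[name] / total if total else 0.0)
        for name in CATEGORIES
    }


def dominance(rates):
    """Apply Dominance = 20 + (you + we) - (i + ipron + negate)."""
    return (
        20
        + (rates["you"] + rates["we"])
        - (rates["i"] + rates["ipron"] + rates["negate"])
    )


if __name__ == "__main__":
    # One score per speaker per episode would come from running this
    # on all of that speaker's lines in that episode.
    sample = "We know what you want. I can't do it."
    print(round(dominance(category_rates(sample)), 2))
```

If the categories are indeed percentages, the constant 20 would serve mainly to keep the score positive for plotting; that reading is a guess, not something stated in the post.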

The result was the graphs used in the final presentation:

(See the article for why these graphs make sense…)

The most important lesson here, I think, is a familiar one: Word counts are powerful. That's not to say that word order and word-sequence structure don't matter, just that word choice does matter. And a more nuanced version of this lesson, especially associated with Jamie Pennebaker's work over the years, is that even function-word counts are powerful predictors of personality, mood, and attitude.

But I would disagree slightly with the way that the Atlantic Re:think people put it:

We can tell a lot about ourselves by the words we use. But not the big words. The small ones: you, we, I, me, can’t, don’t, won’t.

There's plenty of information in the "big words" as well. But in analyzing individual interactions — real or scripted — we have to cope with the fact that overall word-count totals are low, so that the vector of word frequencies is very sparse.  Words like pronouns and articles are common enough for their counts to be used directly; and Jamie's key contribution has been to demonstrate that those counts carry a lot of information.

But for most words, some sort of dimensionality-reduction is usually needed, if we're going to work with word-count features in individual small "documents" like transcripts of conversational sides. More on this later.
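
As a toy illustration of the sparsity problem, the Python sketch below builds a count vector for one short conversational turn and then collapses it onto a few coarse word classes. Collapsing words into classes is only one simple kind of dimensionality reduction, chosen here because it needs nothing beyond the standard library; the post does not say which method is intended. The vocabulary, the classes, and the function names are all hypothetical.

```python
from collections import Counter


def count_vector(tokens, vocabulary):
    """Count how often each vocabulary word occurs in tokens."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def sparsity(vector):
    """Fraction of entries in the vector that are zero."""
    return sum(1 for x in vector if x == 0) / len(vector)


def collapse(vector, vocabulary, word_to_class, classes):
    """Sum word counts within each class, giving a shorter vector."""
    totals = {cls: 0 for cls in classes}
    for word, n in zip(vocabulary, vector):
        cls = word_to_class.get(word)
        if cls is not None:
            totals[cls] += n
    return [totals[cls] for cls in classes]


# Hypothetical vocabulary and word classes (assumptions for illustration).
CLASSES = ["politics", "family", "money"]
WORD_TO_CLASS = {
    "vote": "politics", "senate": "politics",
    "bill": "politics", "president": "politics",
    "wife": "family", "husband": "family",
    "home": "family", "children": "family",
    "money": "money", "donor": "money",
    "fund": "money", "budget": "money",
}
VOCABULARY = list(WORD_TO_CLASS)

if __name__ == "__main__":
    turn = ["vote", "bill", "home", "vote"]
    full = count_vector(turn, VOCABULARY)
    reduced = collapse(full, VOCABULARY, WORD_TO_CLASS, CLASSES)
    print(full, sparsity(full))
    print(reduced, sparsity(reduced))
```

With a realistic vocabulary of tens of thousands of words, nearly every entry for a single turn would be zero, which is why the raw counts of most individual words are of little use at this scale.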



  1. Stefano Bertolo said,

    March 12, 2015 @ 12:32 pm

    can Mark and/or Ryan please explain where the formula

    Dominance = 20 + (you + we) – (i + ipron + negate)

    comes from?

    What reasons are there to think it really does capture dominance?

    Has it been shown to correlate with other expressions of dominance (e.g. gaze) in other domains?

    thanks in advance.

    [(myl) A place to start would be the tables in Ewa Kacewicz, James W. Pennebaker, Matthew Davis, Moongee Jeon, and Arthur C. Graesser, "Pronoun use reflects standings in social hierarchies", Journal of Language and Social Psychology 2013.]

  2. bratschegirl said,

    March 12, 2015 @ 12:58 pm

    "Assuming that the American season 3 is similar to the British version, the tables are turned where the members of the couple now realize that their own success will depend on the other's failure."

    I don't think this is an accurate representation of Season 3/Brit version. Elizabeth realizes that her success depends on Francis NOT failing, not being revealed to have done what he's done (trying mightily to avoid spoilers here). What she does in the end is calculated to save his reputation in the eyes of history, and thereby avoid the personal losses she would otherwise experience. And likewise, in terms of the financial and judicial shenanigans, his success again depends on her success in cultivating certain relationships and planting certain seeds. So I don't buy the above premise.

  3. Brett said,

    March 12, 2015 @ 7:54 pm

    I doubt any meaningful conclusions could be drawn by analogy to the British original anyway. Mrs. Urquhart does not have much screen time in any of the three series. She has a more consequential role in the third series than in the first two, but the show is quite clearly never about Francis Urquhart's relationship with his wife; there is a different female lead for him to interact with in each

  4. maidhc said,

    March 13, 2015 @ 10:29 pm

    When did "software" become a countable noun?
