The accommodation


Yesterday in phonetics class we were discussing accommodation — the way that people adapt the way they talk depending on who they're talking with — and I noted that broadcast interview programs are a natural source of evidence, since the same host speaks at length with many different guests. Previous posts have looked at accommodation in a couple of features on the Philadelphia-based broadcast interview program Fresh Air ("UM/UH accommodation", 11/24/2015; "Like thanks", 11/26/2015). During yesterday's class, it occurred to me that it would make sense to look at accommodation in the use of the definite article the, since the is one of the commonest words in English, and yet the rates vary surprisingly widely across time, registers, genders, moods, and individuals.

Since I already had lexical histograms for my transcriptions of 13 Fresh Air interviews, this was easy to do:

Of course, it's possible that some latent variable — like choice of topic — was influencing both speakers in a given conversation. But this is still prima facie confirmation for the hypothesis that there's accommodation in rates of the usage.

There have been a few LLOG posts on accommodation, e.g.

"Postural accommodation?", 5/2/2010
"Sexual accommodation", 12/30/2011

… and a larger number of posts on secular changes in the rates:

"SOTU evolution", 1/26/2014
"Decreasing definiteness", 1/8/2015
"Why definiteness is decreasing, part 1", 1/9/2015
"Why definiteness is decreasing, part 2", 1/10/2015
"Why definiteness is decreasing, part 3", 1/18/2015
"Positivity", 12/21/2015
"Normalizing", 12/31/2015
"The case of the disappearing determiners", 1/3/2016
"Dutch DE", 1/4/2016
"The determiner of the turtle is heard in our land", 1/7/2016
"Correlated lexicometrical decay", 1/9/2016
"Style or artefact or both?", 1/12/2016
"Geolexicography", 1/27/2016



  1. Y said,

    March 14, 2017 @ 10:24 am

    I'm curious to know, perhaps through a small random sample, how there's a 2x range between the interviewers of Silverman and Kander (who differ in gender and occupation), or between Ford and King (who don't).

    [(myl) The (conventionally cleaned up and edited) transcripts are on line, so you can explore this for yourself ad libitum.]

  2. David P said,

    March 14, 2017 @ 7:27 pm

    This is a single interviewer? It seems odd that the interviewer range is wider than that of the interviewees. Unless there is some sort of super accommodation going on, this seems more consistent with the possibility that interview topic might be driving the usage, especially for the interviewer.

  3. Yuval said,

    March 14, 2017 @ 9:45 pm

    @David P: hypercorrection?

  4. Guy said,

    March 14, 2017 @ 10:43 pm

    @David P

    Is the wider range statistically significant? For example, if we fit a line, is a line of slope one outside the 95% confidence interval? Also, it looks like the interviewee range is wider on a log-log plot (or expressed as a percent of average), which would probably be the more relevant comparison if we want to see the magnitude of the effect.

    [(myl) I could run various analyses of "statistical significance" — or you could, on the basis of the raw data:

    CarrieBrownstein 2924 84 1662 51
    DanielTorday 4630 167 1258 63
    GloriaSteinem 4069 116 1355 53
    IlleanaDouglas 4529 136 1222 39
    JillSoloway 4327 138 1516 50
    JohnKander 2300 85 1544 85
    LenaDunham 5740 149 1715 49
    RichardFord 1356 49 1861 51
    SarahSilverman 4304 96 1151 28
    StephenKing 4750 201 1172 36
    TanehisiCoates 6287 194 1229 40
    ViencentDevita 3873 151 895 34
    WillieNelson 3318 89 1482 49

    [where the 2nd column is the number of words in the guest's transcript, the 3rd column is the number of instances of the in the guest's transcript, and the 4th and 5th columns are the same things for the host, who is Terry Gross in all cases.]

    But the fact that there are only 13 interviews in this test pretty well guarantees that the estimate of the the-percentage range is not very reliable.]
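For anyone who wants to take up that invitation, here is a minimal sketch in plain Python that turns the table above into the-percentages for guest and host and computes a Pearson correlation between the two. The function names `parse` and `pearson` are illustrative choices, not anything from the original analysis, and with 13 data points the result is a rough indication only.

```python
from math import sqrt

# Raw counts copied from the table above:
# name, guest words, guest "the" count, host words, host "the" count
RAW = """\
CarrieBrownstein 2924 84 1662 51
DanielTorday 4630 167 1258 63
GloriaSteinem 4069 116 1355 53
IlleanaDouglas 4529 136 1222 39
JillSoloway 4327 138 1516 50
JohnKander 2300 85 1544 85
LenaDunham 5740 149 1715 49
RichardFord 1356 49 1861 51
SarahSilverman 4304 96 1151 28
StephenKing 4750 201 1172 36
TanehisiCoates 6287 194 1229 40
ViencentDevita 3873 151 895 34
WillieNelson 3318 89 1482 49
"""

def parse(raw):
    """Split each line into (name, guest_words, guest_the, host_words, host_the)."""
    rows = []
    for line in raw.strip().splitlines():
        name, gw, gt, hw, ht = line.split()
        rows.append((name, int(gw), int(gt), int(hw), int(ht)))
    return rows

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rows = parse(RAW)
# Percentage of each side's words that are "the"
guest_pct = [100 * gt / gw for _, gw, gt, _, _ in rows]
host_pct = [100 * ht / hw for _, _, _, hw, ht in rows]
r = pearson(guest_pct, host_pct)

for (name, *_), g, h in zip(rows, guest_pct, host_pct):
    print(f"{name:18s} guest {g:5.2f}%  host {h:5.2f}%")
print(f"Pearson r = {r:.2f}")
```

The script uses only the standard library, so the rows can be pasted in and rerun with any subset of interviews removed to see how much a single conversation moves the estimate.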

  5. D.O. said,

    March 14, 2017 @ 11:09 pm

    There is about 25% correlation in the-variation in the Switchboard corpus. Accommodation or latent variables or whatnot.

    [(myl) Since there are quite a few cases where a given speaker is involved in multiple conversations with different other speakers, it would be interesting to analyze the variation across the whole sparse matrix of conversational connections…]

  6. MattF said,

    March 15, 2017 @ 8:59 am

    I wonder if some of the correlation is due to the interviewer repeating what the interviewee is saying (or vice versa).

  7. D.O. said,

    March 15, 2017 @ 11:16 am

    …it would be interesting to analyze the variation across the whole sparse matrix of conversational connections

    That's what I've done, to some extent. For each conversation, I took the the rate for each side, subtracted the average the rate for the speaker throughout the corpus and averaged the products (weighted by the conversation length). Then I normalized by the variance throughout the corpus.

    The next step would be to build some sort of Bayesian model where each speaker is characterized by her "natural level" of the use and "elasticity", that is propensity to change that natural level toward the frequency of another speaker. But, sorry, I am not doing it.
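The procedure described in this comment can be sketched roughly as follows. This is one reading of it, under stated assumptions: the conversations below are invented toy values, not Switchboard data; the weight is taken to be the total word count of the conversation; and "normalized by the variance" is interpreted as dividing by the weighted average of the squared deviations from both sides. The function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy conversations (NOT Switchboard data). Each entry is
# (speaker_a, words_a, the_count_a, speaker_b, words_b, the_count_b).
CONVS = [
    ("s1", 1000, 30, "s2", 1200, 42),
    ("s1", 800, 28, "s3", 900, 36),
    ("s2", 1100, 33, "s3", 1000, 35),
    ("s1", 900, 27, "s2", 950, 30),
]

def speaker_means(convs):
    """Each speaker's overall 'the' rate, pooled over all their conversations."""
    words = defaultdict(int)
    thes = defaultdict(int)
    for a, wa, ta, b, wb, tb in convs:
        words[a] += wa
        thes[a] += ta
        words[b] += wb
        thes[b] += tb
    return {s: thes[s] / words[s] for s in words}

def weighted_deviation_correlation(convs):
    """Length-weighted mean product of the two sides' deviations from their
    own speaker means, divided by the length-weighted mean squared deviation."""
    means = speaker_means(convs)
    num = var = 0.0
    for a, wa, ta, b, wb, tb in convs:
        dev_a = ta / wa - means[a]   # side A relative to A's usual rate
        dev_b = tb / wb - means[b]   # side B relative to B's usual rate
        weight = wa + wb             # assumption: weight by conversation length
        num += weight * dev_a * dev_b
        var += weight * (dev_a ** 2 + dev_b ** 2) / 2
    if var == 0:
        return 0.0                   # no variation at all: nothing to correlate
    return num / var

print(round(weighted_deviation_correlation(CONVS), 3))
```

With this normalization the result always lies between -1 and 1, which makes it comparable to an ordinary correlation coefficient, though a different choice of denominator would change the scale.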

  8. D.O. said,

    March 15, 2017 @ 1:02 pm

    It just occurred to me that I've misinterpreted the correlation. If it were all about accommodation, the correlation would have to be negative (one person raises their rate, the other lowers theirs). So that's just some latent variables. Still, it is not clear how to cleanly disentangle the accommodation part.

    A very crude way to do things is to look at how much the observed frequency of the for one side of the conversation differs from that speaker's average the frequency. Then subtract these differences (to eliminate changes in the same direction, for example from topic or difficulty level) and average them (the sign of the difference should be taken opposite to the sign of the difference in average frequencies). We would get how many percentage points (on average) a person changes their rate of the to accommodate the other side. Switchboard has this number at 0.16% (the total the rate is 3.2%). That means 5% rate variation due to accommodation.

    Unless we seriously believe in dis-accommodation (that is people changing their the frequencies to be more different from the other side), the standard deviation provides the noise level for this estimate and it is about 1.3%, 8 times larger than the effect. Obviously, too crude.
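A minimal sketch of the crude estimate described in this comment, as one reading of it: the toy rates below are invented for illustration and are not Switchboard values, and the function name `accommodation_shifts` is hypothetical. The last lines only redo the arithmetic on the figures quoted in the comment (0.16, 3.2 and 1.3 percentage points).

```python
from statistics import mean, stdev

def accommodation_shifts(sides):
    """For each conversation, take each side's deviation from its own average
    rate, subtract one deviation from the other (shared shifts cancel), and
    flip the sign so that movement toward the partner counts as positive.

    `sides` holds tuples (rate_a, avg_a, rate_b, avg_b), in percentage points.
    """
    shifts = []
    for rate_a, avg_a, rate_b, avg_b in sides:
        if avg_a == avg_b:
            continue                  # no direction to accommodate toward
        dev_a = rate_a - avg_a
        dev_b = rate_b - avg_b
        sign = 1.0 if avg_a > avg_b else -1.0
        # Opposite sign to the difference in averages, as the comment specifies
        shifts.append(-sign * (dev_a - dev_b))
    return shifts

# Hypothetical toy values (NOT Switchboard data), in percentage points
TOY = [
    (3.4, 3.6, 3.0, 2.8),   # both sides move toward each other
    (3.1, 3.0, 3.9, 3.8),   # both shift up together (e.g. topic): cancels
    (2.5, 2.4, 3.3, 3.5),   # both sides move toward each other
]

shifts = accommodation_shifts(TOY)
print("mean shift:", round(mean(shifts), 3))
print("std dev of shifts:", round(stdev(shifts), 3))

# Arithmetic on the figures reported in the comment
effect, overall_rate, noise = 0.16, 3.2, 1.3
relative_effect = effect / overall_rate   # share of the overall rate
noise_ratio = noise / effect              # noise relative to the effect
print(f"relative effect: {relative_effect:.0%}, noise/effect: {noise_ratio:.1f}")
```

Note that this sums the movement of both sides in a conversation. Whether the figure should be halved to give a per-person change is not stated in the comment, so the sketch leaves it as is.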
