Ross Macdonald: lexical diversity over the lifespan

This post is an initial progress report on some joint work with Mark Liberman. It's part of a larger effort to replicate and extend Xuan Le, Ian Lancashire, Graeme Hirst, & Regina Jokel, "Longitudinal detection of dementia through lexical and syntactic changes in writing: a case study of three British novelists", Literary and Linguistic Computing 2011. Their abstract:

We present a large-scale longitudinal study of lexical and syntactic changes in language in Alzheimer's disease using complete, fully parsed texts and a large number of measures, using as our subjects the British novelists Iris Murdoch (who died with Alzheimer's), Agatha Christie (who was suspected of it), and P.D. James (who has aged healthily). […] Our results support the hypothesis that signs of dementia can be found in diachronic analyses of patients’ writings, and in addition lead to new understanding of the work of the individual authors whom we studied. In particular, we show that it is probable that Agatha Christie indeed suffered from the onset of Alzheimer's while writing her last novels, and that Iris Murdoch exhibited a ‘trough’ of relatively impoverished vocabulary and syntax in her writing in her late 40s and 50s that presaged her later dementia.

We're looking at some additional measures, but most important, we're looking at a larger number of authors, including some others known to have died of Alzheimer's disease as well as those who didn't. In this post, we'll look at a simple measure of lexical diversity, namely the mean number of distinct lemmas (essentially, base forms of words) in 10,000-word spans of text, in a series of novels by Kenneth Millar, writing under the pseudonym of Ross Macdonald, who died of Alzheimer's disease in 1983 at the age of 67. Between 1949 and 1976, he wrote 18 novels featuring the detective Lew Archer:

1949 34  The Moving Target
1950 35  The Drowning Pool
1951 36  The Way Some People Die
1952 37  The Ivory Grin
1954 39  Find a Victim
1956 41  The Barbarous Coast
1958 43  The Doomsters
1959 44  The Galton Case
1961 46  The Wycherly Woman
1962 47  The Zebra-Striped Hearse
1964 49  The Chill
1965 50  The Far Side of the Dollar
1966 51  Black Money
1968 53  The Instant Enemy
1969 54  The Goodbye Look
1971 56  The Underground Man
1973 58  Sleeping Beauty
1976 61  The Blue Hammer
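
For readers who want to try the measure described above on their own texts, here is a minimal sketch in Python. It is not the code used for this post. The tokenizer is a crude regular expression, the spans are taken to be consecutive and non-overlapping with any short final span dropped, and the default "lemmatizer" is just the identity on lowercased tokens. All of these are assumptions made for illustration, and the function names are hypothetical. A real lemmatizer can be passed in through the lemmatize parameter.

```python
import re
from typing import Callable, List, Sequence


def tokenize(text: str) -> List[str]:
    """Crude word tokenizer (an assumption): lowercase alphabetic runs,
    optionally with one internal apostrophe, e.g. "don't"."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def mean_distinct_lemmas(
    tokens: Sequence[str],
    span: int = 10_000,
    lemmatize: Callable[[str], str] = lambda w: w,
) -> float:
    """Mean number of distinct lemmas per `span`-token window.

    Assumptions: windows are consecutive and non-overlapping, and a
    final window shorter than `span` is discarded so that every count
    is taken over the same amount of text.
    """
    if span <= 0:
        raise ValueError("span must be positive")
    lemmas = [lemmatize(t) for t in tokens]
    # Count distinct lemmas in each full window.
    counts = [
        len(set(lemmas[start:start + span]))
        for start in range(0, len(lemmas) - span + 1, span)
    ]
    if not counts:
        raise ValueError("text is shorter than one span")
    return sum(counts) / len(counts)
```

Calling mean_distinct_lemmas(tokenize(novel_text)) on the full text of each novel gives one number per book, which can then be plotted against the author's age. Without a real lemmatizer the counts are of distinct word forms rather than lemmas, so they will run somewhat higher.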

And this simple measure does indeed show the same sort of trend across these novels that Le et al. observed in Agatha Christie's works:

It remains to be seen how consistent this pattern is. There are some other authors known to have died of Alzheimer's, such as A.E. van Vogt and E.B. White, as well as many more who didn't, such as Rex Stout and Elmore Leonard. But so far, the basic idea checks out.

Of course there are also many other measures to look at, including some that are more sophisticated than overall lexical diversity. And of greater clinical interest is the question of how such trends are reflected in the speech and writing of a broader population.


  1. Rubrick said,

    January 13, 2018 @ 8:19 pm

    I'm surprised that, as far as I can see, the original paper doesn't include analysis of any control authors (who did not suffer from dementia). Surely to show that these changes are due to AD requires, at a minimum, demonstrating that the trendlines differ from those in non-AD-suffering writers?

    (Possibly I missed something; I'll admit I was merely skimming.)

    [(myl) As they write, "using as our subjects the British novelists Iris Murdoch (who died with Alzheimer's), Agatha Christie (who was suspected of it), and P.D. James (who has aged healthily)."

    But one control subject and one or two clinical subjects is not a very large N. So our goal is to increase N on each side to three or four. Which is still pretty small.]

  2. Y said,

    January 14, 2018 @ 12:37 am

    Have you considered looking at scientific writing, rather than fiction, so as to avoid issues of stylistic development?

  3. Eric TF Bat said,

    January 14, 2018 @ 6:24 am

    Sir Terry Pratchett had already developed his style before he began experiencing symptoms of the early-onset Alzheimer's that eventually carried him off. He would probably be an excellent subject for this, because he was so very prolific.

  4. speedwell said,

    January 14, 2018 @ 7:45 am

    I find it interesting that you can still perceive this effect even though the publishers of the books doubtless employed a number of editors and other people engaged in reworking and correcting the copy.

  5. Ed Rorie said,

    January 14, 2018 @ 8:58 am

    Editors can't do anything about vocabulary and very little about syntax, unless they work for dishonest publishers.

  6. JJ Price said,

    January 14, 2018 @ 9:42 am

    There certainly are some very visible consistencies there.
    Thank you for the read, as always.
    JJ Price

  7. Tim Morris said,

    January 14, 2018 @ 11:05 am

    Another factor is the overall development of literary tastes and genre styles in a given era. (This was even truer, I think, of some of the observations on percentage of dialogue in earlier posts.)

    In the case of Murdoch, Macdonald, and James, you have three writers of about the same age; Christie was a generation older, but active during much of the same span. It would be slightly stronger confirmation if the same trends existed in writers born a half-century or more apart. In the case of Christie, Macdonald, and James, you have three who wrote mostly in a single genre, and it would be slightly stronger if the pattern emerged in different genres.

  8. Athel Cornish-Bowden said,

    January 14, 2018 @ 2:26 pm

    Many years ago I read a report of a study of nuns who had entered a convent in their 20s and lived to old age. I have no idea if it stood up to later analysis, but the claim was that study of text that they had written while novices could indicate which ones would become senile in later life. If they wrote complex recursive sentences in their 20s they were likely to escape senility, and vice versa.

    [(myl) See "Writing style and dementia", 12/3/2004; "Nun study update", 8/27/2009; etc.]

  9. Thomas Rees said,

    January 14, 2018 @ 2:47 pm

    Best not mention E.B. White here!

  10. Gordon Van Gelder said,

    January 15, 2018 @ 9:39 am

    You might also look into the writing of Greg O'Brien, author of ON PLUTO.

    I wonder too if you might gain any insight by applying this sort of study to the writings of siblings, such as Margaret Drabble and A. S. Byatt.

  11. Rube said,

    January 18, 2018 @ 3:29 pm

    I would love to see this applied to Rex Stout's work, as suggested. My impression is that the last Nero Wolfe novel, written at least forty years after the first one, shows no particular change in writing style, but it would be interesting to see if this is true.

