Towards tracking neurocognitive health

A few months ago, I posted about a talk I gave at an Alzheimer's Association workshop on "Digital Biomarkers".

Overall I told a hopeful story, about the prospects for a future in which a few minutes of interaction each month, with an app on a smartphone or tablet, will give effective longitudinal tracking of neurocognitive health. […]

Speech-based tasks have been part of standard neuropsychological test batteries for many decades, because speaking engages many psychological and neurological systems, offering many (sometimes subtle) clues about what might be going wrong.

But I emphasized the fact that we're not there yet, and that some serious research and development problems stand in the way. […]

Some colleagues and I are starting a large-scale project to get speech data of this general kind: picture descriptions, "fluency" tests (e.g. "how many words starting with F can you think of in 60 seconds?"), and so on. The idea is to support research on analysis of such recordings, automated and otherwise, and to allow psychometric norming of both traditional and innovative measures, for both one-time and longitudinal administration, across a diverse population of subjects. We've got IRB approval to publish the recordings, the transcripts, and basic speaker metadata (age, gender, language background, years of education).

We've been testing the (browser-based) app across a variety of devices and users. When it's ready for prime time, this is one of many channels that we'll use to recruit participants — we're hoping for a few tens of thousands of volunteers.

We're finally ready to open this app to wider use, and you can contribute a few socially-distant minutes of your time by going to And please tell your friends!

A warning about the Safari browser:

The audio recording depends on the MediaRecorder API, which works in Chrome, Firefox, and Edge browsers, but doesn't yet work in Safari, at least not without complicated special settings. So don't use Safari, unless you have a recent update and want to work through the extra steps required to unlock the relevant features. (Supposedly MediaRecorder will be standard in Safari 13.1, but that version hasn't been released yet…)
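For readers curious what such a browser check might look like, here is a minimal sketch in TypeScript. It is not the code our app uses. The names `BrowserEnv`, `canRecordAudio`, and `recordingNotice` are hypothetical, invented for this illustration. The sketch assumes only that a browser able to record exposes a `MediaRecorder` constructor and `navigator.mediaDevices.getUserMedia`.

```typescript
// Minimal description of the browser globals this check inspects.
// In a real page these would come from `window`; this interface is an
// assumption made so the function can also be exercised outside a browser.
interface BrowserEnv {
  MediaRecorder?: unknown;
  navigator?: {
    mediaDevices?: {
      getUserMedia?: unknown;
    };
  };
}

// Returns true only if both the MediaRecorder constructor and
// navigator.mediaDevices.getUserMedia appear to be available.
function canRecordAudio(env: BrowserEnv): boolean {
  const hasRecorder = typeof env.MediaRecorder === "function";
  const hasMicAccess =
    typeof env.navigator?.mediaDevices?.getUserMedia === "function";
  return hasRecorder && hasMicAccess;
}

// Hypothetical message a page might show before starting a task.
function recordingNotice(env: BrowserEnv): string {
  return canRecordAudio(env)
    ? "Audio recording is available."
    : "This browser cannot record audio here; please try Chrome, Firefox, or Edge.";
}
```

In a page, something like `recordingNotice(window)` could be called before the first task. A check of this kind only tells you the features are present, not that the user will grant microphone permission.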

I want to make it clear that this is just a first step down a long road. In order to create reliable tracking methods, we'll need to

  1. Validate automatically administered and scored tests against current clinical standards;
  2. Clarify how much of what kind of testing we need to ensure a replicable result;
  3. Do psychometric norming of different tests and test versions for longitudinal use.

There are a couple of dozen other standard neurocognitive tasks, speech-mediated and otherwise. We've implemented and tested browser-based versions of many of them, but we don't want (or need) to overload volunteer subjects at this point. The goal of this first step is just to get a large, demographically varied sample of automatically-collected responses to multiple versions of two standard tests.

This will help us find (and fix) cases where the browser-based collection app fails to work across devices, operating systems, and browsers. The recordings we collect will be used to train and test automatic speech-to-text systems. And our clinical partners will use this same app on an appropriate sample of their patients and controls, so that the results can be compared to results from neuroimaging, blood and other lab tests, genomic tests, and more complete cognitive testing.

Maybe in the end, analysis of data from wearable sensors in everyday life will provide the best diagnostic and tracking methods. But tests like those at should help us towards that goal — and there's sure to be useful science and technology along the way.

  1. Philip Taylor said,

    March 24, 2020 @ 11:28 am

    Tried twice to sign up, once as "Philip Taylor" and once as "Philip.Taylor"; in both cases the attempt was rejected with the diagnostic :

    "The form contains 1 error
    Name is invalid"

    There are no documented constraints on "Username", so what tacit constraint(s) are being enforced ?

    [(myl) I'll look into it — maybe it doesn't like spaces and punctuation (which would be weird, but anyhow…). Meanwhile I was able to create an account with your email address and Name "PhilipTaylor". I'll send you the password separately. Or you could sign up again as "PhilipTaylor1" or whatever.]

  2. Trogluddite said,

    March 24, 2020 @ 3:46 pm

    I find this a fascinating area of research, and I'm certainly happy to volunteer. However, I do have one small linguistic criticism: upon clicking the link for further information I read the following: "…a large and diverse group of *normal* people."

    I certainly understand that there might be very good reasons for excluding subjects who have neurological or mental health conditions from the cohort. However, besides its ambiguity, "normal" may have uncomfortable connotations for such people, and a less loaded phrasing would be more appropriate, I think.

    I don't mean to impute any intent to offend whatsoever, of course. I wish you and your colleagues every success, and I look forward to future posts about what you discover.

    [(myl) Sorry for the misunderstanding — we certainly don't mean to exclude anyone, but rather to emphasize that we're interested in the distribution of measures across a large sample of the population as a whole, not just in people with cognitive disabilities of one kind or another.]

  3. Barbara Phillips Long said,

    March 25, 2020 @ 2:04 pm

    I haven’t been to the site yet, but I hope timed tests of vocabulary aren’t limited to a recitation of F words. Asking for D words, R words, T words or other letters (or randomizing the letter selection) seems better to me, because I believe F word recitations will elicit taboo vocabulary that some speakers will self-censor for. That could slow down the number of words uttered in the time allowed because switching mental gears to conscious filtering probably causes delays in much the same way that conscious piano-playing slows down muscle-memory piano performance. If hesitation plays a part in diagnosing declines in mental acuity, then timed tests of F words are likely problematic.

    [(myl) Standard "fluency" tests (the standard term for this task, apparently) use the letters F, A, S, and "semantic" categories like "animals" and "vegetables". I agree that F-words are an odd choice, but maybe 50 or 60 years ago, when these tasks were developed, the situation was different.]

  4. Michael Watts said,

    March 25, 2020 @ 6:09 pm

    Does this involve providing sound recordings or text? It's not stated anywhere, and they're very different in terms of how burdensome it is to participate.

    [(myl) All of the responses are spoken, and are recorded by the browser, so it's not burdensome unless you find it difficult to talk for a minute or so.]

  5. Philip Taylor said,

    March 26, 2020 @ 4:16 am

    "unless you find it difficult to talk for a minute or so". If only ! Because of the Covid-19 shutdown, North Cornwall Talking Newspapers (mainly for the blind and partially sighted) cannot meet this evening to record this week's edition as we normally would. We have six teams of three readers and one recording engineer. Each week, the members of one team each read one article until all 27 or so articles have been read, while the recording engineer adjusts levels, fades microphones in/out, etc. This evening I shall be all four …

  6. Barbara Phillips Long said,

    March 26, 2020 @ 4:10 pm

    Fluency tests: I wonder why F, A, and S were chosen, but say, B, M, P, or T were not. Did the testers want to avoid using an initial letter that required more muscle movement from lips? I can understand using S over a letter like V, which begins fewer words in English, or a letter like K, which is sometimes silent. On the other hand, lisping is a known problem, and choosing S seems problematic in that respect. S is also a problem for some stutterers.

    Was the fluency test developed by linguists, speech therapists, cognitive therapists, physicians treating stroke or Parkinson’s patients, or some other body or profession? Is the test used for a variety of diagnostic tasks? (That is, was S used to detect lisping and stuttering among other disfluencies?)

    [(myl) See these papers for some background and discussion. The initial-letter version seems to have originated as the "Controlled Oral Word Association Test" (COWAT), documented in Benton, A. L. "Development of a multilingual aphasia battery: Progress and problems." Journal of the Neurological Sciences 9, no. 1 (1969): 39-48. Arthur Benton was a neuropsychologist in the Department of Neurology at the University of Iowa.]

    As for F, if the test was formulated a half century ago, it is less surprising. In my lifetime taboo words beginning in F have become more common in working vocabularies among people I know and in spoken American English that I’ve observed. My late mother would have been unlikely to free-associate such words or utter them in a fluency test. My behavior is modeled on hers, but my mind works differently.

    Maybe there should be some consideration given to expanding the fluency test to add a couple more letters, making the database wider while not losing the opportunity to have comparable data over time. If the test was devised initially to detect stuttering, for instance, then co-opted for other uses, adding other letters to reflect the widening usage seems practical.

    [(myl) We want to start with tasks that stick fairly closely to what neurologists do now, with the idea of moving into better (more natural, more varied, etc.) methods later on.]

