The ideology of legal corpus linguistics


Jonathan Weinberg sent in a link to this article — Molly Redden, "How A Luxury Trip For Trump Judges Doomed The Federal Mask Mandate", Huffington Post 6/3/2024:

Buried in the April 2022 ruling that struck down the Biden administration’s mask mandate was a section that was unusual for a court decision.

The outcome itself was far from surprising. Places all over the country were dropping local mask requirements, and the judge hearing this case — a challenge to the federal mandate to mask on planes and other public transportation — was a conservative Trump appointee, U.S. District Judge Kathryn Kimball Mizelle for the Middle District of Florida. Mizelle ruled that the Centers for Disease Control and Prevention’s mask requirement overstepped the agency’s legal authority.

What was eye-catching was her explanation of why. In her ruling, Mizelle wrote she had consulted the Corpus of Historical American English, an academic search engine that returns examples of how words and phrases are used in select historical texts. Mizelle searched “sanitation,” a crucial word in the 1944 statute that authorizes the CDC to issue disease-prevention rules, and found it generally was used to describe the act of making something clean. “Wearing a mask,” she wrote, “cleans nothing.”

Searching large linguistic databases is a relatively new approach to judicial analysis called legal corpus linguistics. Although it has gained in popularity over the last decade, it is barely discussed outside of an enthusiastic group of right-wing legal scholars. Which raises the question: How did this niche concept wind up driving such a consequential decision in the country’s health policy?

I've been involved in "corpus linguistics" for more than 50 years, including founding the Linguistic Data Consortium in 1992 and promoting applications in legal arguments along with many other areas. In the cited Huffington Post article, Molly Redden goes on to highlight a connection of the legal applications to socio-political ideology:

Now, new disclosures seen by HuffPost shed some light. Just weeks before she issued the ruling, Mizelle had discreetly attended an all-expenses-paid luxury trip from a conservative group whose primary mission is to persuade more federal judges to adopt the use of corpus linguistics. For five days, Mizelle and more than a dozen other federal judges listened to the leading proponents of corpus linguistics in the comfort of The Greenbrier, an ostentatious resort spread out over 11,000 acres of West Virginia hillside.

The newly formed group that picked up the tab, the Judicial Education Institute, received more than $1 million in startup funding from the billionaire libertarian Charles Koch’s network and DonorsTrust, a nonprofit that has funneled millions in anonymous donations to right-wing causes and has been dubbed “the dark money ATM of the conservative movement.”

There's a logical connection between corpus-based analysis and "originalist" and "textualist" theories of legal interpretation, which do tend to be preferred on the right end of the political spectrum. But the many relevant LLOG posts over the decades are not clearly identified with a Kochian perspective:

"The right to keep and bear adjuncts", 12/17/2007
"What did it mean to 'bear arms' in 1791?", 6/18/2008
"Corpus linguistics in a legal opinion", 7/20/2011
"Corpus linguistics in statutory interpretation", 3/3/2012
"An empirical path to plain legal meaning", 3/3/2012
"Corpus-based judicial opinions", 7/2/2016
"The BYU Law corpora (updated)", 5/6/2018
"The coming corpus-based reexamination of the Second Amendment", 5/28/2018
"Corpora and the Second Amendment: Responding to Weisberg on the meaning of 'bear arms'", 5/29/2018
"Corpora and the Second Amendment: Weisberg responds to me; plus update re OED", 6/2/2018
"Corpora and the Second Amendment: Preliminaries and caveats", 6/4/2018
"Corpora and the Second Amendment: Heller", 6/10/2018
"Corpora and the Second Amendment: 'keep' (part 1)", 8/9/2018
"Law & Corpus Linguistics Conference", 8/18/2018
"Corpora and the Second Amendment: 'keep' (part 2)", 10/21/2018
"Corpora and the Second Amendment: 'bear'", 12/16/2018
"Corpora and the Second Amendment: 'arms'", 2/20/2019
"Corpora and the Second Amendment: 'bear arms' (part 1), plus a look at 'the right of the people'", 4/29/2019
"Corpora and the Second Amendment: 'bear arms' (part 2)", 4/30/2019
"Corpora and the Second Amendment: 'bear arms' (part 3)", 7/10/2019
"Corpora and the Second Amendment: 'the right (of the people) to … bear arms'", 7/16/2019
"Corpora and the Second Amendment: 'keep and bear arms' (part 1)", 7/29/2019
"Corpora and the Second Amendment: 'keep and bear arms' (part 2)", 8/23/2019
"The linguistics of the 2nd amendment", 6/1/2022

Update — searching Google Books for {water sanitation filter} in the time period 1930-1944 suggests that filtering water as a form of sanitation was a standard concept in that time period — and as Ethan observes in the comments, this is consistent with the idea of a mask as providing sanitation by filtering air in and breath out…
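For readers who have not run one, a date-restricted corpus search of the kind described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy corpus and the `kwic` helper are hypothetical, the sentences are invented rather than drawn from COHA or Google Books, and nothing here reproduces the search in the ruling.

```python
import re
from typing import List, Tuple

# Hypothetical toy corpus: (year, sentence) pairs invented for illustration.
# These are NOT taken from COHA or Google Books.
TOY_CORPUS: List[Tuple[int, str]] = [
    (1932, "The town improved its sanitation by filtering the water supply."),
    (1938, "Camp sanitation required the removal of refuse every morning."),
    (1943, "Good sanitation keeps the barracks free of disease."),
    (1951, "Modern sanitation depends on sewage treatment."),
]


def kwic(corpus, keyword, start_year, end_year, width=30):
    """Return (year, left, match, right) for each keyword hit whose
    year falls within [start_year, end_year], inclusive."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for year, text in corpus:
        if not (start_year <= year <= end_year):
            continue  # outside the period of interest
        for m in pattern.finditer(text):
            left = text[max(0, m.start() - width):m.start()]
            right = text[m.end():m.end() + width]
            hits.append((year, left, m.group(0), right))
    return hits


if __name__ == "__main__":
    # Print one keyword-in-context line per hit from 1930 through 1944.
    for year, left, match, right in kwic(TOY_CORPUS, "sanitation", 1930, 1944):
        print(f"{year}  {left:>30}[{match}]{right}")
```

Retrieving the lines is the mechanical part. Deciding which sense of the word each hit illustrates still takes human judgment, and that is where the interpretive questions arise.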



7 Comments »

  1. J.W. Brewer said,

    June 4, 2024 @ 1:07 pm

    In former times a decision like this might have been justified solely by quoting some dictionary definitions of sanitation from dictionaries published prior to 1944 and saying that mask-wearing obligations didn't obviously fit within their scope. But if done well, corpus linguistics work that looks at how the word was actually used in actual sentences and phrases at some relevant prior historical period is undoubtedly an improvement over dictionary-quoting. Here, FWIW, Judge Mizelle did quote definitions from a 1942 Webster's, a 1946 Funk & Wagnalls, and a 1951 edition of "The Simplified Medical Dictionary for Lawyers" before getting to corpus linguistics. In fact, before getting to corpus linguistics, she acknowledged that "sanitation" has multiple senses and as used in this statute could, in the abstract, "have referred [either] to active measures to cleanse something or to preserve the cleanliness of something. While the latter definition would appear to cover the Mask Mandate, the former definition would preclude it. Accordingly, the Court must determine which of the two senses is the best reading of the statute." The corpus data (507 hits in COHA from 1930 to 1944) was in her judgment consistent with the notion that the former "active measures" sense was the one Congress had used here, although there are other things that she discusses about the context of the word's use in the statute etc. that she says point to the same conclusion.

    Without recreating the COHA search, I can't comment on the accuracy of the description of its results, but it doesn't look suspicious at first glance. One fair general concern might be (as can always be a concern when an argument is the result of a judge's own freelance research rather than having been presented by one of the parties) that one cannot rule out the possibility that if a judge in a situation like this had done a freelance COHA search and found the results affirmatively contradicted whatever conclusion they were trying to support, they would simply not mention the search in the opinion rather than changing the conclusion – the "file drawer effect," as it's called elsewhere.

    It is to be regretted (as a taxpayer, among other things) that the Department of Justice's filings in high-profile appeals are not all posted on some free and easy to locate website, but they aren't, so I don't know how much they argued about that point if they did. I would tend to assume they didn't complain about her handling of corpus data but probably focused instead on arguments about reading an arguably ambiguous word in the broadest available sense to give the agency more flexibility in dealing with emergencies, and leaned heavily into noting that Judge Mizelle did not dispute that there was *a* sense of "sanitation" extant in American English in 1944 that would encompass the "Mask Mandate." But that's just a guess.

    We don't know what the appellate court made of the substance of Judge Mizelle's decision, since the mandate expired (because the relevant official federal emergency declaration expired) while the appeal was pending and the appellate court (at the request of the government and over the opposition of the challengers who had prevailed in front of Judge Mizelle) then dismissed the appeal as moot without opining on who was right and who was wrong.

    The lower court decision is available in various spots on the internet, including here: https://casetext.com/case/health-freedom-def-fund-v-biden-1

  2. Ethan said,

    June 4, 2024 @ 3:39 pm

    I question the solidity of the judge's quoted statement that "Wearing a mask cleans nothing" as a foundation for the ruling. The obvious counter claim is that the function of a mask is to clean the air entering a healthy wearer's lungs by removing impurities, or conversely to clean the air exhaled into the immediate surroundings of an infectious wearer. Did the judge follow this up with an analysis of what it means to clean something, and whether air is "something"?

  3. Jon W said,

    June 4, 2024 @ 5:46 pm

    @JW Brewer: Looking at the district-court filings, neither of the parties referenced corpus linguistics. On the sanitation issue, DOJ cited dictionary definitions including the following one, in a 1946 Funk & Wagnalls dictionary: "[t]he devising and applying of measures for preserving and promoting public health; the removal or neutralization of elements injurious to health; the practical application of sanitary science." Plaintiffs, for their part, referenced no historical sources at all beyond a 1951 World Health Organization reg.

    Judge Mizelle wrenched apart the separate clauses in the Funk & Wagnalls definition and announced that they really referenced two separate and inconsistent definitions: [1] cleaning things (which she said corresponded to the second clause above), and [2] keeping things clean (which she said corresponded to the first clause). She stated that she had to choose one, and turned to corpus linguistics to announce that instances of "sanitation" in the corpus more often referred to cleaning things than to preserving their cleanliness. But all of this was based on a false premise. There's no basis in any source for a conclusion that Congress in 1944 meant, in using the word "sanitation," the "removal or neutralization of elements injurious to health" to the exclusion of the "devising and applying of measures for preserving and promoting public health" or "the practical application of sanitary science." More likely Congress intended the entire continuum of meanings. And to the extent Congress meant only one of those three, Judge Mizelle did not approach the corpus in a way designed to figure out which it was.

  4. Jon W said,

    June 4, 2024 @ 5:49 pm

    And yes, as Ethan indicates, it's hard to think of a better example of "the . . . neutralization of elements injurious to health" than wearing a mask.

  5. J.W. Brewer said,

    June 4, 2024 @ 7:55 pm

    I would agree that if the question is which of two senses of a word is meant in a statute "the one that's more common in a corpus with hundreds of examples in varied and somewhat random contexts not all directly related to the statute" is not particularly strong evidence when compared to interpretive approaches that focus more on the specific context of how the word is used in the statute – since it's always plausible that a less-common (but admittedly extant and not totally obscure) sense is the one that makes a lot more sense in a specific context. The opinion does have other structural/contextual arguments like that, which particular readers may or may not find persuasive but which involve pretty stock arguments you would expect to see in disputes like this, e.g. "the statute gives X, Y, and Z in a list, so an interpretation of X so broad-scope as to encompass Y and Z and thus make them mere surplusage is to be disfavored." I don't know if phrasing it as two entirely separate senses as opposed to a broader-scope sense and a narrower-scope sense was the most helpful framing. To use the Funk & Wagnalls options, the broader-scope "devising and applying of measures" one would seem to include or subsume the narrower-scope "removal or neutralization" one.

  6. Reblog: How corpus linguistics is being used in the legal system – The Linguistic Detective said,

    June 5, 2024 @ 3:09 am

    […] The ideology of legal corpus linguistics from the Language Log blog. […]

  7. AG said,

    June 6, 2024 @ 5:54 pm

    So "mens sana in corpore sano" means "clean mind in clean body"? This whole thing seems beyond idiotic to me. Individual words can have a spectrum of meanings at any one time, and these shadings can depend entirely on context, the age and cultural background of the speaker, etc. People also constantly use words incorrectly, especially if they're trying to use fancy "legalese". Again, context and intention would be more important in such cases.

    Finding "the other meaning" was common usage in a certain calendar year might be illuminating or, as in this case, might be next to meaningless.

