Archive for Historical linguistics

Corpora and the Second Amendment: 'keep' (part 1)

An introduction and guide to my series of posts "Corpora and the Second Amendment" is available here. The corpus data that is discussed can be downloaded here. That link will take you to a shared folder in Dropbox. Important: Use the "Download" button at the top right of the screen.

With this post, I begin my examination of the corpus data regarding the phrase keep and bear arms. My plan is to start at the level of the individual words, keep, bear, and arms, then proceed to the simple verb phrases keep arms and bear arms, and finally deal with the whole phrase keep and bear arms. I start in this post and the next one with keep.

As you may recall from my last post about the Second Amendment, Justice Scalia's majority opinion in D.C. v. Heller had this to say about the meaning of keep: "[Samuel] Johnson defined 'keep' as, most relevantly, '[t]o retain; not to lose,' and '[t]o have in custody.' Webster defined it as '[t]o hold; to retain in one's power or possession.'" While those definitions could be improved on, I think that for purposes of this discussion, they adequately explain what keep means when it's used in the phrase keep arms. So I'm not going to discuss that data with an eye to criticizing this portion of the Heller opinion.

Instead, I'm going to use the data for keep as the raw material for an introduction to the nuts and bolts of corpus analysis. I suspect that many people reading this won't have had any first-hand experience working with corpus data, or even any exposure to it. Hopefully this quick introduction will enable those people to better understand what I'm talking about when I start to deal with the data that does raise questions about the Supreme Court's analysis.

Corpora and the Second Amendment: Weisberg responds to me; plus update re OED

[An introduction and guide to my series of posts "Corpora and the Second Amendment" is available here.]

Two quick updates.

First, David Weisberg has replied to my response to his post on the Originalism Blog, but he doesn't address the point that I made, which was that I disagreed with his framing of the issue.

Weisberg also notes that I didn't respond to the second point in his original post (which dealt with a purely legal issue), and he goes on to say this:

Many people (and I think Goldfarb is one) believe the correct sense of the 2nd Amend is this: “The right of the people to keep and bear Arms, for use in a State’s well regulated Militia, shall not be infringed.” But, if that is what the framers meant, why isn’t that what they wrote? I think that is a very fair question to ask, and it merits an answer. After all, 5 words would have been saved. Will corpus linguistics provide an answer?

I'm not going to offer any views in this series of posts about how I think the Second Amendment as a whole should be interpreted; I'm focusing only on Heller's interpretation of the phrase keep and bear arms. So I'm not going to say whether Weisberg is correct in his speculation about what I think on that score. Weisberg then asks why, if the framers had intended to convey the meaning he posits, they didn't write the amendment in those terms. Although Weisberg thinks that is "a very fair question to ask," I don't think it's a question that's relevant to the issue as the Court framed it in Heller, which had to do with how the Second Amendment's text was likely to have been understood by members of the public, not with what the framers intended. Nevertheless, I'll say that the question to which Weisberg wants an answer is not one that can be answered by corpus linguistics.

Dennis Baron (in WaPo) on corpus linguistics and "bearing arms"

The Washington Post published an opinion piece earlier today by Dennis Baron, with the self-explanatory title "Antonin Scalia was wrong about the meaning of ‘bear arms.’" The crux of the article:

By Scalia’s logic, the natural meaning of “bear arms” is simply to carry a weapon and has nothing to do with armies. He explained in his opinion: “Although [bear arms] implies that the carrying of the weapon is for the purpose of ‘offensive or defensive action,’ it in no way connotes participation in a structured military organization. From our review of founding-era sources, we conclude that this natural meaning was also the meaning that ‘bear arms’ had in the 18th century. In numerous instances, ‘bear arms’ was unambiguously used to refer to the carrying of weapons outside of an organized militia.”

But Scalia was wrong. Two new databases of English writing from the founding era confirm that “bear arms” is a military term. Non-military uses of “bear arms” are not just rare — they’re almost nonexistent.

A search of Brigham Young University’s new online Corpus of Founding Era American English, with more than 95,000 texts and 138 million words, yields 281 instances of the phrase “bear arms.” BYU’s Corpus of Early Modern English, with 40,000 texts and close to 1.3 billion words, shows 1,572 instances of the phrase. Subtracting about 350 duplicate matches, that leaves about 1,500 separate occurrences of “bear arms” in the 17th and 18th centuries, and only a handful don’t refer to war, soldiering or organized, armed action. These databases confirm that the natural meaning of “bear arms” in the framers’ day was military.
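The counting described in the quoted passage can be illustrated with a minimal sketch. It assumes nothing about the actual BYU search interface: `count_phrase` is a hypothetical helper, the three sample sentences are invented stand-ins rather than corpus texts, and the only real figures are the hit counts reported in the passage itself.

```python
import re

def count_phrase(texts, phrase):
    """Count case-insensitive whole-word matches of `phrase` across texts."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    return sum(len(pattern.findall(t)) for t in texts)

# Invented sample sentences (hypothetical; not drawn from either corpus).
toy_texts = [
    "Every man able to bear arms was called out.",
    "They refused to bear arms against the king; none would bear arms.",
    "He bears arms no longer.",  # inflected form, not matched by this exact search
]
toy_hits = count_phrase(toy_texts, "bear arms")

# Figures as reported in the quoted passage.
cofea_hits = 281
early_modern_hits = 1572
duplicates = 350  # "about 350" in the passage, so the result is approximate
distinct_hits = cofea_hits + early_modern_hits - duplicates

print(toy_hits, distinct_hits)
```

An exact-phrase search like this one skips inflected forms such as "bears arms" or "bearing arms", and a raw count says nothing about sense. Deciding whether a given hit is military still means reading each concordance line.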

The BYU Law corpora (updated)

[Cross-posted on LAWnLinguistics.]

I’d imagine that most people who’ve been actively involved with corpus linguistics are familiar with the BYU corpora—a collection of web-accessible corpora created by Brigham Young University linguistics professor Mark Davies. These corpora (and BYU’s corpus-linguistics program more generally) have played an essential part in the development of what I’ll call the corpus-linguistic turn in legal interpretation. The BYU corpora served as my entry-point into corpus linguistics, and they have provided the corpus data that has been used in most of the law-and-corpus-linguistics work that has been done to date. And beyond that, the BYU Law School has played an enormous role, in a variety of ways, in Law and Corpus Linguistics becoming a thing.

One of the things that the law school has been doing has been happening largely behind the scenes. For the past two or three years, people there have been developing the Corpus of Founding Era American English (COFEA), a historical corpus that is intended as a resource for studying language usage in the time leading up to the drafting and ratification of the U.S. Constitution. At this year’s conference on law and corpus linguistics (the third such conference, all of them hosted by the BYU Law School), we were given a preview of COFEA. And via a tweet by the law school’s dean, Gordon Smith, I’ve now learned that a beta version of COFEA is up and available for public playing-around-with, as are beta versions of two other corpora: the Corpus of Early Modern English and the Corpus of Supreme Court of the United States.

Of dogs and Old Sinitic reconstructions

At the conclusion of "Barking roosters and crowing dogs" (2/18/18), I promised a more philologically oriented post to celebrate the advent of the lunar year of the dog.  This is it.  Concurrently, it is part of this long-running series on Old Sinitic and Indo-European comparative reconstructions:

I will launch into this post with the following simple prefatory statement:

Half a century ago, the first time I encountered the Old Sinitic reconstruction of Mandarin quǎn 犬 ("dog"), Karlgren GSR 479 *k'iwən, I suspected that it might be related to an Indo-European word cognate with "canine" [<PIE *kwon-].

Sinitic historical phonology

[Or, as David Prager Branner, who wrote the guest post below, jokingly calls it, "hysterical phrenology".  Note that Branner uses Gwoyeu Romatzyh ("National Language Romanization"), a type of tonal spelling, for the transcription of Mandarin.]

================

This is on the subject of Carbo Kuo's 郭家寶 performance of Shyjing "Shyi yeou charngchuu 隰有萇楚" ("In the low wet grounds is the carambola tree") in Jenqjang Shanqfang's 鄭張尚芳 various antique reconstructions, sent to me by Victor Mair. It pleased me a lot. The issue is one of art, not scholarship, and it should be judged as art.

[VHM:  must hear]

No creoles?

Damián Blasi, Susanne Michaelis and Martin Haspelmath, "Grammars are robustly transmitted even during the emergence of creole languages", Nature Human Behaviour 2017:

[W]e analyse 48 creole languages and 111 non-creole languages from all continents and conclude that the similarities (and differences) between creoles can be explained by genealogical and contact processes, as with non-creole languages, with the difference that creoles have more than one language in their ancestry. While a creole profile can be detected statistically, this stems from an over-representation of Western European and West African languages in their context of emergence. Our findings call into question the existence of a pidgin stage in creole development and of creole-specific innovations. In general, given their extreme conditions of emergence, they lend support to the idea that language learning and transmission are remarkably resilient processes.

Email from Damián Blasi puts it more bluntly:

The basic conclusions are that 1) creoles clearly continue the linguistic structure of the languages that preceded them, 2) we don't have any evidence for a pidgin stage preceding creoles and 3) no evidence for purely creole features (like SVO) whatsoever.

Genetic evidence for the spread of Indo-Aryan languages

My own investigations on the Bronze Age and Early Iron Age peoples of Eastern Central Asia (ECA) began essentially as a genetics cum linguistics project back in the early 90s.  That was not long after the extraction of mtDNA (mitochondrial DNA) from ancient human tissues and its amplification by means of PCR (polymerase chain reaction) became possible.

Cantonese sentence-final particles

Even if you don't know any Cantonese but listen carefully to people speaking it, you probably can tell that it has an abundance of particles.  For speakers of Mandarin who do not understand Cantonese, the proliferation of particles, especially in utterance-final position, is conspicuous.  Non-speakers of Cantonese, confronted by all these aa3, ge3, gaa3, laa1, lo1, mei6, sin1, tim1, and so on, naturally wonder why there are so many particles in this language, what their various functions are, why they are often drawn out (elongated), and how they arose.

Cantonese speakers, on the other hand, just take them in stride as a natural part of their expressive equipment and don't think that there is anything unusual about them.

Inflection in Georgian and in English

Helen Sims-Williams has a new post on The Philological Society Blog:

"Understanding the loss of inflection" (11/23/16)

Helen takes what might superficially seem to be a dry and dreary topic and turns it into a lively, stimulating essay.  Here's how it begins:

Eurasian eureka

After reading the latest series of Language Log posts on long-range connections (see below for a listing), Geoff Wade suggested that I title the next post in this series as I have this one.  If there ever was an occasion to do so, now is as good a moment as any, with the announcement of the publication of Chau Wu's extraordinary "Patterns of Sound Correspondence between Taiwanese and Germanic/Latin/Greek/Romance Lexicons, Part I", Sino-Platonic Papers, 262 (Aug., 2016), 239 pp. (free pdf).
