Corpora and the Second Amendment: “bear”

An introduction and guide to my series of posts "Corpora and the Second Amendment" is available here. The corpus data that is discussed can be downloaded here. That link will take you to a shared folder in Dropbox. Important: Use the "Download" button at the top right of the screen.

Starting with this post, I’m (finally) getting to the meat of what I’ve called “the coming corpus-based reexamination of the Second Amendment.” The plan, as I’ve said before, is to more or less mirror the structure of the Supreme Court’s analysis of keep and bear arms. This post will focus on bear, and subsequent posts will focus separately on arms, bear arms, and keep and bear arms; I won’t be separately discussing keep arms because I have nothing to say about it. [Update: If you're confused about why I'm following this approach, as one of the commenters was, I've offered an explanation at the end of the post.]

In discussing the meaning of the verb bear, Justice Scalia’s majority opinion in District of Columbia v. Heller said, “At the time of the founding, as now, to ‘bear’ meant to ‘carry.’” That statement was backed up by citations to distinguished lexicographic authority—Samuel Johnson, Noah Webster, Thomas Sheridan, and the OED—but evidence that was not readily available when Heller was decided shows that Scalia’s statement was very much an oversimplification. Although bear was sometimes used in the way that Scalia described, it was not synonymous with carry and its overall pattern of use was quite different.

Sino-Vietnamese vocabulary in a patriotic slogan

Dungan-English dictionary

We have had several posts about Dungan on Language Log:

"Dungan: a Sinitic language written with the Cyrillic alphabet" (4/20/13)

"'Jesus' in Dungan" (7/16/14)

"Writing Sinitic languages with phonetic scripts" (5/20/16)

See also:

Implications of the Soviet Dungan Script for Chinese Language Reform.


The reason I have been interested in Dungan for the last four decades and more is that it constitutes prima facie evidence that a Sinitic language that had never before been written in Sinographs can be written in an alphabetical script, even without the indication of tones. Relying on separation of words with spaces, punctuation, etc., the Dungans have used their script to write poetry, essays, newspaper articles, and so on.

Corpora and the Second Amendment: “keep” (part 2)

In my last post (longer ago than I care to admit), I offered a very brief introduction to corpus analysis and used corpus data on the word keep as the raw material for a demonstration of corpus analysis in action. One of my reasons for doing that was to talk about the approach to word meaning that I think is appropriate when using corpus linguistics in legal interpretation.

A new and useful dictionary of Sinographs

We have often noted how much easier it is to learn Chinese now than it was just ten or twenty years ago.  That's because of all the new digital resources that have become available in recent years:

Of course, there are a lot of quick-fix programs out there, and one should be wary of them:

But every so often a really good resource comes along, and I should like to introduce one such in this post.

The concept of word in Sinitic

In the following posts, we've been tackling the thorny, multifaceted question of whether Vietnamese has words and lexemes, as opposed to having syllables and morphemes:

During the course of our discussions, the parallel question of whether Sinitic had words or not also came up.  Let me put it this way:  although there was no concept of "word" in Sinitic before the 20th century, there were Sinitic words, going all the way back to the oracle bone inscriptions (the first stage of Chinese writing) more than three thousand years ago, as documented in these posts and dozens of others:

Words in Vietnamese

In "Diacriticless Vietnamese on a sign in San Francisco" (9/30/18), we discussed the advisability of joining syllables into words or separating all syllables.  The ensuing string of comments revealed that there is a correlation between linking syllables and word spacing on the one hand and the necessity for diacritical marks on the other hand.

This prompted me to ask the following questions of several colleagues who are specialists on Vietnamese:

Roughly what percentage of Vietnamese lexemes (words) are monosyllabic? Disyllabic? Any trisyllabic or higher?

The average length of a word in Mandarin is almost exactly two syllables.

Can you think of examples in Vietnamese parsing where it would be clearer or more helpful to have the syllables of words joined together?

Baby talk, part 2

Two days ago, I was sitting in a Panera around lunch time. Next to me was a mother with two young daughters. One of them looked to be about four years old, and the other about one and a half years old.

The girls were both well behaved, and I enjoyed their company for more than an hour.  Without intentionally eavesdropping, I could not but overhear what they were talking about.  After half an hour, I started to become amused by the younger daughter's speech, because it consisted entirely of the following three words:

1. no! — falling intonation

2. what? — rising intonation

3. why!? — half-falling then half-rising, sounding somewhat plaintive and querulous

Draconian dictionaries?

Rachel Paige King ("The Draconian Dictionary Is Back", The Atlantic 8/5/2018) suggests that lexicographers might be (re)turning to prescriptivism:

Since the 1960s, the reference book has cataloged how people actually use language, not how they should. That might be changing. […]

The standard way of describing these two approaches in lexicography is to call them “descriptivist” and “prescriptivist.” Descriptivist lexicographers, steeped in linguistic theory, eschew value judgements about so-called correct English and instead describe how people are using the language. Prescriptivists, by contrast, inform readers which usage is “right” and which is “wrong.”

King's historical sketch of lexicography's past century concludes that the descriptivists have won, but that

oddly enough, Merriam-Webster is doing a great deal to promote the idea that sounding educated and using standard—if not highbrow—English really does matter. […]

The company has a feisty blog and Twitter feed that it uses to criticize linguistic and grammatical choices. Donald Trump and his administration are regular catalysts for social-media clarifications by Merriam-Webster. The company seems bothered when Trump and his associates change the meanings of words for their own convenience, or when they debase the language more generally.

Maybe it’s not the dictionary that has become outmoded today, but descriptivism itself. I’m not implying that Merriam-Webster has or should abandon the philosophy that guides its lexicography, but it seems that the way the company has regained its relevance in the post-print era is by having strong opinions about how people should use English.

Corpora and the Second Amendment: 'keep' (part 1)

With this post, I begin my examination of the corpus data regarding the phrase keep and bear arms. My plan is to start at the level of the individual words, keep, bear, and arms, then proceed to the simple verb phrases keep arms and bear arms, and finally deal with the whole phrase keep and bear arms. I start in this post and the next one with keep.

As you may recall from my last post about the Second Amendment, Justice Scalia's majority opinion in D.C. v. Heller had this to say about the meaning of keep: "[Samuel] Johnson defined 'keep' as, most relevantly, '[t]o retain; not to lose,' and '[t]o have in custody.' Webster defined it as '[t]o hold; to retain in one's power or possession.'" While those definitions could be improved on, I think that for purposes of this discussion, they adequately explain what keep means when it's used in the phrase keep arms. So I'm not going to discuss that data with an eye to criticizing this portion of the Heller opinion.

Instead, I'm going to use the data for keep as the raw material for an introduction to the nuts and bolts of corpus analysis. I suspect that many people reading this won't have had any first-hand experience working with corpus data, or even any exposure to it. Hopefully this quick introduction will enable those people to better understand what I'm talking about when I start to deal with the data that does raise questions about the Supreme Court's analysis.

Long words

I'm in Hamburg for lectures and meetings this week.

The first day I was here, in the afternoon I went out for a walk.  After taking about 50 steps from the front door of my hotel, I saw this lettering on the glass facade of a nearby building:

Character crises

From Bob Bauer:

You may have heard that the famous HK-based novelist by the name of 劉以鬯 recently passed away at the age of 99.  [VHM:  I have intentionally left his name without transcription for reasons that will soon become apparent.]

I did not know how to read/pronounce the third character in his name, so I tried to look it up in some dictionaries. But I first needed to decide what is this character's radical? Trying to find the character by its radical turned out to be a very time-consuming process, as different dictionaries do different things with it — at least one doesn't bother to assign it to a radical.

Corpora and the Second Amendment: Weisberg responds to me; plus update re OED

[An introduction and guide to my series of posts "Corpora and the Second Amendment" is available here.]

Two quick updates.

First, David Weisberg has replied to my response to his post on the Originalism Blog, but he doesn't address the point that I made, which was that I disagreed with his framing of the issue.

Weisberg also notes that I didn't respond to the second point in his original post (which dealt with a purely legal issue), and he goes on to say this:

Many people (and I think Goldfarb is one) believe the correct sense of the 2nd Amend is this: “The right of the people to keep and bear Arms, for use in a State’s well regulated Militia, shall not be infringed.” But, if that is what the framers meant, why isn’t that what they wrote? I think that is a very fair question to ask, and it merits an answer. After all, 5 words would have been saved. Will corpus linguistics provide an answer?

I'm not going to offer any views in this series of posts about how I think the Second Amendment as a whole should be interpreted; I'm focusing only on Heller's interpretation of the phrase keep and bear arms. So I'm not going to say whether Weisberg is correct in his speculation about what I think on that score. Weisberg then asks why, if the framers had intended to convey the meaning he posits, they didn't write the amendment in those terms. Although Weisberg thinks that is "a very fair question to ask," I don't think it's a question that's relevant to the issue as the Court framed it in Heller, which had to do with how the Second Amendment's text was likely to have been understood by members of the public, not with what the framers intended. Nevertheless, I'll say that the question to which Weisberg wants an answer is not one that can be answered by corpus linguistics.

