Archive for Lexicon and lexicography

Corpora and the Second Amendment: “keep” (part 2)

An introduction and guide to my series of posts "Corpora and the Second Amendment" is available here. The corpus data that is discussed can be downloaded here. That link will take you to a shared folder in Dropbox. Important: Use the "Download" button at the top right of the screen.

In my last post (longer ago than I care to admit), I offered a very brief introduction to corpus analysis and used corpus data on the word keep as the raw material for a demonstration of corpus analysis in action. One of my reasons for doing that was to talk about the approach to word meaning that I think is appropriate when using corpus linguistics in legal interpretation.

Read the rest of this entry »

Comments (5)

A new and useful dictionary of Sinographs

We have often noted how much easier it is to learn Chinese now than it was just ten or twenty years ago.  That's because of all the new digital resources that have become available in recent years:

Of course, there are a lot of quick-fix programs out there, and one should be wary of them:

But every so often a really good resource comes along, and I should like to introduce one such in this post.

Read the rest of this entry »

Comments (18)

The concept of word in Sinitic

In the following posts, we've been tackling the thorny, multifaceted question of whether Vietnamese has words and lexemes, as opposed to having syllables and morphemes:

During the course of our discussions, the parallel question of whether Sinitic had words or not also came up.  Let me put it this way:  although there was no concept of "word" in Sinitic before the 20th century, there were Sinitic words, going all the way back to the oracle bone inscriptions (the first stage of Chinese writing) more than three thousand years ago, as documented in these posts and dozens of others:

Read the rest of this entry »

Comments (11)

Words in Vietnamese

In "Diacriticless Vietnamese on a sign in San Francisco" (9/30/18), we discussed the advisability of joining syllables into words or separating all syllables.  The ensuing string of comments revealed that there is a correlation between linking syllables and word spacing on the one hand and the necessity for diacritical marks on the other hand.

This prompted me to ask the following questions of several colleagues who are specialists on Vietnamese:

Roughly what percentage of Vietnamese lexemes (words) are monosyllabic? Disyllabic? Any trisyllabic or higher?

The average length of a word in Mandarin is almost exactly two syllables.

Can you think of examples in Vietnamese parsing where it would be clearer or more helpful to have the syllables of words joined together?

Read the rest of this entry »

Comments (34)

Baby talk, part 2

Two days ago, I was sitting in a Panera around lunchtime.  Next to me was a mother with two young daughters.  One of them looked to be about four years old, and the other about one and a half years old.

The girls were both well behaved, and I enjoyed their company for more than an hour.  Without intentionally eavesdropping, I could not but overhear what they were talking about.  After half an hour, I started to become amused by the younger daughter's speech, because it consisted entirely of the following three words:

1. no! — falling intonation

2. what? — rising intonation

3. why!? — half-falling then half-rising, sounding somewhat plaintive and querulous

Read the rest of this entry »

Comments (17)

Draconian dictionaries?

Rachel Paige King ("The Draconian Dictionary Is Back", The Atlantic 8/5/2018) suggests that lexicographers might be (re)turning to prescriptivism:

Since the 1960s, the reference book has cataloged how people actually use language, not how they should. That might be changing. […]

The standard way of describing these two approaches in lexicography is to call them “descriptivist” and “prescriptivist.” Descriptivist lexicographers, steeped in linguistic theory, eschew value judgements about so-called correct English and instead describe how people are using the language. Prescriptivists, by contrast, inform readers which usage is “right” and which is “wrong.”

King's historical sketch of lexicography's past century concludes that the descriptivists have won, but that

oddly enough, Merriam-Webster is doing a great deal to promote the idea that sounding educated and using standard—if not highbrow—English really does matter. […]

The company has a feisty blog and Twitter feed that it uses to criticize linguistic and grammatical choices. Donald Trump and his administration are regular catalysts for social-media clarifications by Merriam-Webster. The company seems bothered when Trump and his associates change the meanings of words for their own convenience, or when they debase the language more generally.

Maybe it’s not the dictionary that has become outmoded today, but descriptivism itself. I’m not implying that Merriam-Webster has or should abandon the philosophy that guides its lexicography, but it seems that the way the company has regained its relevance in the post-print era is by having strong opinions about how people should use English.

Read the rest of this entry »

Comments (6)

Corpora and the Second Amendment: 'keep' (part 1)

An introduction and guide to my series of posts "Corpora and the Second Amendment" is available here. The corpus data that is discussed can be downloaded here. That link will take you to a shared folder in Dropbox. Important: Use the "Download" button at the top right of the screen.

With this post, I begin my examination of the corpus data regarding the phrase keep and bear arms. My plan is to start at the level of the individual words, keep, bear, and arms, then proceed to the simple verb phrases keep arms and bear arms, and finally deal with the whole phrase keep and bear arms. I start in this post and the next one with keep.

As you may recall from my last post about the Second Amendment, Justice Scalia's majority opinion in D.C. v. Heller had this to say about the meaning of keep: "[Samuel] Johnson defined 'keep' as, most relevantly, '[t]o retain; not to lose,' and '[t]o have in custody.' Webster defined it as '[t]o hold; to retain in one's power or possession.'" While those definitions could be improved on, I think that for purposes of this discussion, they adequately explain what keep means when it's used in the phrase keep arms. So I'm not going to discuss that data with an eye to criticizing this portion of the Heller opinion.

Instead, I'm going to use the data for keep as the raw material for an introduction to the nuts and bolts of corpus analysis. I suspect that many people reading this won't have had any first-hand experience working with corpus data, or even any exposure to it. Hopefully this quick introduction will enable those people to better understand what I'm talking about when I start to deal with the data that does raise questions about the Supreme Court's analysis.

Read the rest of this entry »

Comments (14)

Long words

I'm in Hamburg for lectures and meetings this week.

On the afternoon of my first day here, I went out for a walk.  After taking about 50 steps from the front door of my hotel, I saw this lettering on the glass facade of a nearby building:

Read the rest of this entry »

Comments (64)

Character crises

From Bob Bauer:

You may have heard that the famous HK-based novelist by the name of 劉以鬯 recently passed away at the age of 99.  [VHM:  I have intentionally left his name without transcription for reasons that will soon become apparent.]

I did not know how to read/pronounce the third character in his name, so I tried to look it up in some dictionaries. But I first needed to decide what this character's radical is. Trying to find the character by its radical turned out to be a very time-consuming process, as different dictionaries do different things with it; at least one doesn't bother to assign it to a radical.

Read the rest of this entry »

Comments (60)

Corpora and the Second Amendment: Weisberg responds to me; plus update re OED

[An introduction and guide to my series of posts "Corpora and the Second Amendment" is available here.]

Two quick updates.

First, David Weisberg has replied to my response to his post on the Originalism Blog, but he doesn't address the point that I made, which was that I disagreed with his framing of the issue.

Weisberg also notes that I didn't respond to the second point in his original post (which dealt with a purely legal issue), and he goes on to say this:

Many people (and I think Goldfarb is one) believe the correct sense of the 2nd Amend is this: “The right of the people to keep and bear Arms, for use in a State’s well regulated Militia, shall not be infringed.” But, if that is what the framers meant, why isn’t that what they wrote? I think that is a very fair question to ask, and it merits an answer. After all, 5 words would have been saved. Will corpus linguistics provide an answer?

I'm not going to offer any views in this series of posts about how I think the Second Amendment as a whole should be interpreted; I'm focusing only on Heller's interpretation of the phrase keep and bear arms. So I'm not going to say whether Weisberg is correct in his speculation about what I think on that score. Weisberg then asks why, if the framers had intended to convey the meaning he posits, they didn't write the amendment in those terms. Although Weisberg thinks that is "a very fair question to ask," I don't think it's a question that's relevant to the issue as the Court framed it in Heller, which had to do with how the Second Amendment's text was likely to have been understood by members of the public, not with what the framers intended. Nevertheless, I'll say that the question to which Weisberg wants an answer is not one that can be answered by corpus linguistics.

Read the rest of this entry »

Comments off

Corpora and the Second Amendment: Responding to Weisberg on the meaning of "bear arms" [Updated, and updated again]

[An introduction and guide to my series of posts "Corpora and the Second Amendment" is available here.]

The Originalism Blog has a guest post, by David Weisberg, taking issue with the conclusion in Dennis Baron's Washington Post op-ed that newly available evidence of historical usage shows that in District of Columbia v. Heller, Justice Scalia misinterpreted the phrase keep and bear arms. That's an issue that I wrote about yesterday ("The coming corpus-based reexamination of the Second Amendment") and that I'm going to be dealing with in a series of posts over the next several weeks.

One of Weisberg's arguments concerns a linguistic issue that I'm planning to address, and I think that Weisberg is mistaken. At the risk of getting out ahead of myself, I want to respond to Weisberg briefly now, with a more detailed explanation to come.

Read the rest of this entry »

Comments (36)

The coming corpus-based reexamination of the Second Amendment [Updated]

[An introduction and guide to my series of posts "Corpora and the Second Amendment" is available here.]

[Update: Broken link fixed; description of uploaded data corrected.]

It was only three weeks ago that BYU Law School made available two corpora that are intended to provide corpus-linguistic resources for researching the original meaning of the U.S. Constitution. And already the corpora are yielding results that could be very important.

The two corpora are COFEA (the Corpus of Founding Era American English) and COEME (the Corpus of Early Modern English). As I've previously explained, COFEA consists of almost 139 million words, drawn from more than 95,000 texts from the period 1760–1799, and COEME consists of 1.28 billion words, from 40,000 texts dating to the period 1475–1800. (The two corpora can be accessed here.)

Within a day after COFEA and COEME became available, Dennis Baron looked at data from the two corpora, to see what they revealed about the meaning of the key phrase in the Second Amendment: keep and bear arms. (Baron was one of the signatories to the linguists' amicus brief in District of Columbia v. Heller.) He announced his findings here on Language Log, in a comment on my post about the corpora's unveiling:

Sorry, J. Scalia, you got it wrong in Heller. I just ran "bear arms" through BYU's EMne [=Early Modern English] and Founding Era American English corpora, and of about 1500 matches (not counting the duplicates), all but a handful are clearly military.

Two weeks later, Baron published an opinion piece in the Washington Post, titled "Antonin Scalia was wrong about the meaning of ‘bear arms’," in which he repeated the point he had made in his comment, and elaborated on it a little. Out of "about 1,500 separate occurrences of 'bear arms' in the 17th and 18th centuries," he said, "only a handful don’t refer to war, soldiering or organized, armed action." Based on that fact, Baron said that the two corpora "confirm that the natural meaning of 'bear arms' in the framers’ day was military."

My interest having been piqued, I decided to check out the corpus data myself.

Read the rest of this entry »

Comments (19)

An explosion of curation

From June Teufel Dreyer:

Have you noticed that suddenly “curated,” previously almost exclusively used to refer to museum exhibitions, is turning up everywhere? A talking head recently said she was “curating [her] thoughts,” the floral arrangements for a society wedding were described as “curated” by a local florist… and so on.

I have a feeling I’m going to soon dislike the word as much as I do “the perfect storm.”

Read the rest of this entry »

Comments (37)