Archive for Lexicon and lexicography

Unexpected "English Word of the Day"

On February 19, I received this notice from Oxford Dictionaries:

English Word of the Day from
Oxford Dictionaries

Your word for today is:

li

a Chinese unit of distance, equal to about 0.5 km (0.3 mile)

Click on the word to see its full entry, including example sentences and audio pronunciation.

Read the rest of this entry »

Comments (35)

Corpora and the Second Amendment: “arms”

An introduction and guide to my series of posts "Corpora and the Second Amendment" is available here. The corpus data that is discussed can be downloaded here. That link will take you to a shared folder in Dropbox. Important: Use the "Download" button at the top right of the screen.

This post on what arms means will follow the pattern of my post on bear. I’ll start by reviewing what the Supreme Court said about the topic in District of Columbia v. Heller. I’ll then turn to the Oxford English Dictionary for a look at how arms was used over the history of English up through the end of the 18th century, when the Second Amendment was proposed and ratified. And finally, I’ll discuss the corpus data.

Justice Scalia’s majority opinion had this to say about what arms meant:

The 18th-century meaning [of arms] is no different from the meaning today. The 1773 edition of Samuel Johnson’s dictionary defined “arms” as “[w]eapons of offence, or armour of defence.” Timothy Cunningham’s important 1771 legal dictionary defined “arms” as “any thing that a man wears for his defence, or takes into his hands, or useth in wrath to cast at or strike another.” [citations omitted]

As was true of what Scalia said about the meaning of bear, this summary was basically correct as far as it went, but was also a major oversimplification.

Read the rest of this entry »

Comments (14)

A corpus-linguistic take on "emolument(s)" (updated)

From the Washington Post:

The study is a corpus analysis performed by Jesse Egbert, a corpus linguist at Northern Arizona University, and Clark Cunningham, a law professor who did work in law and linguistics from the late 1980s through the mid-1990s (link, link, link, link), including co-authoring an article with Chuck Fillmore that was what really opened my eyes to the power of linguistics in analyzing issues of word meaning.

Read the rest of this entry »

Comments (23)

Corpora and the Second Amendment: “bear”

Starting with this post, I’m (finally) getting to the meat of what I’ve called “the coming corpus-based reexamination of the Second Amendment.” The plan, as I’ve said before, is to more or less mirror the structure of the Supreme Court’s analysis of keep and bear arms. This post will focus on bear, and subsequent posts will focus separately on arms, bear arms, and keep and bear arms; I won’t be separately discussing keep arms because I have nothing to say about it. [Update: If you're confused about why I'm following this approach, as one of the commenters was, I've offered an explanation at the end of the post.]

In discussing the meaning of the verb bear, Justice Scalia’s majority opinion in District of Columbia v. Heller said, “At the time of the founding, as now, to ‘bear’ meant to ‘carry.’” That statement was backed up by citations to distinguished lexicographic authority (Samuel Johnson, Noah Webster, Thomas Sheridan, and the OED), but evidence that was not readily available when Heller was decided shows that Scalia’s statement was very much an oversimplification. Although bear was sometimes used in the way that Scalia described, it was not synonymous with carry and its overall pattern of use was quite different.

Read the rest of this entry »

Comments (13)

Sino-Vietnamese vocabulary in a patriotic slogan

Comments (13)

Dungan-English dictionary

We have had several posts about Dungan on Language Log:

"Dungan: a Sinitic language written with the Cyrillic alphabet" (4/20/13)

"'Jesus' in Dungan" (7/16/14)

"Writing Sinitic languages with phonetic scripts" (5/20/16)

See also:

Implications of the Soviet Dungan Script for Chinese Language Reform.

Omniglot

The reason I have been interested in Dungan for the last four decades and more is that it constitutes prima facie evidence that a Sinitic language that had never before been written in Sinographs can be written in an alphabetical script, even without the indication of tones. Relying on separation of words with spaces, punctuation, etc., the Dungans have used their script to write poetry, essays, newspaper articles, and so on.

Read the rest of this entry »

Comments (11)

Corpora and the Second Amendment: “keep” (part 2)

In my last post (longer ago than I care to admit), I offered a very brief introduction to corpus analysis and used corpus data on the word keep as the raw material for a demonstration of corpus analysis in action. One of my reasons for doing that was to talk about the approach to word meaning that I think is appropriate when using corpus linguistics in legal interpretation.

Read the rest of this entry »

Comments (9)

A new and useful dictionary of Sinographs

We have often noted how much easier it is to learn Chinese now than it was just ten or twenty years ago.  That's because of all the new digital resources that have become available in recent years:

Of course, there are a lot of quick-fix programs out there, and one should be wary of them:

But every so often a really good resource comes along, and I should like to introduce one such in this post.

Read the rest of this entry »

Comments (35)

The concept of word in Sinitic

In the following posts, we've been tackling the thorny, multifaceted question of whether Vietnamese has words and lexemes, as opposed to having syllables and morphemes:

During the course of our discussions, the parallel question of whether Sinitic had words or not also came up.  Let me put it this way:  although there was no concept of "word" in Sinitic before the 20th century, there were Sinitic words, going all the way back to the oracle bone inscriptions (the first stage of Chinese writing) more than three thousand years ago, as documented in these posts and dozens of others:

Read the rest of this entry »

Comments (10)

Words in Vietnamese

In "Diacriticless Vietnamese on a sign in San Francisco" (9/30/18), we discussed the advisability of joining syllables into words or separating all syllables.  The ensuing string of comments revealed that there is a correlation between linking syllables and word spacing on the one hand and the necessity for diacritical marks on the other hand.

This prompted me to ask the following questions of several colleagues who are specialists on Vietnamese:

Roughly what percentage of Vietnamese lexemes (words) are monosyllabic? Disyllabic? Any trisyllabic or higher?

The average length of a word in Mandarin is almost exactly two syllables.

Can you think of examples in Vietnamese parsing where it would be clearer or more helpful to have the syllables of words joined together?

Read the rest of this entry »

Comments (34)

Baby talk, part 2

Two days ago, I was sitting in a Panera around lunchtime. Next to me was a mother with two young daughters. One of them looked to be about four years old, and the other about one and a half years old.

The girls were both well behaved, and I enjoyed their company for more than an hour.  Without intentionally eavesdropping, I could not but overhear what they were talking about.  After half an hour, I started to become amused by the younger daughter's speech, because it consisted entirely of the following three words:

1. no! — falling intonation

2. what? — rising intonation

3. why!? — half-falling then half-rising, sounding somewhat plaintive and querulous

Read the rest of this entry »

Comments (17)

Draconian dictionaries?

Rachel Paige King ("The Draconian Dictionary Is Back", The Atlantic 8/5/2018) suggests that lexicographers might be (re)turning to prescriptivism:

Since the 1960s, the reference book has cataloged how people actually use language, not how they should. That might be changing. […]

The standard way of describing these two approaches in lexicography is to call them “descriptivist” and “prescriptivist.” Descriptivist lexicographers, steeped in linguistic theory, eschew value judgements about so-called correct English and instead describe how people are using the language. Prescriptivists, by contrast, inform readers which usage is “right” and which is “wrong.”

King's historical sketch of lexicography's past century concludes that the descriptivists have won, but that

oddly enough, Merriam-Webster is doing a great deal to promote the idea that sounding educated and using standard—if not highbrow—English really does matter. […]

The company has a feisty blog and Twitter feed that it uses to criticize linguistic and grammatical choices. Donald Trump and his administration are regular catalysts for social-media clarifications by Merriam-Webster. The company seems bothered when Trump and his associates change the meanings of words for their own convenience, or when they debase the language more generally.

Maybe it’s not the dictionary that has become outmoded today, but descriptivism itself. I’m not implying that Merriam-Webster has or should abandon the philosophy that guides its lexicography, but it seems that the way the company has regained its relevance in the post-print era is by having strong opinions about how people should use English.

Read the rest of this entry »

Comments (6)

Corpora and the Second Amendment: “keep” (part 1)

With this post, I begin my examination of the corpus data regarding the phrase keep and bear arms. My plan is to start at the level of the individual words, keep, bear, and arms, then proceed to the simple verb phrases keep arms and bear arms, and finally deal with the whole phrase keep and bear arms. I start in this post and the next one with keep.

As you may recall from my last post about the Second Amendment, Justice Scalia's majority opinion in D.C. v. Heller had this to say about the meaning of keep: "[Samuel] Johnson defined 'keep' as, most relevantly, '[t]o retain; not to lose,' and '[t]o have in custody.' Webster defined it as '[t]o hold; to retain in one's power or possession.'" While those definitions could be improved on, I think that for purposes of this discussion, they adequately explain what keep means when it's used in the phrase keep arms. So I'm not going to discuss that data with an eye to criticizing this portion of the Heller opinion.

Instead, I'm going to use the data for keep as the raw material for an introduction to the nuts and bolts of corpus analysis. I suspect that many people reading this won't have had any first-hand experience working with corpus data, or even any exposure to it. Hopefully this quick introduction will enable those people to better understand what I'm talking about when I start to deal with the data that does raise questions about the Supreme Court's analysis.

Read the rest of this entry »

Comments (11)