Archive for Lexicon and lexicography

Insect name

How would you respond in your native language if someone walked up to you and asked (in your native language, in English, or in some other language that both of you know), "What's the word for 'the insect that eats wood and destroys walls'?"

A friend of mine in China did that with eight of his colleagues, and not a single one of them could remember the Chinese name for "the insect that eats wood and destroys walls".


Digitizing specialized language dictionaries

[The following is a guest post by David Dettmann.  The "Schwarz Uyghur dictionary" to which he refers in the third paragraph is this:  Henry G. Schwarz, An Uyghur-English dictionary (Bellingham, Washington:  Center for East Asian Studies, Western Washington University, 1992).]


It is a bit of a nerdy obsession of mine to customize my computers to comfortably use languages that I've studied.

About 10 years ago, I got relatively proficient with optical character recognition (OCR) software and scanner hardware. Any time I found an essential dictionary for a language I was studying, I scanned it and ran OCR on it (i.e., converted the images of pages to text), producing a PDF with Unicode text. I later used that data to create dictionary content files that work with the Mac OS Dictionary application. I went through this process with several dictionaries that I found essential while I studied Kazakh, Uzbek, and Uyghur.

This process was particularly useful for the Schwarz Uyghur dictionary. I could not get used to the alphabetical order that he favored (which was different from both typical Latin order and Uyghur Arabic script order). As a result, any lookup would just take forever. That said, the formatting of each page was quite pleasant, and there were some nice illustrations of plants used in traditional Uyghur medicine as well as handy keys at the bottom of each page to explain abbreviations.
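The re-sorting problem described above can be sketched in a few lines of Python. This is only an illustration, not the workflow actually used for the Schwarz dictionary: the helper name `make_sort_key` is invented, the letter order in `TOY_ALPHABET` is hypothetical (it is not Schwarz's order, nor any standard Uyghur order), and the headwords are toy transliterations chosen to exercise the digraphs.

```python
import unicodedata


def make_sort_key(alphabet):
    """Build a sort key that orders words by a custom alphabet.

    `alphabet` lists the letters in the desired order; a letter may be a
    digraph such as "ch". Characters not in the alphabet sort last.
    """
    rank = {letter: i for i, letter in enumerate(alphabet)}
    # Try longer letters first so "ch" is matched before a bare "c".
    by_length = sorted(alphabet, key=len, reverse=True)
    unknown = len(alphabet)

    def key(word):
        # OCR output may contain decomposed characters; NFC makes "ä" one code point.
        word = unicodedata.normalize("NFC", word).lower()
        ranks = []
        i = 0
        while i < len(word):
            for letter in by_length:
                if word.startswith(letter, i):
                    ranks.append(rank[letter])
                    i += len(letter)
                    break
            else:
                # No letter of the alphabet matches here: rank it after everything.
                ranks.append(unknown)
                i += 1
        return ranks

    return key


# Hypothetical letter order, for illustration only.
TOY_ALPHABET = ["a", "ä", "b", "p", "t", "ch", "d", "s", "sh", "l", "m", "n"]

# Toy transliterated headwords (an assumption, not taken from the dictionary).
headwords = ["shamal", "bala", "chach", "adäm", "tam", "san"]

sorted_headwords = sorted(headwords, key=make_sort_key(TOY_ALPHABET))
print(sorted_headwords)
```

With this toy order, "tam" sorts before "chach" and "san" before "shamal", whereas a plain `sorted(headwords)` would put "tam" last. Re-sorting scanned entries into a familiar order in this way is one route to faster lookups; a dictionary application with full-text search is another.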


Things you can do with "water" in Cantonese

Peter Golden sent me the following video, "Luisa Tam says: Let's put more HK English on the map", South China Morning Post (10/23/18):


Unexpected "English Word of the Day"

On February 19, I received this notice from Oxford Dictionaries:

English Word of the Day from
Oxford Dictionaries

Your word for today is:

li

a Chinese unit of distance, equal to about 0.5 km (0.3 mile)

Click on the word to see its full entry, including example sentences and audio pronunciation.


Corpora and the Second Amendment: "arms"

An introduction and guide to my series of posts "Corpora and the Second Amendment" is available here. The corpus data that is discussed can be downloaded here. That link will take you to a shared folder in Dropbox. Important: Use the "Download" button at the top right of the screen.

New URL for COFEA and COEME: https://lawcorpus.byu.edu.

This post on what arms means will follow the pattern of my post on bear. I'll start by reviewing what the Supreme Court said about the topic in District of Columbia v. Heller. I'll then turn to the Oxford English Dictionary for a look at how arms was used over the history of English up through the end of the 18th century, when the Second Amendment was proposed and ratified. And finally, I'll discuss the corpus data.

Justice Scalia's majority opinion had this to say about what arms meant:

The 18th-century meaning [of arms] is no different from the meaning today. The 1773 edition of Samuel Johnson's dictionary defined "arms" as "[w]eapons of offence, or armour of defence." Timothy Cunningham's important 1771 legal dictionary defined "arms" as "any thing that a man wears for his defence, or takes into his hands, or useth in wrath to cast at or strike another." [citations omitted]

As was true of what Scalia said about the meaning of bear, this summary was basically correct as far as it went, but was also a major oversimplification.


A corpus-linguistic take on "emolument(s)" (updated)

From the Washington Post:

The study is a corpus analysis performed by Jesse Egbert, a corpus linguist at Northern Arizona University, and Clark Cunningham, a law professor who did work in law and linguistics from the late 1980s through the mid-1990s (link, link, link, link), including co-authoring an article with Chuck Fillmore that was what really opened my eyes to the power of linguistics in analyzing issues of word meaning.


Corpora and the Second Amendment: "bear"

An introduction and guide to my series of posts "Corpora and the Second Amendment" is available here. The corpus data that is discussed can be downloaded here. That link will take you to a shared folder in Dropbox. Important: Use the "Download" button at the top right of the screen.

New URL for COFEA and COEME: https://lawcorpus.byu.edu.

Starting with this post, I'm (finally) getting to the meat of what I've called "the coming corpus-based reexamination of the Second Amendment." The plan, as I've said before, is to more or less mirror the structure of the Supreme Court's analysis of keep and bear arms. This post will focus on bear, and subsequent posts will focus separately on arms, bear arms, and keep and bear arms; I won't be separately discussing keep arms because I have nothing to say about it. [Update: If you're confused about why I'm following this approach, as one of the commenters was, I've offered an explanation at the end of the post.]

In discussing the meaning of the verb bear, Justice Scalia's majority opinion in District of Columbia v. Heller said, "At the time of the founding, as now, to 'bear' meant to 'carry.'" That statement was backed up by citations to distinguished lexicographic authority (Samuel Johnson, Noah Webster, Thomas Sheridan, and the OED), but evidence that was not readily available when Heller was decided shows that Scalia's statement was very much an oversimplification. Although bear was sometimes used in the way that Scalia described, it was not synonymous with carry and its overall pattern of use was quite different.


Sino-Vietnamese vocabulary in a patriotic slogan


Dungan-English dictionary

We have had several posts about Dungan on Language Log:

"Dungan: a Sinitic language written with the Cyrillic alphabet" (4/20/13)

"'Jesus' in Dungan" (7/16/14)

"Writing Sinitic languages with phonetic scripts" (5/20/16)

See also:

Implications of the Soviet Dungan Script for Chinese Language Reform.

Omniglot

The reason I have been interested in Dungan for the last four decades and more is that it constitutes prima facie evidence that a Sinitic language that had never before been written in Sinographs can be written in an alphabetical script, even without the indication of tones. Relying on separation of words with spaces, punctuation, etc., the Dungans have used their script to write poetry, essays, newspaper articles, and so on.


Corpora and the Second Amendment: "keep" (part 2)

An introduction and guide to my series of posts "Corpora and the Second Amendment" is available here. The corpus data that is discussed can be downloaded here. That link will take you to a shared folder in Dropbox. Important: Use the "Download" button at the top right of the screen.

In my last post (longer ago than I care to admit), I offered a very brief introduction to corpus analysis and used corpus data on the word keep as the raw material for a demonstration of corpus analysis in action. One of my reasons for doing that was to talk about the approach to word meaning that I think is appropriate when using corpus linguistics in legal interpretation.


A new and useful dictionary of Sinographs

We have often noted how much easier it is to learn Chinese now than it was just ten or twenty years ago.  That's because of all the new digital resources that have become available in recent years:

Of course, there are a lot of quick-fix programs out there, and one should be wary of them:

But every so often a really good resource comes along, and I should like to introduce one such in this post.


The concept of word in Sinitic

In the following posts, we've been tackling the thorny, multifaceted question of whether Vietnamese has words and lexemes, as opposed to having syllables and morphemes:

During the course of our discussions, the parallel question of whether Sinitic had words or not also came up.  Let me put it this way:  although there was no concept of "word" in Sinitic before the 20th century, there were Sinitic words, going all the way back to the oracle bone inscriptions (the first stage of Chinese writing) more than three thousand years ago, as documented in these posts and dozens of others:


Words in Vietnamese

In "Diacriticless Vietnamese on a sign in San Francisco" (9/30/18), we discussed the advisability of joining syllables into words or separating all syllables.  The ensuing string of comments revealed that there is a correlation between linking syllables and word spacing on the one hand and the necessity for diacritical marks on the other hand.

This prompted me to ask the following questions of several colleagues who are specialists on Vietnamese:

Roughly what percentage of Vietnamese lexemes (words) are monosyllabic? Disyllabic? Any trisyllabic or higher?

The average length of a word in Mandarin is almost exactly two syllables.

Can you think of examples in Vietnamese parsing where it would be clearer or more helpful to have the syllables of words joined together?
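One rough way to approach the first question, assuming a word list in standard Vietnamese orthography (where syllables are separated by spaces), is simply to count syllables per lexeme. The sketch below is illustrative only: the function names are made up, and the eight-item sample is hand-picked, so the resulting percentages say nothing about the real Vietnamese lexicon.

```python
from collections import Counter


def syllable_distribution(lexemes):
    """Return {syllable_count: percentage of lexemes} for space-delimited lexemes."""
    counts = Counter(len(lexeme.split()) for lexeme in lexemes)
    total = sum(counts.values())
    return {n: 100 * c / total for n, c in sorted(counts.items())}


def mean_syllables(lexemes):
    """Average number of syllables per lexeme."""
    return sum(len(lexeme.split()) for lexeme in lexemes) / len(lexemes)


# Hand-picked sample, NOT a representative lexicon.
SAMPLE = ["nhà", "ăn", "đi", "sách", "sinh viên", "đại học", "máy bay", "hợp tác xã"]

distribution = syllable_distribution(SAMPLE)
average = mean_syllables(SAMPLE)
print(distribution)  # {1: 50.0, 2: 37.5, 3: 12.5}
print(average)       # 1.625
```

A real answer would need a lexicon with word boundaries already decided, which is exactly the point at issue: the percentages depend entirely on what one is willing to count as a single lexeme.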
