Archive for July, 2023

Central Asian Kharosthi script on an ancient knife hilt found in Austria

Astonishing demonstration of East-West interaction during Roman times (with an equally mind-boggling demonstration of the occasional, yet horrendous [defying common sense], ineptitude of AI translation):

"Geheimnis um Messergriff aus dem römerzeitlichen Wels gelüftet"

Ein vor über 100 Jahren entdeckter Elfenbeingriff mit rätselhafter Inschrift aus dem antiken Ovilava gehörte wohl einst einem Besucher aus dem fernen Asien

"The mystery of the Roman period Wels knife handle revealed"

An ivory handle with a mysterious inscription from ancient Ovilava discovered more than 100 years ago probably once belonged to a visitor from distant Asia

Thomas Bergmayr, Der Standard (7/28/23)

Before presenting the remarkable findings reported in this important article, just a short prefatory note about the AI translation of the title.  Three of the main online multilingual neural machine translation services (Google Translate, Baidu Fanyi, and DeepL) mistranslated "Wels" (the eighth largest city in Austria [ancient Ovilava]) as "catfish" (only Bing Translator got it right).  Given the object that we're dealing with, that is a genuinely bizarre rendering of the word, especially since the subtitle identifies the material of the handle as ivory and the artifact as coming from Ovilava.  (It is all the more perplexing that three of the four services are consistent in making the same strange mistake [well, not so strange after all, since "Wels" really does mean catfish in German].)  Fortunately, the machine translators do a better job in the body of the article, where there is more context.

For the purposes of the rough translation of the German article, I have relied mainly on GT, with occasional assistance from the other translation services, and some good old human input from my own brain.  Please bear in mind that the translations proffered below do not pretend to be polished, flawless English renderings of parts of the German article, but only to give a functionally useful idea of its content.

N.B.:  Two photographs of the knife handle are provided near the bottom of this post.

Read the rest of this entry »

Comments (20)

Pronouncing literally

Commenting on yesterday's post "Semantic drift of the week", Nicholas wrote this about the pronunciation of different senses of the word battery:

In Australia and many parts of the UK, the pronunciation between both is significantly different.

"Batch-ry" holds the electrical charge.

Batt-ery is the criminal charge.

Pronouncing words like military, literally, and battery without making the "ch" sound (mili-chery') is a sign of an uneducated person..

Many other comments followed, discussing various pronunciations of these and similar words, along with their geographical, social, and lexical distributions.

This morning I'll ignore the interesting sociolinguistic aspects, except to note (as sociolinguists often remind us) that people's intuitions about when and why they say what are generally not very reliable, so that it's a good idea to check how people actually talk, including ourselves…

Instead I'll take a brief look at the phonetic issue under discussion.

Read the rest of this entry »

Comments (38)

Semantic drift of the week

…or maybe we should call it a "semantic jump"? It's a pun that illustrates how word meanings can evolve along sensible paths that become obscure as time passes and culture changes. Which is one of the reasons that the reconstruction of linguistic history gets harder as time depth increases.

Here's the upper scene in the latest Perry Bible Fellowship comic:

Read the rest of this entry »

Comments (49)

Where did the PIEs come from; when was that?


The language family began to diverge from around 8,100 years ago, out of a homeland immediately south of the Caucasus. One migration reached the Pontic-Caspian and Forest Steppe around 7,000 years ago, and from there subsequent migrations spread into parts of Europe around 5,000 years ago. Credit: P. Heggarty et al., Science (2023)

Read the rest of this entry »

Comments (48)

ROT-LLM?

There's a puzzling new proposal for watermarking AI-generated text — Alistair Croll, "To Watermark AI, It Needs Its Own Alphabet", Wired 7/27/2023:

We need a way to distinguish things made by humans from things made by algorithms, and we need it very soon. […]

Fortunately, we have a solution waiting in plain sight. […]

If the companies who pledged to watermark AI content at the point of origin do so using Unicode—essentially giving AI its own character set—we’ll have a ready-made, fine-grained AI watermark that works across all devices, platforms, operating systems, and websites.
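To make the idea concrete, here is a minimal sketch of what "giving AI its own character set" might look like if it were implemented as a fixed character-for-character mapping. The quoted passage does not say which code points would be used, so everything specific here is an assumption for illustration only: the choice of the Unicode Private Use Area, the `PUA_BASE` offset, and the three function names are all hypothetical.

```python
# Hypothetical sketch: the Private Use Area offset and all function names
# below are illustrative assumptions, not part of the quoted proposal.
PUA_BASE = 0xE000          # start of the BMP Private Use Area (U+E000..U+F8FF)
ASCII_LO, ASCII_HI = 0x20, 0x7E  # printable ASCII range


def to_ai_alphabet(text: str) -> str:
    """Shift each printable ASCII character into a parallel 'AI-only' range."""
    return "".join(
        chr(PUA_BASE + ord(ch)) if ASCII_LO <= ord(ch) <= ASCII_HI else ch
        for ch in text
    )


def looks_ai_generated(text: str) -> bool:
    """Report whether any character falls in the assumed 'AI-only' range."""
    return any(
        PUA_BASE + ASCII_LO <= ord(ch) <= PUA_BASE + ASCII_HI for ch in text
    )


def strip_watermark(text: str) -> str:
    """Undo the mapping, recovering ordinary ASCII text."""
    return "".join(
        chr(ord(ch) - PUA_BASE)
        if PUA_BASE + ASCII_LO <= ord(ch) <= PUA_BASE + ASCII_HI
        else ch
        for ch in text
    )


marked = to_ai_alphabet("Hello, world")
print(looks_ai_generated(marked))   # detection works on untouched output
print(strip_watermark(marked))      # but one pass removes the mark entirely
```

Under these assumptions, detection is trivial as long as nobody touches the text, and removal is exactly as trivial: `strip_watermark` is just the inverse shift, which anyone could apply.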

Read the rest of this entry »

Comments (22)

Is there no / any longer a reason / need to learn a foreign language?, part 2

People, including serious linguists, are beginning to wonder:

John McWhorter, "Are translation apps making the learning of foreign languages obsolete?", NYT 7/25/2023

I remember a time, not too long ago, when John was making a serious effort to learn Mandarin, because he often asked me cogent questions about the language and wanted to know the best methods for learning it.

What we can learn from the Tower of Babel

In Europe, nine out of 10 students study a foreign language. In the United States, only one in five do. Between 1997 and 2008, the number of American middle schools offering foreign languages dropped from 75 percent to 58 percent. Between 2009 and 2013, one American college closed its foreign language program; between 2013 and 2017, 651 others did the same.

At first glance, these statistics look like a tragedy. But I am starting to harbor the odd opinion that maybe they are not. What is changing my mind is technology.

Read the rest of this entry »

Comments (19)

Mandarin pronouns

https://twitter.com/NvrBackDown24/status/1681075557700648962

Read the rest of this entry »

Comments (22)

Knowledge and skills contributed by enslaved Africans

The recent controversy about Florida's new State Academic Standards for Social Studies leaves something out, in my opinion. The point of contention is the assertion (p. 6) that "Instruction includes how slaves developed skills which, in some instances, could be applied for their personal benefit". Critics have taken this as an inappropriate pitch for the benefits of slavery, evoking the "Slavery as Positive Good" viewpoint that was common in the American South before the Civil War.

Missing from the discussion is the fact that the transfer of crucial skills sometimes went in the other direction. In the 17th and 18th centuries, enslaved Africans brought with them the technology that enabled wet rice cultivation in South Carolina and Georgia. Needless to say, the British colonizers knew nothing at all about how to grow rice, especially in converted mangrove swamps. This imported technology led to lucrative rice-cultivation plantations that were essential to Britain's colonization of North America.

Read the rest of this entry »

Comments (2)

Mark Twain's new novel?

Today's Non Sequitur:


Read the rest of this entry »

Comments (14)

Buddhist ideas on Sanskrit-Chinese translation

[This is a guest post by Max Deeg.  Although the following text has profound implications for anyone who is seriously interested in the actualities of translation between two very different kinds of languages from antiquity, it is fundamentally a task for specialists to render this type of Middle Buddhist Hybrid Sinitic into English.  This is both because of the nature of the language itself and due to the fact that it is fairly lengthy.  Consequently, I will not provide phonetic annotations of the entire text, as is my usual practice for shorter passages on Language Log.]

 

Bianji on Sanskrit and Xuanzang as a translator.[1]

Introduction

The following passage is found in the twelfth chapter or fascicle (juan) of Xuanzang’s 玄奘 Datang Xiyu ji 大唐西域記 (Record of the Western Regions of the Great Tang) and is part of what I think is Bianji’s 辯機 (619-?) “Eulogy of the Record” (Jizan 記讚) added to the Record.[2]

The Datang Xiyu ji (Record of the Western Regions of the Great Tang) by the Chinese monk-pilgrim and translator Xuanzang (600?-664; travelled 629-645) is arguably one of the earliest Buddhist Chinese texts translated into a Western language and has had an enormous impact on historical research on Buddhism.[3] Originally written for the second Tang emperor Taizong 太宗 (598-649; ruled from 626) less than one year after Xuanzang’s return from India in 645, the text gives information about the Central Asian regions Xuanzang travelled through on his journey to India (and back) and about India and her different regions, with a focus on the state of Buddhism and its sacred places linked to the life of the Buddha and his disciples. Although the Record has mainly been used in a historicist-positivist fashion in modern scholarship, the text is a multifaceted, complex work containing several layers of “intentionality” that need to be taken into account carefully when reading and interpreting (hence also translating) the text. One of these intentional aspects is to “sell” Buddhism and the ideal of a Buddhist ruler to the Tang emperor.[4]

Read the rest of this entry »

Comments (2)

Radial dendrograms

From Sarah Gao and Andrew Gao, "On the Origin of LLMs: An Evolutionary Tree and Graph for 15,821 Large Language Models", arxiv.org 7/19/2023:

That's not a vinyl — it's a "radial dendrogram" — showing the evolutionary tree of nearly 6,000 Large Language Models posted at Hugging Face. Zeroing in on one quadrant, so you can read the labels:

Read the rest of this entry »

Comments (2)

The many meanings and faces of "vernacular"

During the first twenty years of my academic career, if anybody asked me what my specialty was, I would have told them something like "medieval popular Buddhist vernacular Chinese literature".  In that usage of "vernacular", which I thought was the standard meaning of the term, I simply considered it a register of language and writing that is distinct from and contrasted with "classical" or "literary", and — to my mind — it was parallel to "popular" or "folk" in a cultural spectrum that ran to "elite" at the other end (I was going to say "at the top", but — being a partisan of "popular" and "folk" — I caught myself).

In college, as an English major, being a specialist on the vernacular meant that I was enamored of Chaucer, and in graduate school and as a young Sinologist, it signified that I concentrated on the first sizable body of non-classical / literary texts archeologically recovered from the far western Chinese site of Dunhuang, a subject we have often touched on here on Language Log, especially in recent weeks.

Read the rest of this entry »

Comments (18)

Watermarking text?

Ashley Belanger, "OpenAI, Google will watermark AI-generated content to hinder deepfakes, misinfo", Ars Technica 7/21/2023:

Seven companies — including OpenAI, Microsoft, Google, Meta, Amazon, Anthropic, and Inflection — have committed to developing tech to clearly watermark AI-generated content. That will help make it safer to share AI-generated text, video, audio, and images without misleading others about the authenticity of that content, the Biden administration hopes.

The link goes to a 7/21 White House document with the title "FACT SHEET: Biden-Harris Administration Secures Voluntary Commitments from Leading Artificial Intelligence Companies to Manage the Risks Posed by AI". One of that document's many bullet points:

  • The companies commit to developing robust technical mechanisms to ensure that users know when content is AI generated, such as a watermarking system. This action enables creativity with AI to flourish but reduces the dangers of fraud and deception.

Read the rest of this entry »

Comments (10)