Language Log

Archive for Language and science

Dallas Dodecahedron Daze Days

January 28, 2026 @ 11:39 am· Filed by Victor Mair under Language and art, Language and astronomy, Language and culture, Language and mathematics, Language and philosophy, Language and science

I recently spent a week at my son's campground in the countryside outside Dallas. While there, I was elated to espy a sizable dodecahedron made of twelve substantial wooden panels tightly wrapped in brown, buff leather. It had been constructed by a local artist about a dozen years ago.

Contemplating that cosmic shape brought back all those vibrant discussions of geometry, linguistics, and metaphysics from a year and a half ago. Finding it esthetically and intellectually satisfying to commune with my old friend the dodecahedron, I fell into a reverie beneath those shaggy-scraggly-barked eastern red cedars that seemed to draw me up into their spreading branches, which connected to the universe emanating from the dodecahedron that I held at my waist.

Read the rest of this entry »

Oppenheimer, Einstein, the Atom bomb, Hiroshima, time, death, and the Bhagavad Gita

January 3, 2026 @ 9:11 pm· Filed by Victor Mair under Announcements, Language and religion, Language and science

Sino-Platonic Papers is pleased to announce the publication of its three-hundred-and-seventy-fifth issue:

“How Oppenheimer Mistook Time for Death at Trinity (the A-bomb Test Site) and How the Bhagavad Gītā, Read Properly, Resonates with the Block Universe of Einstein,” by Conal Boyce.

Read the rest of this entry »

Volts before Volta

January 3, 2026 @ 7:06 pm· Filed by Victor Mair under Announcements, Language and archeology, Language and science

Sino-Platonic Papers is pleased to announce the publication of its three-hundred-and-seventy-seventh issue:

“The Baghdad Battery: Experimental Verification of a 2,000-Year-Old Device Capable of Driving Visible and Useful Electrochemical Reactions at over 1.4 Volts,” by Alexander Bazes.

Read the rest of this entry »

Haboob, part 2

August 27, 2025 @ 1:53 pm· Filed by Victor Mair under Borrowing, Etymology, Language and science

This word caught my attention on the news this morning. It was said to be a gigantic dust storm or sandstorm that was passing through central Arizona. As soon as I heard the sound of the word, with its probable triliteral Semitic root, and learned that it referred to some sort of sandstorm, I thought that it was most likely Arabic. And indeed it is.

Read the rest of this entry »

Visualizing linguistic data

August 16, 2025 @ 8:48 am· Filed by Mark Liberman under Language and science

I was recently invited to be interviewed on the topic of "visualizing linguistic data". As I understand it, the point was not to describe the standard stuff, like trees, dependency graphs, logical formulae, or waveforms, spectrograms, spectral slices, formant tracks, F0 tracks, and so on. Rather, the idea was to describe less common kinds of visualizations, or at least somewhat novel ways of using the standard visualizations.

I've done a lot of "visualizing linguistic data" over the years. And talking about these explorations strikes me as problematic, partly because the whole point of visualization is to go beyond talk, and partly because I've used lots of kinds of graphs and tables to explore lots of different questions at different levels of analysis, and it's hard to know where to start and where to stop.

So I've started by making a linked list of relevant LLOG posts, mostly on the phonetic side of things. There are a lot of them, and I'm sure I've left some out. I doubt that any readers will want to do more than click on a few at random; my goal was mainly to give myself (and maybe the interviewer) some background for a possible discussion.

I've listed the posts in chronological order, rather than by topic. FWIW, here they are:

Read the rest of this entry »

Coined Chinese characters: The 24 solar terms, part 4

August 8, 2025 @ 6:41 am· Filed by Victor Mair under Language and astronomy, Language and mathematics, Language and science

As is usually the case, the Wikipedia article on the Chinese calendar is comprehensive and built on consensus. It states:

The Chinese calendar, as the name suggests, is a lunisolar calendar created by or commonly used by the Chinese people. While this description is generally accurate, it does not provide a definitive or complete answer. A total of 102 calendars have been officially recorded in classical historical texts. In addition, many more calendars were created privately, with others being built by people who adapted Chinese cultural practices, such as the Koreans, Japanese, Vietnamese, and many others, over the course of a long history.

A Chinese calendar consists of twelve months, each aligned with the phases of the moon, along with an intercalary month inserted as needed to keep the calendar in sync with the seasons. It also features twenty-four solar terms, which track the position of the sun and are closely related to climate patterns. Among these, the winter solstice is the most significant reference point and must occur in the eleventh month of the year. Each month contains either twenty-nine or thirty days. The sexagenary cycle for each day runs continuously over thousands of years and serves as a determining factor to pinpoint a specific day amidst the many variations in the calendar. In addition, there are many other cycles attached to the calendar that determine the appropriateness of particular days, guiding decisions on what is considered auspicious or inauspicious for different types of activities.
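The continuously running sexagenary cycle mentioned in that passage can be pictured as two counters ticking in step: ten heavenly stems and twelve earthly branches, which return to their starting pair only after sixty steps (the least common multiple of 10 and 12). Here is a minimal sketch in Python, assuming the usual convention that position 0 is jia-zi. The function names are my own, and the sketch deliberately does not tie any position to a real calendar date, since that would require an anchor date that the passage does not supply.

```python
# Toneless pinyin names of the ten heavenly stems and twelve earthly branches.
STEMS = ["jia", "yi", "bing", "ding", "wu", "ji", "geng", "xin", "ren", "gui"]
BRANCHES = ["zi", "chou", "yin", "mao", "chen", "si",
            "wu", "wei", "shen", "you", "xu", "hai"]


def sexagenary_name(index: int) -> str:
    """Return the stem-branch pair at a 0-based position in the 60-day cycle."""
    position = index % 60
    # Stem and branch each advance by one per day, wrapping independently.
    return f"{STEMS[position % 10]}-{BRANCHES[position % 12]}"


def advance(index: int, days: int) -> int:
    """Return the cycle position reached `days` days later (negative counts back)."""
    return (index + days) % 60


# Hypothetical illustration: list the whole cycle, starting from jia-zi.
all_names = [sexagenary_name(i) for i in range(60)]
print(all_names[0], all_names[1], all_names[59])
print(len(set(all_names)))  # number of distinct pairs in the cycle
```

Because the count never resets at month or year boundaries, a day's stem-branch pair depends only on how many days have elapsed, which is why it can serve as a check on dates across different calendar reforms.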

Read the rest of this entry »

Coined Chinese characters: The 24 solar terms, part 3

August 7, 2025 @ 7:03 am· Filed by Victor Mair under Language and astronomy, Language and mathematics, Language and science

The chapter on "Calendar and Chronology" in Brill's Encyclopedia of China Online (2009) was authored by Ho Peng Yoke (1926-2014), who was the Director of the Needham Research Institute from 1990 to 2001. The first two paragraphs of Ho's chapter begin as follows:

The traditional Chinese calendar is lunisolar, i. e. it is based on both the movement of the moon and on what seems to be the orbit of the sun around the earth. The incommensurability of the lunar synodical period of 29.530587… days and the equinoctial year's 365.2421… days has always been the cause for numerous difficulties with respect to the establishment of a calendar in China. In order to replace the former calendars which after a time had lost their validity, roughly 100 different types of calendars were devised over a period of about 2000 years, many of which were never officially adopted. According to Joseph Needham, the history of calendar making is the consequence of attempts to "make the incompatible compatible."
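The incommensurability Ho describes is easy to check with a little arithmetic. The sketch below uses only the two figures quoted above, truncated as given. The comparison of 19 years with 235 months is my own addition, not part of Ho's text: it is the well-known 19-year cycle with seven intercalary months, included here only to show how close, and yet not exact, such a reconciliation can be.

```python
# Mean values in days, as quoted (and truncated) in the passage above.
SYNODIC_MONTH = 29.530587
TROPICAL_YEAR = 365.2421

# Twelve lunar months fall short of one solar year.
lunar_year = 12 * SYNODIC_MONTH
shortfall = TROPICAL_YEAR - lunar_year

# The number of lunar months in a solar year is not a whole number.
months_per_year = TROPICAL_YEAR / SYNODIC_MONTH

# Assumption for illustration: 19 years with 7 intercalary months,
# i.e. 12 * 19 + 7 = 235 lunar months.
nineteen_years = 19 * TROPICAL_YEAR
months_235 = 235 * SYNODIC_MONTH
cycle_gap_days = months_235 - nineteen_years

print(f"12 lunar months:       {lunar_year:.6f} days")
print(f"annual shortfall:      {shortfall:.6f} days")
print(f"lunar months per year: {months_per_year:.6f}")
print(f"235 months - 19 years: {cycle_gap_days:.6f} days")
```

With these figures, twelve lunar months come up about 10.9 days short of the year, which is why an intercalary month has to be inserted every two or three years. Even the 19-year scheme leaves a residue of roughly two hours per cycle, so any fixed rule drifts eventually.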

Read the rest of this entry »

Coined Chinese characters: The 24 solar terms, part 2

August 6, 2025 @ 10:47 am· Filed by Victor Mair under Language and astronomy, Language and mathematics, Language and science

The calendrical system used for defining the dates of traditional Chinese festivals such as ‘Chinese New Year’ (the first day of the first lunar month, now called ‘Spring Festival’, Chun jie 春節, in the PRC), the Mid-Autumn Festival 中秋 (full moon of the 8th lunar month), and so on is the last of the many versions of the Chinese luni-solar calendar that were adopted by successive imperial governments until the fall of the empire in 1911. It is in fact the system adopted by the Qing dynasty in 1644.

Christopher Cullen’s book Heavenly Numbers: Astronomy and Authority in Early Imperial China (Oxford, 2017) gives a detailed account of the successive systems of mathematical astronomy that were used by Chinese astronomical officials in early imperial times to produce the annual luni-solar calendars that were promulgated by imperial authority. The following explanations are taken from Cullen’s book.

Read the rest of this entry »

A taste of Old Uyghur

August 4, 2025 @ 8:55 am· Filed by Victor Mair under Epigraphy, Language and art, Language and science

It's not every day that you get a chance to experience Old Uyghur (language; script). Recently, when I was looking through albums of photographs of medieval Buddhist wall-paintings, I spotted an Old Uyghur inscription:

[Image: the Old Uyghur inscription]

That's from a Five Dynasties (907-979) transformation tableau (biànxiàng 變相) depicting the story of the Buddhist saint, Mahāmaudgalyāyana, rescuing his mother from hell. It is located on the north side of the corridor to the antechamber to Cave 19 at Yulin Grottoes toward the western extremity of Gansu Province, part of the larger complex of medieval Buddhist caves at Dunhuang, which we have often mentioned on Language Log. For the complete and heavily annotated translation of the transformation text on Mahāmaudgalyāyana (Mùlián 目連), see Victor H. Mair, Tun-huang Popular Narratives (Cambridge [Cambridgeshire]; New York: Cambridge University Press, 1983).

Read the rest of this entry »

TREC: 1992-2025 and onwards

July 7, 2025 @ 7:01 pm· Filed by Mark Liberman under Language and science

The 11 tracks of TREC2025 are underway, collectively constituting the 2025 edition of the "Text Retrieval Conference" organized by the National Institute of Standards and Technology. See the call for details and links, and this site for a few words about its history going back to 1992.

Wikipedia has more historical information, although the article's section on "Current tracks" is from 2018, which is not exactly "current".

And the Wikipedia article also doesn't give a clear picture of what TREC accomplished in its early years. Here's what it says about TREC-1:

In 1992 TREC-1 was held at NIST. The first conference attracted 28 groups of researchers from academia and industry. It demonstrated a wide range of different approaches to the retrieval of text from large document collections. Finally TREC1 revealed the facts that automatic construction of queries from natural language query statements seems to work. Techniques based on natural language processing were no better no worse than those based on vector or probabilistic approach.

Read the rest of this entry »

Physics and linguistics notes on the formation of the vocabulary for quantum theory

March 25, 2025 @ 8:50 am· Filed by Victor Mair under Dictionaries, Language and science, Lexicon and lexicography, Translation, Vocabulary

[This is a guest post by Conal Boyce]

Exactly what had become ‘visualizable’ according to Heisenberg in 1927,
and whence the term ‘Blurriness Relation’ in lieu of Uncertainty Principle?

As backdrop for the physics concepts and associated German vocabulary to be explored in a moment, here is a story I call “Quadrille Dance & Shotgun Wedding”:

1925. Heeding the lesson of Niels Bohr’s ill‑fated orbital theory (1913‑1918), Heisenberg is wary of developing any visual model; he wants to “get rid of the waves in any form.” Accordingly, with Max Born and Pascual Jordan, he sets forth his matrix‑mechanics formulation of quantum theory.

Read the rest of this entry »

The Power of Naming

February 4, 2025 @ 11:00 am· Filed by Victor Mair under Classification, Information technology, Language and music, Language and science, Languages, Names, Topolects

[This is a guest post by Conal Boyce]

Overview: Here we look at some technical terms and how they’ve fared since their release to, or adoption by, the public: information theory; (TW) the colored quarks of Nambu and Han; cosmic‑ray decay according to Millikan; the Sinitic languages (Mair) vs. ‘the Chinese language’ (misnomer); Wu’s cosmic chirality as the violation of a nonNoetherian principle.

① information theory is the mother of all factoids. Why would one call it that? Because there is no such thing, only the following phantom utterance that is ubiquitous: “Shannon’s information theory.” In 1948, Shannon wrote a paper on the mathematics of data‑communication technology, and named it accordingly. Put off by its name, science journalists introduced it to the world as “information theory.” The name stuck, suggesting in the minds of innocents something so deep and epochal that it might even shed light on Mozart. Shannon 1948 is the big example of how data and information have been confounded for 3/4 of a century, but it is accompanied by innumerable smaller cases, as when Susskind argues that “in physics we treat them as pretty much the same thing” (paraphrase; details in Appendix A). Here is a rough‑and‑ready demonstration of how different they actually are: “Go.” ←That’s just data, but place it in a context, and a layer of information now “rides on it” (or floats above it, on a different plane) such that this is conveyed: “Go to the store now before it closes”; or this: “Fly now to Hiroshima and drop the bomb.” True, in shop‑talk and hallway conversations, a database developer or data‑comm engineer might toss the terms data and information around as if one believed them to be interchangeable. Then, overheard by someone in the world at large, such casual usage is easily misconstrued, leading astrophysicists to fret in public over the “information” that might be “lost” in a black hole. (As for an actual Theory of Information, we must wait for a superintelligent computer to produce it since that task is far beyond human ability. And once coughed up, it will be so lengthy as to require several lifetimes to read it, and in any case, largely incomprehensible to us.)

Read the rest of this entry »

"Neutrino Evidence Revisited (AI Debates)" | Is Mozart's K297b authentic?

November 13, 2024 @ 6:24 pm· Filed by Victor Mair under Artificial intelligence, Language and music, Language and science

[This is a guest post by Conal Boyce]

Recently I watched a video posted by Alexander Unzicker, a no-nonsense physicist who often criticizes Big Science (along the same lines as Sabine Hossenfelder — my hero). But in this case (link below) I was surprised to see Unzicker play back a conversation between himself and ChatGPT, on the subject of the original discovery of neutrinos — where the onslaught of background noise demands very strict screening procedures and care not to show "confirmation bias" (because one wants so badly to be the first one to actually detect a neutrino, thirty years after Pauli predicted them). It is a LONG conversation, between Unzicker and ChatGPT, perfectly coherent and informative, one that I found very pleasant to listen to (he uses the audio option: female voice interleaved with his voice).

[VHM note: This conversation between Unzicker and GPT is absolutely astonishing. Despite the dense technicality of the subject, GPT understands well what he is saying and replies accordingly and naturally.]

Read the rest of this entry »
