Names of the chemical elements in Chinese
Mike Pope relayed to me the following from his son Zack, a high school physics teacher:
I was wondering what the periodic table of elements looked like in China, and found this image.
This may or may not be the "official" periodic table, but I thought it was interesting to see the similarities in the characters. Specifically the character for gold, which is also the character for metal in general, and is a prefix for a large portion of the periodic table. The character for water is a large part of the character for mercury, and a few others, and all of the gas elements have the same character in them. It makes me wonder what the protocol is for naming new elements in Chinese, since they seem to be focused on the properties of the element itself, and that would take more investigating than might be possible for new elements, which usually only exist for fractions of fractions of seconds. Newly discovered elements these days are named (in English) after people: Bohrium, Rutherfordium, Fermium, Einsteinium, etc., and I wonder what the Chinese equivalent of those elements is.
Zack has raised many good questions.
The first thing we may say about the names of the chemical elements in Chinese is that every single one of them is monosyllabic. This actually causes great problems for Chinese chemists and other scientists, as well as the lay public, since there are so many homophones and near-homophones among them and with other monosyllabic words not on the list. Listening to a lecture or holding discussions that mention chemical elements and hearing the elements referred to by these monosyllabic names is challenging, to say the least. They just don't stand out the way, say, "chlorine" and "hydrogen" do.
The vast majority of the Chinese characters for the elements contain the "gold / metal" radical 金. Next in number are characters that contain the "gas / vapor" radical 气. After that comes a smaller group of characters containing the "stone / rock" radical 石. Last, there are two characters that contain the water radical 氵/ 水: xiù 溴 ("bromine") and gǒng 汞 ("mercury"). In terms of the classification of the elements by state (solid, liquid, gas, unknown) and type (metals [alkali metals, alkaline earth metals, lanthanoids, actinoids, transition metals, post-transition metals], nonmetals [halogens, noble gases, other nonmetals], and metalloids), the division (according to character radicals) into metal, gas, stone, and water is not accurate.
Only a few of the characters for the elements existed in premodern times (e.g., those for "silver", "copper", "iron", "tin", "gold", "lead", "mercury", "carbon", "boron", and "sulfur"). Most of the characters for elements that were isolated during the Industrial Age or discovered more recently have had to be invented from scratch to transcribe the sound of the initial part of the name of the element in Western languages. These characters serve no other purpose than to designate the elements in question, and a number of them do not exist in electronic fonts. Unicode strives to add these newly created characters to the higher levels of its latest versions, but there is always naturally going to be a time lag between the creation of new characters and the time they are actually implemented in Unicode. In addition, as more and more new elements are being discovered, chemists in China, Taiwan, and elsewhere have not yet devised any character for several of them. And that brings up the matter of multiple characters for the same elements and multiple readings for the same characters in Taiwan and China (see the list below).
After receiving Mike's message, I set about doing the necessary research to answer Zack's questions. I was both surprised and disappointed by how hard it was to find a simple numerical list giving the following information for each element: number, symbol, English name, Chinese character (traditional and simplified), Pinyin. Various Chinese versions of the periodic chart of elements were not hard to locate, but they were all unsatisfying in one way or another (not well organized, not very legible, incomplete, etc.). In the end, several colleagues helped me devise our own list, which, for now, can only be found here on Language Log. So far as I can tell, it is more comprehensive and up-to-date than any list of Chinese names for the chemical elements that is available anywhere.
1 H Hydrogen 氫:氢 qīng
2 He Helium 氦:氦 hài
3 Li Lithium 鋰:锂 lǐ
4 Be Beryllium 鈹:铍 pí
5 B Boron 硼:硼 péng
6 C Carbon 碳:碳 tàn
7 N Nitrogen 氮:氮 dàn
8 O Oxygen 氧:氧 yǎng
9 F Fluorine 氟:氟 fú
10 Ne Neon 氖:氖 nǎi
11 Na Sodium 鈉:钠 nà
12 Mg Magnesium 鎂:镁 měi
13 Al Aluminum 鋁:铝 lǚ
14 Si Silicon 硅:硅 guī (PRC); 矽:矽 xì (Tw) (PRC pron. xī)
15 P Phosphorus 磷:磷 lín
16 S Sulfur 硫:硫 liú
17 Cl Chlorine 氯:氯 lǜ
18 Ar Argon 氬:氩 yà
19 K Potassium 鉀:钾 jiǎ
20 Ca Calcium 鈣:钙 gài
21 Sc Scandium 鈧:钪 kàng
22 Ti Titanium 鈦:钛 tài
23 V Vanadium 釩:钒 fán
24 Cr Chromium 鉻:铬 gè
25 Mn Manganese 錳:锰 měng
26 Fe Iron 鐵:铁 tiě
27 Co Cobalt 鈷:钴 gǔ (PRC); gū (Tw)
28 Ni Nickel 鎳:镍 niè
29 Cu Copper 銅:铜 tóng
30 Zn Zinc 鋅:锌 xīn
31 Ga Gallium 鎵:镓 jiā
32 Ge Germanium 鍺:锗 zhě
33 As Arsenic 砷:砷 shēn
34 Se Selenium 硒:硒 xī
35 Br Bromine 溴:溴 xiù
36 Kr Krypton 氪:氪 kè
37 Rb Rubidium 銣:铷 rú
38 Sr Strontium 鍶:锶 sī
39 Y Yttrium 釔:钇 yǐ
40 Zr Zirconium 鋯:锆 gào
41 Nb Niobium 鈮:铌 ní
42 Mo Molybdenum 鉬:钼 mù
43 Tc Technetium 鍀:锝 dé (PRC); 鎝:钅+荅 tǎ (Tw)
44 Ru Ruthenium 釕:钌 liǎo
45 Rh Rhodium 銠:铑 lǎo
46 Pd Palladium 鈀:钯 bǎ (PRC); bā (Tw)
47 Ag Silver 銀:银 yín
48 Cd Cadmium 鎘:镉 gé
49 In Indium 銦:铟 yīn
50 Sn Tin 錫:锡 xī (PRC); xí (Tw)
51 Sb Antimony 銻:锑 tī (PRC); tì (Tw)
52 Te Tellurium 碲:碲 dì
53 I Iodine 碘:碘 diǎn
54 Xe Xenon 氙:氙 xiān
55 Cs Cesium 銫:铯 sè
56 Ba Barium 鋇:钡 bèi
57 La Lanthanum 鑭:镧 lán
58 Ce Cerium 鈰:铈 shì
59 Pr Praseodymium 鐠:镨 pǔ
60 Nd Neodymium 釹:钕 nǚ
61 Pm Promethium 鉕:钷 pǒ
62 Sm Samarium 釤:钐 shān
63 Eu Europium 銪:铕 yǒu
64 Gd Gadolinium 釓:钆 gá
65 Tb Terbium 鋱:铽 tè
66 Dy Dysprosium 鏑:镝 dī
67 Ho Holmium 鈥:钬 huǒ
68 Er Erbium 鉺:铒 ěr
69 Tm Thulium 銩:铥 diū
70 Yb Ytterbium 鐿:镱 yì
71 Lu Lutetium 鑥:镥 lǔ (PRC); 鎦:镏 liú (Tw)
72 Hf Hafnium 鉿:铪 hā
73 Ta Tantalum 鉭:钽 tǎn
74 W Tungsten 鎢:钨 wū
75 Re Rhenium 錸:铼 lái
76 Os Osmium 鋨:锇 é
77 Ir Iridium 銥:铱 yī
78 Pt Platinum 鉑:铂 bó
79 Au Gold 金:金 jīn
80 Hg Mercury 汞:汞 gǒng
81 Tl Thallium 鉈:铊 tā
82 Pb Lead 鉛:铅 qiān
83 Bi Bismuth 鉍:铋 bì
84 Po Polonium 釙:钋 pō (PRC); pò (Tw)
85 At Astatine 砹:砹 ài (PRC); 砈 è (Tw)
86 Rn Radon 氡:氡 dōng
87 Fr Francium 鈁:钫 fāng (PRC); 鍅 fǎ (Tw)
88 Ra Radium 鐳:镭 léi
89 Ac Actinium 錒:锕 ā
90 Th Thorium 釷:钍 tǔ
91 Pa Protactinium 鏷:镤 pú
92 U Uranium 鈾:铀 yóu (PRC); yòu (Tw)
93 Np Neptunium 鎿:镎 ná (PRC); 錼 nài (Tw)
94 Pu Plutonium 鈈:钚 bù (PRC); 鈽:钸 bù (Tw)
95 Am Americium 鋂:镅 méi
96 Cm Curium 鋦:锔 jú
97 Bk Berkelium 錇:锫 péi (PRC); 鉳 běi (Tw)
98 Cf Californium 鐦:锎 kāi (PRC); 鉲:钅+卡 kǎ (Tw)
99 Es Einsteinium 鎄:锿 āi (PRC); 鑀 ài (Tw)
100 Fm Fermium 鐨:镄 fèi
101 Md Mendelevium 鍆:钔 mén
102 No Nobelium 鍩:锘 nuò
103 Lr Lawrencium 鐒:铹 láo
104 Rf Rutherfordium 鑪:钅+卢 lú
105 Db Dubnium U+289C0:钅+杜 dù
106 Sg Seaborgium U+28B4E:钅+喜 xǐ
107 Bh Bohrium U+28A0F:钅+波 bō (PRC); pō (Tw)
108 Hs Hassium U+28B46:钅+黑 hēi
109 Mt Meitnerium 䥑 U+4951:钅+麦 mài
110 Ds Darmstadtium 鐽: dá
111 Rg Roentgenium 錀:钅+仑 lún
112 Cn Copernicium 鎶 gē
113 Uut Ununtrium ?
114 Fl Flerovium 鈇: fū
115 Uup Ununpentium ?
116 Lv Livermorium 鉝: lì
117 Uus Ununseptium ?
118 Uuo Ununoctium Eka氡:Eka氡 or 118號元素:118号元素
[Note: the missing simplified forms for 113, 115, and 117 would appear in utf-8 format, but I am not able to process them on my computers and post them to Language Log.]
Addendum: This, from Ariel Herman, has been in my drafts folder since 10/27/10:
Tom Lehrer's elements song (1959).
And here's the elements song in Japanese. It's all in katakana gairaigo.
[Thanks to Rich Warmington, Mark Swofford, Richard Cook, and Silas Brown]
[Update from Apollo Wu: About the Chinese chemical names, I think they created a high learning barrier to anyone who wants to study chemistry in Chinese. As a high school student, I much preferred studying physics rather than chemistry, because I didn't have to confront all these strange Chinese characters. A similar reason for not majoring in chemistry may very well explain why Chinese chemical and pharmaceutical industries are still backward even today. Such a situation is reflected in poor product quality.]
Stephan Stiller said,
May 3, 2015 @ 11:20 pm
1. Yet another example of how Chinese characters (with the idea that they each mean or must mean something on their own) create – rather than solve – problems.
2. I am very glad to find a clear exposition of the matter that doesn't glorify the Chinese way to spell the chemical elements. One common falsehood propagated by defenders of Chinese characters as a writing system is:
Well, every time I hear nonsense like that I want to scream. Because clearly Chinese nomenclature for chemical elements is nowhere near as systematic as some make it out to be, and (as was pointed out above), most are metals anyways, and the rest ("it's a gas, well – duh") isn't very illuminating, as in: doesn't exactly reveal many bits of information. Add to that that the phonetic components aren't all that helpful either (eg 鉭/tǎn has the phonetic component 旦/dàn; to find the exact pronunciation tǎn one has to consider certain characters containing 旦 such as 坦), meaning natives do get it wrong, a lot. And even if it worked, the value of knowing something (and in fact not very much) about the periodic table of elements (or that something is a species of tree or fish) isn't exactly large and hence doesn't justify that writing system. (And, how often are you in a context where knowing that something is a metal or gas or tree or fish – again, not a lot of information – is really crucial to comprehending something? I mean, in a context that doesn't already make it clear that you're talking about a chemical element or tree or fish?) Not that Latin scientific nomenclature is any more lucid, to be fair.
Yao said,
May 4, 2015 @ 1:14 am
This periodic table at the back of my middle/high school chemistry books looked very different from the one linked, in which the short-handed spelling and the order number were prominently displayed. For all practical purposes it was never required to remember the names of any but the two dozen or so most common elements. The Chinese names were just there to approximate the pronunciation of the Latin names, and people seem to mostly just ignore them in research.
E-Ping Rau said,
May 4, 2015 @ 1:47 am
From my education I pronounce Beryllium and Argon as "pǐ" and "yǎ" respectively, other than that the list seems fine (at least the Taiwan part of it); I actually have a hard time myself remembering the right pronunciation for many of the metal elements because they are so rarely encountered (especially the rare earth metals).
The names of fishes are definitely another can of worms. Like the chemical elements, traditionally many characters exist solely for that purpose, and the "fish" radical has many strokes which makes most such characters cumbersome to write. Not to mention nowadays we almost always add the morpheme "fish" (魚 yú) to the names of fishes to avoid confusion, thus making these characters super redundant (鯛魚、鰹魚、鰈魚、鮪魚、鯖魚). I think maybe that's what we ought to do with the chemical elements too – use one or several syllables to represent the sound, and then add the morpheme "element" (素 sù) to the name of the elements like what the Japanese did with many of the more common elements.
One *slight* advantage of monosyllabic names for chemical elements is that rows or columns of elements have a better rhythm for us to memorize : 氫鋰鈉鉀銣銫鍅 or 氟氯溴碘砈 sound like just another traditional poem.
Eli Nelson said,
May 4, 2015 @ 1:56 am
@Yao:
That's fascinating. So it's almost like the Japanese situation, except the official spelling is different. When you say "the short-hand spelling", I assume you're talking about the Latin-alphabet element symbol used in chemical formulas and the like?
Richard W said,
May 4, 2015 @ 3:15 am
It might be interesting to add a column of information about the origin of the Chinese name of each element.
Some of the element names such as gǒng for mercury are, I presume, of Chinese origin.
In many other cases, the Chinese name was chosen to sound like the start of the English (or at least, non-Chinese) name. For example, I suppose the name lǐ comes from "lithium".
Some are not so obvious. "Tungsten", for example, is wū in Chinese, and that is presumably from the name "wolfram", which is used in some European countries instead of "tungsten".
I suppose the name dōng for radon comes from the second syllable of "radon" (presumably because léi was already chosen for radium).
"Francium was discovered by Marguerite Perey in France (from which the element takes its name) in 1939", says Wikipedia. France is Fǎguó in Chinese, so the element's Chinese names (fāng in the PRC and fǎ in Taiwan) apparently come from the Chinese word for the country in which the element was discovered (by analogy with the origin of the English name).
By the way, I see that in Taiwan, both sulfur (16) and lutetium (71) are called liú. Can anyone spot other homonyms?
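One way to hunt for such homonyms is to group the readings by region. The following is only a minimal sketch: `SAMPLE` is a handful of rows copied by hand from the list in the post (not the full table), and the name `homophones` is made up for this illustration.

```python
import unicodedata
from collections import defaultdict

# A small sample copied from the list in the post: (symbol, region, pinyin).
# "both" marks a reading shared by the PRC and Taiwan.
SAMPLE = [
    ("S", "both", "liú"),
    ("Lu", "Tw", "liú"),
    ("Lu", "PRC", "lǔ"),
    ("Se", "both", "xī"),
    ("Sn", "PRC", "xī"),
    ("Sn", "Tw", "xí"),
    ("Po", "PRC", "pō"),
    ("Po", "Tw", "pò"),
    ("Bh", "PRC", "bō"),
    ("Bh", "Tw", "pō"),
]

def homophones(entries, region):
    """Group element symbols by reading for one region ("PRC" or "Tw")."""
    by_reading = defaultdict(list)
    for symbol, where, pinyin in entries:
        if where in (region, "both"):
            # Normalize so that precomposed and combining tone marks compare equal.
            key = unicodedata.normalize("NFC", pinyin)
            by_reading[key].append(symbol)
    # Keep only readings shared by two or more elements.
    return {p: syms for p, syms in by_reading.items() if len(syms) > 1}

print(homophones(SAMPLE, "Tw"))   # readings shared in Taiwan usage
print(homophones(SAMPLE, "PRC"))  # readings shared in PRC usage
```

On this sample, the Taiwan readings pair sulfur with lutetium, and the PRC readings pair selenium with tin. Running the same grouping over the complete list would be needed to answer the question properly.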
Simon P said,
May 4, 2015 @ 5:45 am
Interestingly, the English name "tungsten" comes from Swedish (heavy stone), but the element is actually called "volfram" in Swedish (which is a more awesome name, anyway, since it means "wolf froth"), from German. "Tungsten" is only used for the mineral scheelite in Swedish.
Michael Watts said,
May 4, 2015 @ 6:04 am
There's nothing necessary about this. Unicode already supports making glyphs out of several combined code points, so a lowercase e with acute accent might be the single code point U+00E9 ("latin small letter e with acute"), or it might be the two code points U+0065, U+0301 ("latin small letter e" followed by "combining acute accent"). You could make an f-with-acute-accent as the sequence U+0066, U+0301. There's no reason not to assemble Chinese characters the same way. In fact, I thought this was also already supported; I remember a comment on the duang post that represented the new symbol (成 above 龙, if I remember right) as what my browser showed as three glyphs: a box with a horizontal dividing line, a 成, and a 龙.
VHM has even posted a custom character to LL himself, the 茶 surrounded by a box (like 困, but with 茶 in the center instead of 木). What one piece of software already does, more software can do in the future.
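The combining-sequence behaviour described in this comment can be checked with Python's standard `unicodedata` module. This is only an illustrative sketch of how Unicode normalization treats the two spellings of é and the f-with-acute sequence. It says nothing about whether a given font will draw them well.

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# NFC composes the pair into the single code point; NFD splits it again.
assert unicodedata.normalize("NFC", decomposed) == precomposed
assert unicodedata.normalize("NFD", precomposed) == decomposed

# f + combining acute has no precomposed code point,
# so NFC leaves it as a two-code-point sequence.
f_acute = "f\u0301"
assert unicodedata.normalize("NFC", f_acute) == f_acute
assert len(f_acute) == 2

print(unicodedata.name(precomposed))
print([unicodedata.name(c) for c in f_acute])
```

The sequence is valid Unicode text either way. How it is rendered is left to the font and the text-shaping software.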
J. Goard said,
May 4, 2015 @ 6:35 am
Wow, a lot of interesting differences from Korean:
The lighter elements generally use 소 (so;素) 'origin, nature' => 'element'. A couple are similar: 봉소 (bong-so) 'boron', from 봉사 (bong-sa) '~sand = borax'; 탄소 (tan-so) 'carbon'. Several element names are Indo-European borrowings, apparently German: 리튬 (rityum) 'lithium', 나트륨 (nateuryum) 'sodium'. But several others come from interestingly different Chinese characters: 산소 (san-so) 'oxygen' uses the character 酸 'sour, acid'; 질소 (jil-so) 'nitrogen' uses the character 窒 ('blocking, stopping') (which I hope someone can explain to me); 수소 (su-so) 'hydrogen' straightforwardly uses 水 'water'; 염소 (yeom-so) 'chlorine' uses 鹽 'salt'.
The most interesting to me is 주석 (ju-seok) 'tin', which is 柱石 'pillar-stone'. Knowing nothing about ancient construction methods, that doesn't strike me as a very good fit of metal to function! :-)
shubert said,
May 4, 2015 @ 6:59 am
You are right, some …by TW is better.
Ray Girvan said,
May 4, 2015 @ 7:11 am
A chance to clear a few bookmarks: the Wikipedia page Chemical elements in East Asian languages has links to three very interesting papers on this topic.
* The Chinese Periodic Table: A Rosetta Stone for Understanding the Language of Chemistry in the Context of the Introduction of Modern Chemistry into China.
* A New Inquiry into the Translation of Chemical Terms by John Fryer and Xu Shou.
* Chinese Terms for Chemical Elements.
Nickolas said,
May 4, 2015 @ 7:18 am
Reading this post and the comments leads me to the inevitable end line… Why was it so hard to find/ create this multilingual list of the fundamental building blocks of the Universe and would someone please create a "Rosetta Stone" spreadsheet of the periodic table in (at least) the top ten to fifteen world languages and highlight the differences. Aren't we heading toward a universal world language within the next hundred years or so? There should be a Universal periodic table.
Victor Mair said,
May 4, 2015 @ 7:19 am
@Michael Watts
"There's nothing necessary about this…. VHM has even posted a custom character to LL himself…."
Yes, I did post that custom character (and probably have posted a few others over the years), but it was indeed a custom character. That means I had to go to special lengths to have it created, and it is not reproducible outside of the context where I presented it. I suppose that people could copy it as a sort of picture, which sometimes happens with weird characters (by "weird" I mean that they do not exist in any electronic fonts, including the largest ones). The fact that you had to describe my custom character as "茶 surrounded by a box (like 困, but with 茶 in the center instead of 木)" instead of just typing it shows that there is indeed a lag time between the creation of a new character and the time it gets included in electronic fonts, if ever.
Such a situation is inconvenient for a science like chemistry, where new characters are occasionally needed. And, as I have pointed out in various Language Log posts, new characters continually pop up in many other areas of culture and science.
I hope that Richard Cook, a top researcher at Wenlin Institute and an important consultant on Chinese characters at the Unicode Consortium, will comment on how the creation and implementation of new and rare characters, such as those for recently discovered chemical elements, are handled at Wenlin and Unicode.
http://wenlin.com/
http://unicode.org/
Michael Watts said,
May 4, 2015 @ 7:34 am
Using 酸 for oxygen doesn't seem particularly surprising to me, compare the english word ('from Greek oxys "sharp, acid"') or the german word (sauerstoff "sour stuff"). If the Korean names of lithium and sodium come from German, it's pretty straightforward for the Korean names for oxygen and hydrogen to come from German too.
I don't see any etymological justification for naming chlorine after salt, but it's pretty easy to explain from first principles.
Michael Watts said,
May 4, 2015 @ 7:37 am
Victor Mair:
You completely missed my point about combining characters. You can type an f-with-acute-accent in unicode despite the fact that it's not specifically supported by a font. LaTeX will do the same. In order to type a character, it is not necessary for that character to be included in an electronic font.
Victor Mair said,
May 4, 2015 @ 7:50 am
@Michael Watts
I didn't miss your point. Being able to type characters the way you describe is quite a different kettle of fish, as it were, from being able to transmit them freely through electronic media, which is the issue I originally raised, but which you said is not "necessary".
Jerry Friedman said,
May 4, 2015 @ 8:44 am
J. Goard: "Die deutsche Bezeichnung Stickstoff erinnert daran, dass molekularer Stickstoff Flammen löscht („erstickt“) oder dass in reinem Stickstoff Lebewesen ersticken."
From German Wikipedia article on nitrogen.
Google translate, slightly edited: "The German term Stickstoff recalled that molecular nitrogen extinguished ("choked") flame or that creatures suffocate in pure nitrogen."
Somebody else will have to help you with just plain Stick. Maybe it's connected to stecken, 'to plug'.
Simon said,
May 4, 2015 @ 9:05 am
Stickstoff = "choke/suffocate material".
flow said,
May 4, 2015 @ 9:05 am
@Michael Watts a truly working 漢字生成器 (all the more if it came with as broad a level of adoption as Unicode itself) would certainly be a very desirable piece of software. Unfortunately, no one has succeeded in accomplishing this so far. All the efforts I know of that produced papers showed somewhat promising but (typically) rather awkward character shapes, and you do not often get to see follow-up reports.
As far as character generation goes, even Korean Hangeul are still mostly used as precomposed units, although the problems there should be orders of magnitude easier to solve (given there are just dozens of shapes to combine, not many hundreds, and that their internal structure is also graphically rather simpler, *plus* in recent years a lot of word processors and web browsers have started to show series of jamo as (precomposed!) Hangeul syllables). Everything's in place but the dynamic production of the displayed shapes.
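The mapping from a series of jamo to a precomposed syllable is purely arithmetic in Unicode. It can be sketched as follows, using the constants from the Unicode Standard's Hangul syllable composition algorithm; the function name `compose_syllable` is made up for this illustration.

```python
import unicodedata

# Constants from the Unicode Standard's Hangul composition algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
V_COUNT, T_COUNT = 21, 28

def compose_syllable(lead, vowel, tail=None):
    """Combine conjoining jamo into one precomposed Hangul syllable."""
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    t_index = ord(tail) - T_BASE if tail else 0
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)

jamo = "\u1112\u1161\u11ab"  # leading h + vowel a + trailing n

# The hand-rolled arithmetic and the library's NFC normalization agree.
assert compose_syllable(*jamo) == "\ud55c"
assert unicodedata.normalize("NFC", jamo) == "\ud55c"
```

This only selects which precomposed code point to use. The glyph for that code point still comes ready-made from the font, which is the point being made above.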
There were a few papers produced by people at the Academia Sinica in Taiwan in the late 80s and early 90s; one of the envisioned use cases was to dynamically produce subtitles using a "set-top box" (a small custom purpose computer that goes between the TV set and the VCR). My guess is that this stalled because of advances in character encoding (BIG5 in this case) and because computing power and memory became cheaper by the year, so pretty soon very decent character shapes could be stored in ROM and displayed on-screen.
If you think you can solve the technical and aesthetic challenges involved in linking the left and right hand sides of equations such as "鐳=⿰金雷" or "鐳=⿰金⿱雨田", go ahead and try!
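Reading the right-hand sides of such equations is the easy half of the problem. A description sequence like ⿰金⿱雨田 can be parsed into a tree in a few lines. The sketch below assumes only the twelve description characters U+2FF0 to U+2FFB (two of which take three operands); `parse_ids` is a name made up for this example. Turning the resulting tree into a well-proportioned glyph is the hard part, and nothing here attempts it.

```python
# Ideographic Description Characters U+2FF0..U+2FFB.
TERNARY = {"\u2ff2", "\u2ff3"}  # these two take three components
BINARY = {chr(c) for c in range(0x2FF0, 0x2FFC)} - TERNARY

def parse_ids(text):
    """Parse a prefix-notation description into (operator, [parts]) tuples."""
    def parse(pos):
        if pos >= len(text):
            raise ValueError("description ends before all components are given")
        ch = text[pos]
        if ch in BINARY or ch in TERNARY:
            arity = 3 if ch in TERNARY else 2
            parts = []
            pos += 1
            for _ in range(arity):
                node, pos = parse(pos)
                parts.append(node)
            return (ch, parts), pos
        # Any other character is a leaf component.
        return ch, pos + 1

    tree, end = parse(0)
    if end != len(text):
        raise ValueError("trailing characters after a complete description")
    return tree

print(parse_ids("⿰金雷"))
print(parse_ids("⿰金⿱雨田"))
```

The second call yields a left-right split whose right half is itself a top-bottom split, matching the nesting in the equation quoted above.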
Roger Lustig said,
May 4, 2015 @ 9:08 am
The German term for HCl is "Salzsäure"–"Salt acid."
flow said,
May 4, 2015 @ 9:22 am
@VHM
The simplified form of #110 Ds Darmstadtium 鐽 is U+2B7FC, ⿰钅达.
The simplified form of #116 Lv Livermorium 鉝: lì is U+2B7F7, ⿰钅立.
The simplified form of #114 Fl Flerovium 鈇: fū is U+2B4E7, ⿰钅夫.
"The simplified characters for meitnerium (Mt) and copernicium (Cn) are not encoded as of Unicode 7.0 (June 2014)"—http://en.wikipedia.org/wiki/Chemical_elements_in_East_Asian_languages
NB "simplified form" here relates only to "what shape one should reasonably expect knowing how simplified characters are typically formed from their traditional counterparts", not necessarily "this is accepted usage in the PRC".
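For anyone who wants to try these code points on their own machine, a small sketch follows. The helper name `from_notation` is made up, and the three code points are simply the ones quoted in this comment. It converts the U+ notation into characters and reports why older software may struggle with them: all three lie outside the Basic Multilingual Plane.

```python
def from_notation(text):
    """Turn a "U+XXXX" string into the character it names."""
    if not text.upper().startswith("U+"):
        raise ValueError(f"not a code point in U+ notation: {text!r}")
    return chr(int(text[2:], 16))

for notation in ("U+2B7FC", "U+2B7F7", "U+2B4E7"):
    ch = from_notation(notation)
    plane = ord(ch) >> 16                           # 0 would be the BMP
    utf8_bytes = len(ch.encode("utf-8"))            # bytes needed in UTF-8
    utf16_units = len(ch.encode("utf-16-be")) // 2  # 2 means a surrogate pair
    print(notation, "plane", plane, "utf-8 bytes", utf8_bytes, "utf-16 units", utf16_units)
```

Each of the three sits in plane 2, takes four bytes in UTF-8, and needs a surrogate pair in UTF-16. Whether it then displays as a character or as an empty box depends on the fonts installed.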
Emily said,
May 4, 2015 @ 10:29 am
Poul Anderson's "Uncleftish Beholding" might be of interest here (for anyone who hasn't already read it):
https://groups.google.com/forum/message/raw?msg=alt.language.artificial/ZL4e3fD7eW0/_7p8bKwLJWkJ
In a similar vein, Douglas Hofstadter mentions Chinese names for subatomic particles in Le ton beau de Marot. These include a character meaning "seed" (among other things, but in this context it's best translated as "seed"), and for example proton is literally "first seed." I can't find any info about these words online, though.
In the same book Hofstadter also mentions Chinese names for dinosaurs which are more or less calques of their scientific names combined with "dragon"– e.g. a Triceratops becomes a "three-horn dragon." (And more recent dinosaur discoveries from China actually have Linnaean names that are partially or entirely Chinese– such as the wonderful Yi qi 'strange wing', which looks like a velociraptor had babies with a bat.)
a Chinese speaker said,
May 4, 2015 @ 11:06 am
「So far as I can tell, it is more comprehensive and up-to-date than any list of Chinese names for the chemical elements that is available anywhere.」
Can you clarify how/why you think your list is better than, say,
https://zh.wikipedia.org/wiki/%E5%8C%96%E5%AD%B8%E5%85%83%E7%B4%A0
?
It seems to me that yours is not as complete, or well formatted, or useful as the above one. I could easily find lots of such tables on Internet.
J. W. Brewer said,
May 4, 2015 @ 11:28 am
Interesting to learn from Ray Girvan's link that some of the Japanese element names that are neither pre-modern nor phonetic transliterations of a/the Western name are calques of the German names. It makes plenty of sense in terms of direct Meiji-era influence, but still interesting, not least because it's a reminder that chemistry was one of those sciences where at one point German was the default international scholarly language before English had displaced it.
Michael Watts said,
May 4, 2015 @ 11:33 am
Victor Mair:
Being able to represent characters as valid unicode necessarily implies being able to transmit them freely through electronic media. Unicode is well supported at this point. You said that there must always be a delay between the invention of a character and its being implemented in unicode; this is not true. There will always be a delay between the invention of a character and its being given a dedicated unicode code point, but there is no reason, in theory, why a new character would ever need a dedicated code point. Similarly, you can see the spurious accent I've added to "death and tax́es are life's only certainties" despite the fact that there is no unicode code point for small letter x with acute accent.
As flow points out, dynamically constructed chinese characters are not currently supported (except, I guess, by wenlin). http://www.unicode.org/charts/PDF/U2FF0.pdf specifically calls out that the "ideographic description characters" he's using "are visibly displayed graphic characters, not invisible composition controls". But that's a statement about the technology we have today, not a statement about the inherent limitations of working with 汉字. And in fact we already have the technology to construct characters dynamically today — wenlin already does it.
Ray Girvan said,
May 4, 2015 @ 11:43 am
If it's not digressing too much to other languages: although most Finnish element names just mirror the usual Latin-based ones, a few are nice 19th century coinages from its own Finno-Ugric roots, mirroring the property-based etymologies of English, German, etc.
* "happi" = oxygen: relating to "hapan" (sour) and "happo" (acid).
* "typpi" = nitrogen, relating to the dialect form "typehtyä" (to choke or smother).
* "vety" = hydrogen, relating to "vesi" / "vete" +"-y" (water-y)
* "pii" = silicon, from "piikivi" (flint).
Victor Mair said,
May 4, 2015 @ 12:13 pm
@Michael Watts
You're talking about theory and the future (which is always receding). I'm talking about reality and the present.
flow said,
May 4, 2015 @ 12:15 pm
@Michael Watts: do you have more precise info about dynamic characters in Wenlin?
Victor Mair said,
May 4, 2015 @ 12:21 pm
@a Chinese speaker
I saw that Wikipedia article and looked at about two dozen other sites. Ours is more complete (for example, we have full information for #118), we give variants for Taiwan and China (both for the characters and for the pronunciation), etc. In our deliberations, my colleagues have provided even more detailed information, which they may or may not add as comments.
For some recent additions, see this comment by flow: http://languagelog.ldc.upenn.edu/nll/?p=18877#comment-1494565
a Chinese speaker said,
May 4, 2015 @ 3:46 pm
The name of element 118 was no longer missing from that Wikipedia page when I last checked. Also, variants for Taiwan and China (both for the characters and for the pronunciation), etc. are NOT missing from that Wikipedia page. For example, the several variants mentioned in your example http://languagelog.ldc.upenn.edu/nll/?p=18877#comment-1494565 ALL appear in that Wikipedia page.
David Morris said,
May 4, 2015 @ 3:55 pm
Possibly related to this is the element names in Anglish: http://anglish.wikia.com/wiki/Fading_of_Ormotes
Michael Watts said,
May 4, 2015 @ 4:01 pm
flow:
I've never used wenlin. However, from what I can see here:
http://www.sinosplice.com/life/archives/2011/02/16/wenlin-4-0-review#cdl
http://guide.wenlininstitute.org/wenlin4.2/Character_Description_Language#Wenlin_Stroking_Box:_Advanced_CDL_Features
they appear to use their own extension of SVG to define characters.
David Marjanović said,
May 4, 2015 @ 4:10 pm
German Stickstoff, from ersticken "suffocate", where -en is the infinitive ending and er- indicates successful completion. Compare the adjective stickig, which is said about bad air in a closed room.
Stickstoff is itself modeled after French azote, which is from the Greek for "lifeless".
Stoff is "substance", "material"; sauer is "acidic" – applied to "sour milk", but never to "a sour odor".
As far as global communication among scientists is concerned, we're already there; I'll just drop the vague hint that there's a reason I'm writing this in English, which isn't my native language.
As far as speaking in daily life is concerned, we don't seem to be headed there at all… though I refuse to even speculate about what might happen in a hundred years.
David Marjanović said,
May 4, 2015 @ 4:14 pm
I can't find any pinyin or bopomofo on it. For that matter, there isn't any on the page for 鉝 Livermorium either; if I could read the whole page, I still wouldn't know how 鉝 is pronounced (not knowing how precise the phonetic part of the character is intended to be).
David Marjanović said,
May 4, 2015 @ 4:15 pm
(Stoff is also "cloth", interestingly enough.)
Jongseong Park said,
May 4, 2015 @ 5:27 pm
In case anyone is still wondering, most of the familiar element names in Korean follow the Japanese names, which in turn are heavily indebted to the German names.
The only purely native element name in Korean that I can think of is 구리 guri "copper". I guess 쇠 soe could have been used for iron, but then it is also a generic term for various metals so we go with the Sino-Korean 철 鐵 cheol instead. I thought 납 nap "lead" was a purely native word, and indeed the 표준국어대사전 Great Dictionary of Standard Korean shows no hanja for it, indicating that it is not Sino-Korean, but it suggests an etymology from 鑞 (랍 rap "solder; tin; platinum").
There is some serious confusion over the names of chemical elements in Korea right now because around ten years ago, the Korean Chemical Society (대한화학회) pushed through a very controversial overhaul of chemistry terms, abandoning the traditional names based on German forms for forms judged closer to American English forms. There are several reasons this was a terrible idea, but let me simply illustrate.
iodine 요오드 yoodeu (cf. German Jod) was changed to 아이오딘 aiodin
manganese 망간 manggan (cf. German Mangan) was changed to 망가니즈 mangganijeu
xenon 크세논 keusenon was changed to 제논 jenon
titanium 티타늄 titanyum was changed to 타이타늄 taitanyum
butane 부탄 butan was changed to 뷰테인 byutein
propane 프로판 peuropan was changed to 프로페인 peuropein
methane 메탄 metan was changed to 메테인 metein
Just one of the idiotic aspects about this is that the new forms don't even reflect English pronunciation very well but are instead a hodgepodge of spelling pronunciation with some English diphthongs thrown in. If we were to apply the standard Korean rules for transcribing English pronunciations, we would write 맹거니즈 maenggeonijeu for manganese, 지논 jinon for xenon, 타이테이니엄 taiteinieom for titanium, and 메세인 mesein for methane.
If the KCS had their way, they would also have changed 비타민 bitamin "vitamin" to 바이타민 baitamin and 비닐 binil "vinyl" to 바이닐 bainil to approach American English pronunciation (British English is more likely to use a short i in the first syllable of "vitamin", which may be a clue that relying on a language that itself has trouble making up its mind about the correct pronunciation isn't the best idea). They were thankfully not successful in that attempt.
Evidently, the KCS decided not only that their proposal was an improvement, but that it was absolutely necessary to change the standardized terms we had been using since 1987. Apparently the new terms were introduced in elementary textbooks starting in 2007. Understandably, chemistry teachers were hostile to this "reform" and other disciplines were also sticking to the old forms, but I haven't followed the story in more recent years. I personally will stick to the old forms.
Martin said,
May 4, 2015 @ 6:16 pm
@flow
I was interested in this and looked it up, and I think he might be referring to http://wenlin.com/cdl and http://guide.wenlininstitute.org/wenlin4.2/Character_Description_Language, but the relevant part of that is just a specialized font format language (i.e., sequence of stroke/component + position instructions). This doesn't abstract the shape of the character the way Unicode is supposed to (e.g. the double storey 'a' and 'g' are "the same" as the single storey 'a' and 'g').
Chau Wu said,
May 4, 2015 @ 7:13 pm
British English spelling of:
aluminum is aluminium, e.g., aluminium saucepans/baseball bats/foil;
sulfur is sulphur; although the Royal Society of Chemistry in the UK has adopted the spelling sulfur, sulphur remains the usual spelling in British, Irish, South African and Indian English.
By the way, pH meter is pronounced in Japanese as phee-ha-mitaa, reflecting its German origin.
Michael Watts said,
May 4, 2015 @ 7:49 pm
flow:
I've never used wenlin. However, from what I can see here:
http://www.sinosplice.com/life/archives/2011/02/16/wenlin-4-0-review#cdl
http://guide.wenlininstitute.org/wenlin4.2/Character_Description_Language#Wenlin_Stroking_Box:_Advanced_CDL_Features
they appear to use their own extension of SVG to define characters. That's not really workable as a transfer format (like unicode), but I don't see why a sequence like ⿰金⿱雨田 couldn't be interpreted into that format on the viewing end.
Conor Quinn said,
May 4, 2015 @ 9:34 pm
There's a nice periodic table on the back endsheet of the easily found 现代汉语词典 (商务印书馆). My copy dates from the late '90s, though, so it only goes up to what was then (in some places) called hahnium, now standardly dubnium. I also remember fun times looking through calligraphy guides that showed you how to do grass script versions of oxygen and fluorine, both of which are quite beautiful characters.
Richard W said,
May 5, 2015 @ 3:34 am
@ a Chinese speaker
Re: variants for Taiwan and China (both for the characters and for the pronunciation), etc. are NOT missing from that Wikipedia page
Does the Wikipedia webpage mention, for example, that
– Meitnerium is pronounced mài in Chinese?
– the name for Bohrium is bō in the PRC and pō in Taiwan?
I couldn't see pronunciation details such as these on the Wikipedia page, but they are on the list shown above.
The list on this page was essentially constructed by starting from a Chinese Wikipedia list of elements, and then adding missing information: the pronunciation of the characters (in both the PRC and Taiwan) and English names. That makes it a more comprehensive list, and one that is apparently not easily found elsewhere.
Wikipedia includes both China and Taiwan variants but not pronunciations.
The following list includes pronunciations, but not Taiwan variants. Nor does it include the names of some elements, such as lì (for Livermorium).
http://retype.wenku.baidu.com/view/c6c0d883ec3a87c24028c4d1.html?re=view
The list shown in this Language Log post (above) has the PRC characters, the Taiwan characters, and their pronunciations, all combined in a single list.
flow said,
May 5, 2015 @ 5:28 am
@Michael Watts "they appear to use their own extension of SVG to define characters. That's not really workable as a transfer format (like unicode), but I don't see why a sequence like ⿰金⿱雨田 couldn't be interpreted into that format on the viewing end."
—Thanks for the links!—CDL, i'm afraid, is more indicative of the problem than the solution, as it is seemingly little more than an abstraction over SVG that knows components and stroke types, but still needs detailed instructions where to place those things in the finished character and how to scale stroke distances (independently from scaling the stroke weights!)—which, had we a suitable algorithm, would be the solution.
In other words, CDL allows you to use vector images of what you want to appear on the screen of the far computer; in a sense that's worse than sending ⿰金⿱雨田 because at least from this formula (written in what is called IDL) a receiver can figure out that what is meant is a left/right construction with 'metal' on the left, and a top/bottom part with 'rain' on top and 'field' on bottom. Essentially, BTW, this is what you get from a Unicode string containing composed characters: "an 'x' with an 'acute' on top of it", and so on. The CDL effort is certainly laudable and (in restricted environments) also workable; however, finding the precise figures for locating, rearranging and scaling character components from an IDL description has not been done yet.
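To make the point about the formula concrete, here is a minimal sketch of how a receiver might recover the structure of ⿰金⿱雨田. It assumes the operators are the Unicode Ideographic Description Characters (U+2FF0 to U+2FFB), of which two take three operands and the rest two; the function name `parse_ids` is made up for illustration. It only recovers the nesting and says nothing about placement or scaling, which is the unsolved part.

```python
# Ideographic Description Characters: U+2FF0..U+2FFB.
# U+2FF2 and U+2FF3 take three operands; the others take two.
TERNARY = {chr(0x2FF2), chr(0x2FF3)}
BINARY = {chr(c) for c in range(0x2FF0, 0x2FFC)} - TERNARY


def parse_ids(seq):
    """Parse a description sequence into a nested tuple (operator, part, ...)."""

    def parse(i):
        if i >= len(seq):
            raise ValueError("sequence ended before all operands were given")
        ch = seq[i]
        if ch in BINARY or ch in TERNARY:
            arity = 3 if ch in TERNARY else 2
            parts = []
            i += 1
            for _ in range(arity):
                node, i = parse(i)
                parts.append(node)
            return (ch, *parts), i
        # Anything else is treated as a plain component.
        return ch, i + 1

    tree, end = parse(0)
    if end != len(seq):
        raise ValueError("trailing characters after a complete sequence")
    return tree


print(parse_ids("⿰金⿱雨田"))
```

The result is a left/right node whose right-hand part is itself a top/bottom node, i.e. exactly the "metal on the left, rain over field on the right" reading.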
flow said,
May 5, 2015 @ 5:45 am
One weird thing about the Chinese Wikipedia article about the elements is that for a number of elements (#104–109, 111) the characters (which are from the Unicode astral planes) are given as little bitmap images. Some other 'astral' characters appear as text, though. The list is certainly valuable, but it does omit a lot of information, too. In some ways the Wenyan article https://zh-classical.wikipedia.org/wiki/%E5%8C%96%E5%AD%B8%E5%85%83%E7%B4%A0 is even better, if much shorter. The Japanese version http://ja.wikipedia.org/wiki/%E5%85%83%E7%B4%A0%E3%81%AE%E4%B8%AD%E5%9B%BD%E8%AA%9E%E5%90%8D%E7%A7%B0 is quite comprehensive, but all of the 'astral' Unicode code points are split into components.
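For readers unsure what 'astral' means here: it refers to code points above U+FFFF, outside the Basic Multilingual Plane, which need two UTF-16 code units and are more likely to be missing from fonts. A small sketch follows; the helper names are made up, and the second test character is simply the first code point of CJK Extension B, not one of the element characters.

```python
def is_astral(ch):
    """True if the single character lies outside the Basic Multilingual Plane."""
    return ord(ch) > 0xFFFF


def utf16_units(ch):
    """Number of 16-bit code units needed to encode the character in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


# 金 (U+91D1) is in the BMP; U+20000 is the first CJK Extension B code point.
for ch in ("金", chr(0x20000)):
    print(f"U+{ord(ch):04X} astral={is_astral(ch)} utf16_units={utf16_units(ch)}")
```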
Jongseong Park said,
May 5, 2015 @ 7:55 am
@J. Goard: The most interesting to me is 주석 (ju-seok) 'tin', which is 柱石 'pillar-stone'.
Actually, the correct hanja for the Korean word 주석 juseok for "tin" is 朱錫 "red tin", using the same character 錫 (석 seok) used in Chinese according to the list above. There are four commonly used Sino-Korean homophones for 주석 juseok—柱石 "pillar", 朱錫 "tin", 主席 "(state) chairman", and 註釋 "annotation". Traditionally, the last of these is pronounced with a long vowel in the first syllable, but since lexical length is lost in the majority of Standard Korean speakers today, all of these are perfect homophones in Korean. Of course, there is no trouble distinguishing them through context.
주석 朱錫 juseok "red tin" follows an established pattern of combining a character for colour with a character for a kind of metal to name specific metals. 백금 白金 baekgeum "white gold" for platinum is another example. In everyday usage, we also often see the alternative name 황금 黃金 hwanggeum "yellow gold" for gold, though the chemical element is always simply 금 金 geum "gold". Less common is the alternative name 백은 白銀 baegeun "white silver" for silver, which is almost always just 은 銀 eun "silver".
A well-known example of this naming pattern is 청동 靑銅 cheongdong "green copper" for bronze. 황동 黃銅 hwangdong "yellow copper" is a rare alternative name for brass, usually called 놋쇠 notsoe, and although I hadn't heard of it before, 청금 靑金 cheonggeum "blue gold" is apparently an alternative name for lead, which is usually 납 nap.
Victor Mair said,
May 5, 2015 @ 9:07 am
I am grateful to commenter flow for the clinic on the state of the field with regard to the construction and transmission of new and extremely rare characters.
Jongseong Park said,
May 5, 2015 @ 11:12 am
@J. Goard: 봉소 (bong-so) 'boron', from 봉사 (bong-sa) '~sand = borax'
To correct the typo above, boron is 붕소 硼素 bungso in Korean. Borax is 붕사 硼沙/硼砂 bungsa, though I confess that I am not familiar with the term in either Korean or English. Note the vowel ㅜ u instead of ㅗ o.
Christopher S said,
May 5, 2015 @ 11:54 am
I just wanted to point out that not all of the elements in Japanese are katakana gairaigo. The element names can be split into four categories:
1. Onyomi, all kanji (e.g. hydrogen 水素 suiso)
2. Onyomi, but some or all of the characters are commonly replaced by katakana (e.g. boron ホウ素 (硼素) houso)
3. Kunyomi (e.g. lead 鉛 namari)
4. Pure katakana gairaigo (e.g. helium ヘリウム heriumu)
It's worth noting that although the majority of the element names are katakana gairaigo, a great many of the most important elements use onyomi kanji names, e.g. hydrogen, carbon, nitrogen, and oxygen (水素、炭素、窒素、酸素).
Victor Mair said,
May 5, 2015 @ 5:59 pm
@Christopher S
Thanks for pointing that out about katakana gairaigo vs. other types of names for some of the elements.
Jeff W said,
May 5, 2015 @ 9:17 pm
flow:
I’m definitely not an expert on such things but I think CDL is intended as the detailed instructions; it’s not really a character generator (although you can create characters with it), just a highly compact, abstracted way of describing the characters. The descriptions themselves are in no way intended to be readable (as IDL descriptions apparently are). (Personally, I like something like the SCML scheme proposed about eight years ago by then-Dartmouth undergrad Daniel Peebles a bit more, since it allows analysis of already-created characters in terms of the components in a way that the CDL scheme wouldn’t, for example, but so far it appears to be entirely conceptual.) Given the advantages of some kind of descriptive language for Chinese characters, on the one hand (for analyzing and generating characters), and the state of the art in computing nowadays, on the other, it’s too bad that so little apparent progress has been made.
malcolm said,
May 10, 2015 @ 8:51 pm
"… new elements, which usually only exist for fractions of fractions of seconds"
Not just "a fraction of a second"? Or "a fraction of a fraction of a second", the second form emphasizing the smallness of the time interval?
K. Chang said,
May 10, 2015 @ 9:58 pm
Found this table:
http://www.ptable.com/?lang=zh
They got only up to 112.
Stephan Stiller said,
May 11, 2015 @ 7:11 am
About the first paragraph of E-Ping Rau's comment: I've been hesitating over bringing up the issue of descriptive versus prescriptive pronunciations in this matter. On the one hand people will guess certain pronunciations that are inaccurate. On the other hand, for such technical terms I assume that people are more likely to regard dictionaries as absolutely authoritative.
1. Guessing
How do people guess the pronunciation of a character with phonetic component P?
First of all, sometimes there isn't much to guess wrong. 镄 (fermium) is fèi, and 费 fèi occurs in no commonly used character as a proper component. ("Proper" here is used in the mathematical sense of "not identical to the whole": Of course 费 is a graphemic component of 费 itself.)
When one has to guess, the only sensible strategies are to guess the pronunciation of P (very often (though not always) the component on the right-hand side) or of a larger character which contains P as phonetic component. It seems like people will intuitively consider all those characters and guess the most frequent pronunciation. Quite often this strategy works; let's proceed to some informative case studies:
Following this strategy, people ought to guess either kàng or háng for 钪 (scandium); the former (also the pronunciation of the component 亢) happens to be the right answer. In fact, they will remember this element from high school (so there's no need to guess in this particular case).
An example where the correct pronunciation is not identical to P but to that of a character containing P as a proper component would be 锘 nuò (nobelium). The phonetic component 若 is pronounced ruò in isolation and in 偌 (together with the infrequent character 箬). The correct pronunciation nuò is found only in 诺 (and in the interjection 喏 here, which is nuò and (I think) also nuó), which happens to be the most frequent character containing 若, aside from 若 itself. It's so frequent that people will guess nuò.
It isn't surprising that some sort of token frequency matters, not just type frequency.
The pronunciation of 咅 is shown in 汉语大字典 (a big character dictionary) as pǒu. The grapheme isn't in use as an independent character in modern Chinese, so this 广韵-based pseudo-Mandarin is pretty useless and meaningless (广韵 is an old dictionary with information on pronunciation in Middle Chinese). (Isn't it interesting how rarely lexicographic resources explicitly indicate an honest "unclear" and instead try to force answers? Why does every character have to have a monosyllabic Mandarin pronunciation?) For characters containing this as a phonetic component, good guesses would be bèi (in 倍, 焙, 蓓) and péi (in 陪, 培, 赔). Interesting that 锫 (berkelium) is (prescriptively) the latter; this is consistent with the lower frequency of the bèi-characters 焙 and 蓓, but bèi would match the English better. In any case, bèi would conflict with 钡 (barium).
Now to a particularly interesting case. The phonetic in 铊 (thallium) is 它 tā, but because characters properly containing this phonetic are nearly all pronounced tuó (佗, 坨, 沱, 陀, 驼, 砣, 鸵, 跎) – notable exceptions are 蛇 shé and 舵 duò – many reporters got it wrong when discussing the thallium poisoning case of Zhu Ling, guessing tuó instead of the correct (and more fitting) tā.
One should assume that the popular/intuitive guesses are accurate for the most recently discovered elements (see here), whose names were picked later, but I'll let someone else do the legwork.
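The guessing strategy in the case studies above can be sketched as a simple vote over the characters that share a phonetic component. This is only an illustration: the readings are the ones listed in this comment, the function name is made up, and it counts type frequency only (one vote per character). Token frequency, which the 诺 example suggests also matters, would need corpus counts that are not given here.

```python
from collections import Counter

# Characters sharing a phonetic component, with readings as listed above.
SIBLINGS = {
    "它": {"佗": "tuó", "坨": "tuó", "沱": "tuó", "陀": "tuó", "驼": "tuó",
           "砣": "tuó", "鸵": "tuó", "跎": "tuó", "蛇": "shé", "舵": "duò"},
    "咅": {"倍": "bèi", "焙": "bèi", "蓓": "bèi",
           "陪": "péi", "培": "péi", "赔": "péi"},
}


def guess_by_type_frequency(phonetic):
    """Return the reading(s) shared by the most characters with this phonetic."""
    counts = Counter(SIBLINGS[phonetic].values())
    top = max(counts.values())
    # Several readings can tie, so return all of them, sorted for stability.
    return sorted(r for r, n in counts.items() if n == top)


print(guess_by_type_frequency("它"))  # the popular (wrong) guess for 铊
print(guess_by_type_frequency("咅"))  # a tie: type frequency alone cannot decide 锫
```

For 它 the vote lands on the reading the reporters used rather than the prescribed tā, and for 咅 it ties, which is consistent with the observation that something beyond type frequency must be at work.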
2. Prescription
Dictionaries can also be wrong. Many lexicographic resources for Cantonese will show 錫 (tin) as sek3, even though everyone in HK says sek6. (But note that there is a vernacular Cantonese word sek3 which can be written 錫 or ⿰口錫 (o錫); it means "to kiss" or "to love in a deeply caring way".) Are there any alternative popular pronunciations in Mainland- or TWnese Mandarin aside from the ones mentioned by E-Ping Rau?
The situation is similar for some radicals from the 康熙字典, a dictionary with a well-known radical system. People might not know the "right" pronunciations. Continuing with Cantonese examples, if most people (more precisely: those who have an opinion) will pronounce 疋 (radical 103) as pat1 it doesn't really matter that some sources insist on so1. Or: 虍 (radical 141) as fu2 (popular) vs fu1 (some prescription).
Often there is no right answer, as one discovers when one has to dig deep to find any information, and the sources may disagree. Every so often no one seems to know or the sources are silent. A descriptively correct statement then would be "no one really knows but people might guess X". Most (living and written) sources have no information on 疒 (radical 104), and it doesn't mean much that there exists at least one source showing each of the pronunciations nik6 and jim5. (There are polysyllabic names for many radicals and graphemes, but let's not get into that now.) Going back to my point above that Chinese linguists and dictionaries like to pretend that everything has a modern pronunciation, in this particular case it would be nice for resources to have a short entry telling me that this grapheme doesn't have well-known Cantonese and Mandarin pronunciations.
3. Disambiguation
Now to a different topic: How can one be clear? Without any claims of being complete:
It's strange that there aren't polysyllabic alternatives in Chinese to deal with the issue of homophones of chemical elements.
(A note about the idea that monosyllabic names make it "easier to memorize the table": Memorizing the table of elements isn't useful by itself, and rattling off a list of syllables doesn't mean one can also write the characters.)
For metals, one can say 金属<element> ("the metal <element>"). For gases people will often say <element>气, but that strictly speaking denotes the element in the gaseous state. In general, one can say <element>元素 ("the element <element>") to make clear that one is talking about chemical elements. This doesn't help with distinguishing 硒 (selenium) and 锡 (tin) though, which are both xī (and then there's 矽, silicon). (Credit for identifying this homophone pair/triple goes to François Demay.) One can certainly say something like <number>号元素 ("element #<number>"), but spelling the Roman-letter symbols is an easier method.
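As a rough illustration of why spelling the symbol works where the spoken syllable does not, the sketch below groups a few elements by reading and reports the groups with more than one member. The readings are the ones given in this thread; the Roman-letter symbols are the standard ones; the list is deliberately tiny and the function name is made up.

```python
from collections import defaultdict

# (character, reading, symbol). Only a handful of elements, for illustration.
ELEMENTS = [
    ("硒", "xī", "Se"),
    ("锡", "xī", "Sn"),
    ("矽", "xī", "Si"),
    ("钡", "bèi", "Ba"),
    ("锫", "péi", "Bk"),
    ("铊", "tā", "Tl"),
]


def homophone_groups(elements):
    """Map each reading shared by two or more elements to its (char, symbol) pairs."""
    groups = defaultdict(list)
    for char, reading, symbol in elements:
        groups[reading].append((char, symbol))
    return {r: members for r, members in groups.items() if len(members) > 1}


for reading, members in homophone_groups(ELEMENTS).items():
    symbols = ", ".join(symbol for _, symbol in members)
    print(f"{reading}: ambiguous when spoken, distinct as symbols ({symbols})")
```

Within the one ambiguous group the symbols are all different, so saying the letters aloud removes the ambiguity that 金属 or 元素 cannot.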
Finally, after all the criticism of Chinese characters, we should keep in mind that the language of chemistry, like that of mathematics, relies heavily on symbols and other notation. Sometimes it's okay for something to not be serializable/parsable; it would be sad if such formalisms had to be limited by the requirement that everything in them be reducible to phoneme strings in a straightforward way. Human language is limited after all, and there's a reason these disciplines use special terms, abbreviations, symbols, and diagrams.