A proliferation of hyphens


In comments to "Suffer the consequences" (4/19/15), Jongseong Park and Bob Ramsey bemoaned what they considered to be the overuse of hyphens in the transliteration of Hangeul.  In a later comment, I explained that the hyphens between virtually all syllables in the transliterations were due to the Hangeul converter we've been using, which automatically inserts them.  In the future, we'll try to remove most of the hyphens.

The question of unnecessary hyphens in the transliteration of Hangeul came up before in a comment by Bob Ramsey to another post, "Is Korean diverging into two languages?" (11/6/14).

Here's what Bob said on that occasion:

Some romanized transcriptions of Korean use unnecessary hyphens; e.g.: 기름 사탕 (gi-reum-sa-tang, RR, ki-rŭm-sa-t'ang, MR); 캐러멜 (kae-reo-mel, RR, k'ae-rŏ-mel, MR). Hyphens used like that just take up space and clutter the page. Notice how the original Korean writing in Hangul has word spacing–and no hyphens! Oh, and neither RR nor MR call for them.

Yes, yes, I know: transcribers put them in to reflect the fact that the Korean writing system has syllabic features. But syllable boundaries are, for the most part, obvious without the hyphens. And besides, strictly speaking, modern Korean orthography doesn’t always reflect the phonological boundaries of the syllable anyway.

But there’s also something more important that often escapes notice: As a well-known but annoyed Koreanist once told me, too many hyphens make Korean look exotic and primitive—much the same way American Indian names and words did to Westerners who thought of them as savage or primitive—and wrote them with hyphens. Such orthographic stretching is totally unnecessary and confusing. And culturally patronizing.

The argument over whether or not to separate syllables of Chinese and Korean with hyphens is long and vexed.  For some reason, this doesn't come up with Japanese Romanization (I wonder why?).  Vietnamese, in contrast, separates almost all syllables, though occasionally one sees a hyphen in running text and rarely syllables will be joined together to form a word, though I don't know the principles for when this is done.  I've often thought that it would be much easier for readers and for computers if the syllables of Vietnamese were joined together into words with spaces between them.

The Romanization of Cantonese is similar to that of Vietnamese in that all syllables are normally separated.  To me that looks rather ungainly, and I'm always tempted to join the syllables into words, but am stymied by the tone number which follows the syllable.  It would look ungainly to have words with numbers scattered about in them.  Vietnamese, on the other hand, has a plethora of diacritical marks, so I suppose it would be easier to link up syllables into words, if the Vietnamese were so inclined, but I gather that they are not.

When I edited the massive Columbia Anthology of Traditional Chinese Literature (over 1,300 pages), I was still using Wade-Giles (W-G) Romanization, but I wanted to make it look less jumbled and more natural to non-specialists, so I got rid of all the hyphens in the Chinese transcriptions in that huge book.  But Bill Nienhauser, who is a friend of mine and was a reader of the manuscript for Columbia University Press, said that it was too daring to dispense with the hyphens from a transcription system that called for them.  So I spent a month reinserting all those thousands upon thousands of hyphens.  Arggggghhhhhh!!!!!!

Linguistically, my focus is always primarily on the word as the basic unit of grammar and lexicon, indeed of language in general, and only secondarily on syllables and morphemes as constituents of words.

I like W-G for Mandarin because phonetically it is close to IPA, a very intelligent system in terms of phonology.  But, from the beginning when I first learned it, I always hated the hyphens for all of the reasons Bob Ramsey mentioned in his comments, plus it just doesn't look like real language, to have all those jarring hyphens inserted between the syllables.  That's one of the reasons why I switched to Gwoyeu Romatzyh for a number of years, because it looks like real language, and the notion of tonal spelling appealed to me, but I later had to stop because editors wouldn't accept it in my publications, and the numerous spelling rules were too hard except for an elite few who were able to master them without undue vexation and suffering.

When Pinyin first came out, I was opposed to it because of the X's, the C's, the Q's (Cao Xueqin! — cf. W-G Ts'ao Hsüeh-ch'in) and other seemingly inelegant aspects of this Romanization, and I was joined in my dislike of the system by many colleagues, including the late Daniel Bryant of the University of Victoria.  Subsequently, after I developed a close working relationship with the Wénzì gǎigé wěiyuánhuì 文字改革委员会 (Commission for Script Reform) starting in the early 80s and I realized that the PRC was serious about Hanyu Pinyin as an alternative orthography (or perhaps I should say auxiliary orthography) for Mandarin, I accepted it as the official Romanization and became an enthusiastic supporter.  I now believe that Hanyu Pinyin is part of an "emerging digraphia".  As such, I am pleased that it has a complete set of orthographical rules for how to deal with word division, grammatical constructions, and punctuation.

The thing I liked most about Pinyin from the very beginning is that it didn't have a lot of unnecessary, obtrusive hyphens strewn through passages transcribed in it.  Still, I have met a few Chinese, even scholars, who insert hyphens in the Romanized transcription of their names, the titles of their works, and so forth.  And I know several Western Sinologists who insert hyphens in Hanyu Pinyin.  For instance, Geoff Wade:

I have no knowledge of Korean, but I find the hyphens useful and instructive when rendering Chinese….  Not patronising at all.

Ornery, I know.

The words of English and French and German and many other languages have syllables too, but nobody thinks of putting hyphens in the words to mark each of their syllables.  For most languages written with an alphabet, the syllables of words are joined together, while hyphens are reserved for special purposes, such as to link up words that are normally separate into a single unit.  We are fortunate also to have a variety of dashes, which permit even greater flexibility with punctuation.

In general, in my own usage, and I perceive this also in the writing of many others, I find myself forming fewer compound nouns with hyphens than I did decades ago.  Instead, I have a tendency now either to write the words together without a hyphen, or simply to separate them without a hyphen.  For example, "hyperlink" instead of "hyper-link" and "pot belly" instead of "pot-belly", cf. also "bookcase" and "handmade" instead of "book-case" and "hand-made", "powder room" instead of "powder-room", and so forth.  As for Romanization of Chinese languages, I avoid hyphens as much as possible.  The phonology of these languages generally makes it clear where the syllable boundaries fall.  If there is any possible confusion, the official orthographic rules specify an apostrophe, e.g., Xi'an (two syllables) for the name of the city to distinguish it from xian (a single syllable) and jin'gan ("kumquat") to differentiate it from Jing'an (name of one of the central districts in Shanghai).


  1. Victor Mair said,

    April 26, 2015 @ 9:39 am

    Stephen Owen, ed. and tr., An Anthology of Chinese Literature: Beginnings to 1911 (Norton, 1996), has Pinyin with hyphens.

  2. Robot Therapist said,

    April 26, 2015 @ 11:03 am

    Forever; 'tis a single word!
    Our rude forefathers deem'd it two:
    Can you imagine so absurd
    A view?

    Forever! What abysms of woe
    The word reveals, what frenzy, what
    Despair! For ever (printed so)
    Did not.

    It looks, ah me! how trite and tame!
    It fails to sadden or appal
    Or solace – it is not the same
    At all.

    O thou to whom it first occurr'd
    To solder the disjoin'd, and dower
    Thy native language with a word
    Of power:

    We bless thee! Whether far or near
    Thy dwelling, whether dark or fair
    Thy kingly brow, is neither here
    Nor there.

    But in men's hearts shall be thy throne,
    While the great pulse of England beats:
    Thou coiner of a word unknown
    To Keats!

    And nevermore must printer do
    As men did long ago; but run
    "For" into "ever," bidding two
    Be one.

    Forever! passion-fraught, it throws
    O'er the dim page a gloom, a glamour:
    It's sweet, it's strange; and I suppose
    It's grammar.

    Forever! 'Tis a single word!
    And yet our fathers deem'd it two:
    Nor am I confident they err'd;
    Are you?

  3. Rodger C said,

    April 26, 2015 @ 11:31 am

    National Geographic maps used to use WG without hyphens. The effect was a bit dizzying.

  4. Anschel Schaffer-Cohen said,

    April 26, 2015 @ 1:06 pm

    It seems just a *bit* weird for a Western Sinologist to be passing judgment on whether or not something is patronizing to Chinese people.

  5. Eli Nelson said,

    April 26, 2015 @ 2:18 pm

    @Anschel Schaffer-Cohen:
    Are you referring to Victor Mair or Geoff Wade in your comment? This article says that Chinese sinologists are similarly divided between the hyphen-users and non-users; despite this, you still consider it presumptuous for a Western sinologist to infer from that fact that the use of hyphens is not inherently patronizing to Chinese people?

  6. Victor Mair said,

    April 26, 2015 @ 2:28 pm

    From Bob Ramsey:

    Well, I have to say you’ve expressed quite nicely how I feel about Chinese romanization as well. I, too, was not happy with Pinyin at first, even when the NY Times started using it. (I remember C.T. Hsia used to rail against it, not least because it made his initials become XZQ!) But I quickly warmed to it, first of all, because linguists, tending toward the Left as they usually do, picked up Mao’s romanization very early on. But it also had the advantage of no hyphens, as you say. My goodness, again as you say, people don’t use many hyphens when writing “civilized” languages like English, French, or German! Well stated, Victor.

    Here’s a little factoid, though, about hyphens used in Korean romanization: The original formulation of McR did not use hyphens when transcribing names (e.g., Kim Ilsŏng), but the Library of Congress decided to adopt them into their cataloging system solely based on the institution’s experience with WG! That’s the only reason hyphens crept into LC usage, and it’s a Sinocentric and perverse one in my humble view.

    As you know, McR is still the basis of the official romanization in North Korea, and it’s a system that also uses no hyphens. (But unfortunately, I think, it also has fewer spaces between words than in the South.)

    Btw, there’s a Language Log entry that discussed at length “The Dreaded Hyphen Disease” in the transcription of American Indian names: http://itre.cis.upenn.edu/~myl/languagelog/archives/005174.html

    You can see there some examples that Barbara Zimmer cites from a well-known source, Longfellow’s “Hiawatha”:

    “Longfellow obtained much of his knowledge from legends and stories collected by Henry Rowe Schoolcraft who was superintendent for Indian Affairs for the state of Michigan from 1836 to 1841. From an introduction to Hiawatha:

    “‘Schoolcraft married Jane, O-bah-bahm-wawa-ge-zhe-go-qua (The Woman of the Sound Which the Stars Make Rushing Through the Sky) Johnston. Jane was a daughter of John Johnston, an early Irish fur trader, and O-shau-gus-coday-way-qua (The Woman of the Green Prairie), who was a daughter of Waub-o-jeeg (The White Fisher), who was Chief of the Ojibway tribe at La Pointe, Wisconsin.’”

    Western attitudes about exoticism and “primitive” languages just ooze from the transcriptions made in those days before Boas and Sapir!

  7. TonyK said,

    April 26, 2015 @ 4:07 pm

    @Robot Therapist: It is unseemly to quote a whole poem without crediting the author: Charles Stuart Calverley (http://www.blueridgejournal.com/poems/csc-forever.htm).

  8. E-Ping Rau said,

    April 26, 2015 @ 9:54 pm

    Most transliteration systems of Taiwanese/Hokkien language do use hyphens (within word boundary only), mainly to reduce confusion as well. I see the alternative of using apostrophes (phok'ásiann instead of phok-á-siann), though I personally don't feel that apostrophes are really that much less "artificial" than hyphens……Maybe it's because I've come to accept Tâi-lô or Pe̍h-oē-jī as a writing system for Taiwanese/Hokkien instead of just a transliteration system.

    There does exist a writing system of Taiwanese/Hokkien with no hyphens and tone markers (MLT), similar to the idea of Gwoyeu Romatzyh, but due to the near constant use of exotic letters (q, x, v, f), to me it looks even much less like a real language: compare "aix suie mxkviaf laau phvixzuie" to "ài suí m̄-kiann lâu phīnn-tsuí" and I'd gladly choose the latter.

    As far as I know, transliteration systems of Cantonese and Hakka languages do it like the Vietnamese do: they separate every syllable.

  9. Matt said,

    April 27, 2015 @ 12:46 am

    Re Japanese, I noticed a while back that early editions (1873) of Hepburn's seminal Japanese-English dictionary hyphenates the vast majority of Sino-Japanese words: <GI-RON> 議論, <GIYŌ-SA> 行作. Some don't get the treatment, though, for reasons that are unclear to me: <GŌGI> 豪儀 (this one is marked "coll."; perhaps Hepburn considered it sufficiently nativized to spare it).

    OTOH, there are also suggestions that the hyphens are intended purely as "visual meta-data" marking word-parts and not necessarily intended for use in other contexts, e.g. the headwords <GO-FUKU> 呉服 vs <GOFUKU-ZASHI> 呉服差 — perhaps because "GOFUKU" is dissected directly above, Hepburn thought it unnecessary to segment it again when it appeared as part of a compound. It's hard to say exactly because the only explicit description of this part of the system is "The hyphen is always used to connect the different members of a compound word" — obviously, but the question is how he defined "compound word"…

    In any case, later editions (1888) don't do this: instead, we see <GIRON>, <GYŌSA> <GOFUKU> (but n.b. <GOFUKU-ZASHI> – this is uncontroversially a compound word, of course).

    Other details of Hepburn's romanization system changed over time (e.g. <GIYŌ> → <GYŌ> above), so it's possible he rethought the hyphenation thing too. If so, he probably wrote about that somewhere…

  10. Jongseong Park said,

    April 27, 2015 @ 5:29 am

    There are some important differences between Korean and Chinese (of any variety) that are relevant to the question of inserting hyphens to mark syllable boundaries, besides the already mentioned fact that Korean uses word spacing unlike Chinese. One is that Korean has far more polysyllabic morphemes than Chinese. In Chinese, most syllable boundaries indeed correspond to morpheme boundaries (including between bound morphemes). In Korean, if we insisted on marking syllable boundaries with hyphens, a lot of the time we would be cutting up individual morphemes into meaningless units. Writing the unanalyzable 동그라미 donggeurami "circle" (leaving aside any speculation on its historical etymology, it is unquestionably a single morpheme for today's Korean speakers) as dong-geu-ra-mi does not serve a useful purpose other than to mark what the syllable groupings are in the native orthography, which could serve in optional end-of-line hyphenation, I guess (sort of like how dictionary entries mark such hy‧phe‧na‧tion).

    Another is that Korean features greater divergence between the morphophonemes reflected in the orthography and the surface pronunciation, even if they are mostly due to regular sandhi rules. Consider the various conjugations of the verb 읽다 ikda "read":

    읽다 {ilɡ-da} → [익따] /ik.ta/ RR: ikda MR: ikta
    읽어 {ilɡ-ʌ} → [일거] /il.ɡʌ/ RR: ilgeo MR: ilgŏ
    읽고 {ilɡ-ɡo} → [일꼬] /il.ko/ RR: ilkko MR: ilko
    읽는 {ilɡ-nɯn} → [잉는] /iŋ.nɯn/ RR: ingneun MR: ingnŭn

    The same stem 읽- ik would also surface as ilg-, il-, and ing- in both major romanizations for general use, as they choose to record the pronunciation rather than the morphophonemic orthography. In this case, while there could be some utility in marking the boundary between the stem and the ending (but note that we don't do this in English, e.g. as "read-s" or "read-ing"), it is undermined by the fact that the form of the stem changes according to the ending. If it's difficult to recognize morphemes in the romanization because they appear in several different forms, then it may not be worth separating them in the first place. In Chinese, the pronunciations of the syllables are far more stable (in citation form at least, and if we set aside tone sandhi) so that a morpheme like 读/讀 will be just dú in Standard Mandarin regardless of what sound precedes or follows it.

  11. flow said,

    April 27, 2015 @ 6:29 am

    I think that the comparatively greater vice of the Wade-Giles transcription is its use of the apostrophe to mark the fortis/lenis contrast in initial consonants. This orthographic device has been used for a number of languages, e.g. Mandarin, Tibetan, and Korean, and it has always struck me as unfortunate. Pinyin is so much better for using p/b over p'/p.

    Taken together with another unfortunate trend on the side of newspapers, book authors and street sign designers to avoid 'difficult' features like apostrophes and umlauts, it so happens that where the original WG transcription manages to distinguish between fortis and lenis onsets as well as front vs back rounded vowels (i.e. ü/u), what often comes out (in a 'simplified' Wade-Giles spelling, if you will) manages to be so ambiguous as to border on the meaningless.

    Take, as an example, the three letters 'chu'. In 'simplified' WG, these may signify all of (Pinyin) chu, zhu, qu, ju, whereas in the original WG plan, they should be rendered as ch'u, chu, ch'ü, chü.
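The collapse described above can be checked mechanically. What follows is only a rough sketch: the four-entry table is taken from the example in this comment, while the names `WG_TO_PINYIN`, `simplify`, and `collisions` are made up for illustration, and "simplified" is assumed to mean nothing more than dropping apostrophes and diacritics.

```python
import unicodedata
from collections import defaultdict

# The four Wade-Giles spellings from the example above and their Pinyin equivalents.
WG_TO_PINYIN = {"ch'u": "chu", "chu": "zhu", "ch'ü": "qu", "chü": "ju"}

def simplify(wg):
    """Mimic 'simplified' WG: drop apostrophes, then strip diacritics such as the umlaut."""
    no_apostrophe = wg.replace("'", "").replace("\u2019", "")
    decomposed = unicodedata.normalize("NFD", no_apostrophe)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def collisions(table):
    """Group Pinyin syllables by the simplified WG spelling they end up sharing."""
    groups = defaultdict(list)
    for wg, pinyin in table.items():
        groups[simplify(wg)].append(pinyin)
    return dict(groups)

print(collisions(WG_TO_PINYIN))
```

With this table, all four distinct Pinyin syllables land under the single key 'chu', which is the ambiguity complained about above.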

    As for Korean, there's a similar difficulty when it comes to transcribing /ɯ/. As such, I'd think transcribing this vowel with a letter in its own right (like, indeed, 'ɯ') or with an accented letter (like 'ŭ') is to be preferred over a digraph like 'eu'. Pragmatically, however, the digraph will be much more stable when words written with it enter the international news.

    You might say, writing 'geum' is so much easier than 'gɯm' or 'gŭm' (those are not even on the keyboard!), but, on the other hand, it is in no way apparent from the spelling of 'geum' that it transcribes 금, not 게움. Writing the former as 'geum', the latter as 'ge-um' or 'ge um' at least manages to disambiguate the spellings.

  12. Eto said,

    April 27, 2015 @ 6:38 am

    I think the reason that the hyphenization issue doesn't come up for Japanese is that, unlike Korean and the Chinese languages, Japanese isn't traditionally written with a one-graph-one-syllable orthography. Japanese words are traditionally broken up into morae (perhaps due to influence from the writing system), and Japanese hyphenated on the mora level is pretty silly. Hyphenly inclined Japanese speakers and linguists would write 先生 as "se-n-se-i", which looks ridiculous. Well, even more ridiculous than hyphens in Chinese and Korean.

    The interesting thing about delimiters in romanized Japanese to me is the question of where to put the word boundaries. Luckily, for reasons similar to why romanized Japanese is usually not separated into syllables, syllables in romanized Japanese are almost always bunched into words, though where the boundaries lie can vary. For example, consider the following sentence:

    彼にやらせてもらった。(I got him to do it.)

    In my experience, native Japanese speakers tend to space the romanized version of that sentence as such:

    Kareni yarasetemoratta.

    While westerners, both academic and non-academic, tend to space it like this:

    Kare ni yarasete moratta.

    This is interesting, because the former spacing suggests that kareni is some sort of dative form of the word kare, while the latter spacing suggests that ni is more like a preposition (well, postposition) that shows agency. Similarly, yarasetemoratta suggests that we are using a single form of the verb yaru, while yarasete moratta suggests that the second half is some kind of modal or helping verb. This might be just semantic quibbling, but I think it shows some differences in how each side models the language in their minds.

    I believe pinyin uses the second spacing method (so 我的房子 becomes wǒ de fángzi, not wǒde fángzi), but maybe you could push for a halfway solution, where instead of choosing between wǒ de fáng-zi and wǒ de fángzi, we can have wǒ-de fángzi. I've certainly seen that used with Japanese at least once.

    Personally, I despise the usage of hyphens to separate syllables in Chinese and Korean — not just because it looks patronizing, but because it's jarring for hyphens to be used as syllable delimiters when in, say, English, they aren't used that way at all. A hyphen in English means "hey, these two words are closer than you'd expect", or "I would've liked to write this as one word, but it would have been confusing". I also dislike the use of apostrophes to separate otherwise confusing strings for the same reason. It'd be great to just leave something like Xian un-separated (after all, English readers don't have any trouble recognizing that "hothead" is "hot-head" and not "ho-thead"). But if you really want to show that Xian is two syllables, here's where a hyphen would actually work: Xi-an.

  13. Jongseong Park said,

    April 27, 2015 @ 8:08 am

    @Bob Ramsey: The original formulation of McR did not use hyphens when transcribing names (e.g., Kim Ilsŏng), but the Library of Congress decided to adopt them into their cataloging system solely based on the institution’s experience with WG! That’s the only reason hyphens crept into LC usage, and it’s a Sinocentric and perverse one in my humble view.

    Very interesting. I've wondered for a while why the Library of Congress version of McCune–Reischauer did that, though I did guess that the use of hyphens to separate the syllables of disyllabic given names was inspired by the romanization of Chinese names.

    This rule is completely unnecessary and creates much confusion. To illustrate, the current version of the LoC implementation (which has varied over the years) can be summarized as: Hyphenate a given name or pseudonym composed of two syllables that can be written in hanja when it is preceded by a family name.

    That means writing Doona Bae 배두나 裵斗娜 as Pae Tu-na, but Se Ri Pak 박세리 朴– as Pak Seri and Cha Du Ri 차두리 車– as Ch'a Turi without hyphens, since for the latter two there are no hanja for the given names. The obvious difficulty is when you don't know if a particular given name can be written in hanja or not. Without prior knowledge, it is difficult to tell which of the names 두나, 세리, and 두리 can be written in hanja; while they don't look obviously Sino-Korean, they can all conceivably be written in hanja.

    I'm trying to follow the LoC version of MR in my Korean pronunciation dictionary, and in many cases it is difficult to find out whether a given name can be written in hanja or not. The late model Daul Kim 김다울 Kim Taul? Kim Ta-ul? is an example.

    Also, 연개소문 淵蓋蘇文 is Yŏn Kaesomun and 신사임당 申師任堂 is Sin Saimdang without hyphens, since the given name in the former and the pseudonym in the latter consist of more than two syllables. The requirement that to be hyphenated, a given name or a pseudonym should be preceded by a family name means that 한석봉 韓石峯 is Han Sŏk-pong, but 석봉 石峯 by itself is Sŏkpong. These apply mostly to pseudonyms, but there are rare examples of mononymous Koreans whose given names are used without the family name: 김무정 金武亭 Kim Mu-jŏng, better known as just 무정 武亭 Mujŏng, and the recording artist 권보아 權寶雅 Kwŏn Po-a, better known as just 보아 寶雅 Poa or her preferred spelling BoA.

    It would all be so much easier if we dropped the hyphens.

  14. J. W. Brewer said,

    April 27, 2015 @ 10:00 am

    Under the prior regime (i.e. before hanyu pinyin became commonly used in English publications from non-Communist sources) the usual practice, at least in non-specialist contexts, was to hyphenate Chinese given names but not to hyphenate Chinese toponyms: e.g., Chou En-lai and Mao Tse-tung, but Peking and Szechuan. This worked reasonably well in practice, especially because in both cases what was being done was romanizing short proper names to be plunked into the middle of English texts (ideally without requiring excessive diacritical marks etc that monolingual Anglophone typesetters working against a deadline might have difficulty dealing with without screwing up). The notion that the system used for those sorts of purposes should also necessarily be the system used for transliterating arbitrarily large chunks of running text (which itself might be done for different purposes and for different intended audiences) is, I think, where problems arise, and at that point people get hung up on questions of uniformity and standardization while losing sight of the notion that people want to transliterate from non-romanized writing systems for different purposes in different contexts such that a one-size-fits-all approach may be suboptimal.

    Note the same phenomenon (different approaches, which could be called inconsistent but I think that's needlessly pejorative since each was deemed suitable for its specialized context) in the traditional approach for Korean given names v. toponyms in English text: Hyphenated Park Chung-hee vs. unhyphenated Inchon or Pusan. (Syngman Rhee is an obvious exception among well-known ROK political figures, but he also has the family-name/given-name switched into Western order — I assume this was his own usage picked up during his years of exile.)

    The lack of hyphenation for Japanese was noted above, but without an explicit hypothesis — did they really seem "less exotic" to Westerners early on or did differences in phonology/morphology obvious to Westerners early on make the difference. (NB that many of the proper names of the "Japanese" characters in The Mikado are hyphenated, but they make no attempt to be actually Japanese-sounding, and I think the standard claim is that even a London audience of the Victorian era understood that these were meant as jokes.)

  15. Matt said,

    April 27, 2015 @ 11:07 am

    Yeah, it's probably safe to assume that even the sheltered audiences of Victorian London understood that names like "Pish-Tush" and "Yum-Yum" were jokes!

    It is interesting that Gilbert reached for a pseudo-Chinese name style to make those jokes, rather than a pseudo-Japanese one — just easier to get a laugh that way, or did it reflect relative levels of knowledge about what names were like in the two countries? "Titipu" is at least unambiguously Japanese, I suppose.

    Incidentally, the Portuguese missionaries of earlier centuries didn't hyphenate Sino-Japanese words either. (They also tended to append particles to words rather than write them separately: <vareua Xinanono cuniuo deta toqi cara> = ware wa Shinano no kuni o deta toki kara in Hepburn.)

  16. Jongseong Park said,

    April 27, 2015 @ 11:12 am

    @J. W. Brewer, in addition to 이승만 Syngman Rhee, you also have the second president 윤보선 Posun Yun who studied at the University of Edinburgh and entrepreneur 유일한 Ilhan New who spent most of his early life in the U.S. who wrote their names without hyphenation. It's a very small sample size but I wonder if the Koreans who went abroad and chose the way their names were romanized instead of having the romanizations chosen for them by Western journalists were more likely not to use hyphenation.
    Of course, lots of Koreans who went abroad simply wrote the two syllables of their names as different words, e.g. the composer 안익태 Eak Tai Ahn (who used several different spellings throughout his career, it seems).
    During his lifetime, 박정희 was usually spelled Park Chung Hee, with a space between the syllables instead of a hyphen. The internet has a distorting effect because Wikipedia and other reference sources have retroactively standardized many Korean names to the hyphenated format.
    While this is annoying, the effort at standardizing in itself is understandable as many Koreans are themselves inconsistent and may use the different formats (e.g. Boram, Bo-ram, Bo-Ram, and Bo Ram) interchangeably. Throughout his career, the footballer 박지성 Ji Sung Park wore jerseys with his name written as "J. S. PARK" for his clubs but "J S PARK", "JI SUNG", or "JISUNG" with the national team based on whatever the policy was at the time.
    I myself spell my given name "Jongseong", as one word without spaces or hyphens, and this is how it appears on my passport. People have supplied the "missing" hyphen/space and written it as "Jong-Seong" or "Jong Seong" not only in Korea, but also at a Hong Kong hotel, and more perplexingly, at a French bank. I have to insist on the correct form each time, even if they have to print a new loyalty card and send it to me again.

  17. J. W. Brewer said,

    April 27, 2015 @ 12:36 pm

    Actually the google n-gram viewer shows "Chung Hee" as more common over a long range of time than the hyphenated alternative, albeit with the ratio shifting over the years. I must admit upon further consideration that my own memory of whether I typically saw him hyphenated versus unhyphenated in the English-language newspapers back when he was still alive and I was still a boy may well be unreliable.

    http://en.wikipedia.org/wiki/Category:South_Korean_emigrants_to_the_United_States shows a very very wide range of approaches, which presumably reflects (even if imperfectly) a wide range of personal preferences. You've got family-name first v. family-name last; you've got Western-style given name v. Korean-style given name v. both; and for Korean-style given names you have hyphenated (both with second-element-capitalized and second-element-lowercased), joined-into-one-word, and broken-into-two-words. A glorious rejection of uniformity/standardization!

  18. Rodger C said,

    April 27, 2015 @ 1:17 pm

    The Spanish novelist Ramón del Valle-Inclán, in Tirano Banderas, had a Japanese character named Tu-Lag-Thi. Not a single possible Japanese syllable in it. I recall as a grad student thinking that there was something perversely marvelous about this, even for 1926 and even, frankly, for Spain.

  19. Victor Mair said,

    April 27, 2015 @ 2:55 pm

    From Steve O'Harrow:

    For what it's worth, there have been a number of
    efforts to hyphenate romanized Vietnamese, especially
    in the French colonial period and the former Republic of
    Viet Nam. However, in the DRV and now the SRV, they
    have been entirely dropped. All syllables are written as
    separate word-like "thingies" but some indication of the
    realization that there is multisyllabicity comes in the form
    of capitalization where the formal title of, say, a school
    or university might be "Trường học X" or "Đại học Y" in
    which the first word in the official title is capitalized but
    not the second and then the name of the institution is
    again capitalized.

  20. Victor Mair said,

    April 27, 2015 @ 7:47 pm

    From Bill Nienhauser:

    Also David Hawkes in his revised The Songs of the South. Lots of hyphens in the Pinyin.

  21. Jason said,

    April 28, 2015 @ 2:15 am

    There's the related phenomenon of writing English-lexified creoles with English orthography and grammar, with extraordinarily patronising results, i.e. Margaret Mead's transcription of Papua New Guinea Pisin in 1931 (found in Mühlhäusler et al.'s "Tok Pisin Texts: From the Beginning to the Present"). The dreaded "primitive-talk" hyphen makes ample appearance in her transcription:

    "You like go along big fellow mountain eh? All right. Now you go-go-go, by and by you come along one big fellow diwai (tree). All right. Now lose-em diwai you come up along one fellow road. All right. He no good fellow road. He road nothing. All right. Now you come up along big fellow mountain. You no can cut-em. You must round-em."

    The hyphen is here just one of a number of devices of "Tonto Transcription" designed to make the speaker sound primitive and uneducated — a kind of cross-cultural eye dialect (and you thought Margaret Mead was so progressive!)

  22. Jongseong Park said,

    April 28, 2015 @ 7:14 am

    I should point out that the Revised Romanization of Korean does allow for hyphens to be inserted in case of ambiguity, e.g.:

    세운 se-un : 슨 seun
    네온 ne-on : 넌 neon
    우이 u-i : 의 ui
    성운 seong-un : 선군 seon-gun

    I personally would have preferred the apostrophe for this purpose, as used by McCune–Reischauer (at least in the Library of Congress implementation, e.g. 성운 sŏngun : 선군 sŏn'gun).

    Besides the use in separating the syllables of two-syllable given names, the Library of Congress version of McCune–Reischauer also uses hyphens in a variety of other contexts, in most cases analogous to where hyphens would be used in English:

    강원도 Kangwŏn-do "Kangwŏn (Gangwon) province"
    불한사전 Pul-Han sajŏn "French-Korean dictionary"
    남북한 Nam-Pukhan "South and North Korea"
    스물다섯 sŭmul-tasŏt "twenty-five"
    21세기 21-segi "21st century"
    삼일 운동 Sam-il Undong "March 1st Movement"

    See the LoC guidelines here.

    The Revised Romanization of Korean is much more restrictive in where hyphens can be used in such cases. Only the administrative units like 도 do, 시 si, 군 gun, etc. and 가 ga used in street names are hyphenated, in addition to the optional hyphenation of given names and ambiguous syllable boundaries already mentioned. So 강원도 is Gangwon-do but the others are "Bulhansajeon", "Nambukhan", "seumuldaseot", "21(isibil)segi", and "Samil Undong".

  23. Robot Therapist said,

    April 28, 2015 @ 8:58 am

    @TonyK Indeed. My bad, and thank you for remedying it.

  24. J. W. Brewer said,

    April 28, 2015 @ 8:58 am

    Perhaps parallel to the phenomenon mentioned in the last paragraph of the prior comment re Korean "administrative units" is that hyphenization is common-to-standard in romaji for Japanese words denoting geographical subdivisions. So Chiba prefecture (east of Tokyo) is Chiba-ken, and the apartment building I lived in as a boy in Tokyo was in Akasaka 1-chome, which was in turn in Minato-ku. (I think -ku was standardly translated "ward" and "chome" maybe as "precinct" although these are obviously approximate given historical/cultural differences in local administrative structure as compared to Anglophone societies.)

  25. Robot Therapist said,

    April 28, 2015 @ 9:02 am

    …and, worse than that, I mangled his joke. The poem actually has "longago" for "long ago".

    It seemed relevant to Victor's comments about "bookcase" and "handmade".

  26. Natalie Solent said,

    April 29, 2015 @ 8:22 am

    I don't speak either Korean or Chinese so, although I have read the comments about the use of hyphens when romanizing those languages with great interest, I am in no position to add to that aspect of this discussion. However in a sense the fact that I am someone who *isn't* steeped in the history of how Westerners transcribed those languages qualifies me to talk about how it all looks from the outside. It would not have spontaneously occurred to me to see the use of hyphens as patronizing. "Exotic", possibly, but only in the sense that Korean and Chinese *are* very different languages from those familiar to me. I'd have assumed that the hyphens represented a way to convey, no doubt imperfectly but not contemptuously, some aspect of that difference.

    Maybe because I'm not American the comparison to patronizing depictions of American Indian names or speech seems far-fetched, and when I come to think of it I'm not convinced that the hyphens were at all important to what was wrong with those depictions.

    Bob Ramsey is quoted by Victor Mair as saying, "Western attitudes about exoticism and “primitive” languages just ooze from the transcriptions made in those days before Boas and Sapir!", referring to the hyphens in the personal names Oh-bah-bahm-wawa-ge-zhe-go-qua and O-shau-gus-coday-way-qua. I'd have thought it just as likely to be an honest attempt to analyse the meaning of names made on a very different pattern to European names and plain make them easier to read. It is also true that eighteenth and nineteenth century newspapers make a lot of use of hyphens in English names and words where we do not.

  27. Victor Mair said,

    April 29, 2015 @ 6:51 pm

    Some options:

    table | zhuōzi

    ta-ble | zhuō-zi

    ta ble | zhuō zi


    airplane | fēijī

    air-plane | fēi-jī

    air plane | fēi jī


    Denver | Dānfó

    Den-ver | Dān-fó

    Den Ver | Dān Fó


    Zhang Zongqi

    Zhang Zong-qi

    Zhang Zong-Qi

    Zhang Zong Qi


    Hiawatha | Hi-a-wa-tha

    What are your preferences?

  28. J. W. Brewer said,

    April 29, 2015 @ 7:48 pm

    On the treatment of names from Amerind languages, see this old languagehat thread http://languagehat.com/why-the-hyphens/, which in turn links to an old LL post by myl (before comments were routine at this site). The interesting point is that the hyphenated style only became common in the 19th century although attempts to transcribe such words into English orthography had been going on for two centuries prior to that.

  29. Natalie Solent said,

    April 30, 2015 @ 6:26 am

    Victor Mair,

    "Preference" isn't quite the right concept here. I acknowledge that the most important factors in the decision whether to use hyphens in romanizing foreign words are things specific to the language concerned, e.g. clarifying ambiguities or, perhaps, conveying that adjacent utterances are linked in a way that doesn't arise in English. My preference purely as a reader is for hyphenated ma non troppo, so Hiawatha rather than Hi-a-wa-tha. However if I were a learner I'd prefer Oh-bah-bahm-wawa-ge-zhe-go-qua to what Wikipedia gives as the modern spelling of the same name, Obabaamwewe-giizhigokewe, and certainly to Obabaamwewegiizhigokwe.

    Talking of learners brings me to J.W. Brewer's comment. I was very interested in the discussion at languagehat, particularly this comment from caffeind: "The feeling that long words are educated and short or segmented words are kiddie stuff that is degrading for adults seems to be ingrained in educated English-speakers. I’ve seen some Westerners earnestly urge the use of Vietnam or Hongkong as somehow more dignified than Viet Nam or Hong Kong." Actually, it wasn't ingrained in me, or very weakly ingrained.

    Caffeind suggests that this perception arises from the use of hyphens in teaching materials for children.

    Actually, having started to think about what, if anything, those deciding on how to transcribe one language into another should do in order to rein in the human tendency to consider one's own form of speech the only proper form has sparked more thoughts than I can sensibly put into a blog comment. Perhaps there are people who argue, "I won't pander to parochial English-speakers by trying to make my language conform to their limited ideas."

    With Welsh place names, the modern trend seems to be to break them up into smaller units than those shown in old maps, the opposite of the trend for American Indian (Native American? Amerind?) place names. I think this is to avoid English jokes about Welsh words that "go on forever".

  30. J. W. Brewer said,

    April 30, 2015 @ 10:54 am

    You can find plenty of instances of hyphenated "air-plane" from back circa the 1920's (when "aeroplane" was still also a major lexical rival). It's a reasonably common pattern in English for compounds to start as separate words, then become hyphenated, then drop the hyphen ("base ball" to "base-ball" to "baseball," for example — although "New-York" to "New York" goes the other direction). What typically changes over time, I think, is that the compound becomes more deeply lexicalized and, whether or not it's semantically compositional, is understood as a unit without any separate parsing of its components, and the orthography tends (probably with some lag time) to follow this. Since hyphenation in English tends to be both somewhat ad hoc and historically unstable for particular lexical items, it's maybe extra-hard (see what I did there? that was ad hoc, but not wrong) to come up with a principled and consistent way of using it to transcribe words from other languages not natively written in latin script. But it doesn't follow that "never hyphenate" is the best way of avoiding the difficulty.

  31. James Wimberley said,

    May 1, 2015 @ 11:33 am

    What's the history of the space, the standard marker of word division in modern Latin scripts? IIRC classical Roman inscriptions don't have spaces. You sometimes get dots. By Alcuin's time, the space is the standard. But was it his innovation?

  32. ohwilleke said,

    May 1, 2015 @ 3:26 pm

    Hyphenation of Korean given names is an issue in my family, where my children each have Korean double-barreled middle names.

    Their birth certificates dispense with middle name hyphens (a style choice made in part because they already had a hyphenated surname), but my wife (who is from the Korean side of the family) insists that hyphens are absolutely required in transliteration of a double-barreled Korean name such as Hei-Hyun.

  33. Jongseong Park said,

    May 1, 2015 @ 5:33 pm

    @ohwilleke: my wife (who is from the Korean side of the family) insists that hyphens are absolutely required in transliteration of a double-barreled Korean name such as Hei-Hyun.

    I don't know where she got the idea, but the way these things go it's usually because the way you've seen it done sticks with you. If you grow up used to seeing hyphenated given names, you tend to accept it without question, to the point that names that are not hyphenated look wrong to you. I was lucky to be born into a family that didn't use hyphens or spaces to write given names, but so many Koreans separate the syllables of their given names with hyphens or spaces that it is entirely believable that some people will grow up thinking that that is the only permissible way.

    However, for a variety of practical reasons (many of which are catalogued in this super-long blog post in Korean), the best solution is to do away with hyphens in writing given names. The simple fact that you are calling a hyphenated Korean name a double-barrelled name is a huge argument against it—properly speaking, Korean names are not double-barrelled since they don't combine two names like Anne-Marie or Spencer-Churchill. The syllables that constitute a Korean given name are never used on their own.

    In 2007, South Korea's passport authorities started to default to writing given names without hyphens or spaces according to this wiki article in Korean, in response to the sheer number of problems created by hyphenated or spaced given names, mainly because foreign authorities not familiar with Korean naming practices often did not recognize such forms as constituting single indivisible given names.

  34. Hwa Shi-Hsia said,

    May 4, 2015 @ 3:13 pm

    English-speaking Westerners don't seem to hyphenate personal names like John-Paul and Sarah-Jane any more either. I think it would look archaic.

    FWIW I like my hyphen but that's an idiosyncrasy that I wouldn't either recommend or deprecate to anyone else. From the typographical point of view, it was a pain to touch-type when I was a child with small hands. I still strike the hyphen/underscore, plus/equals, and Backspace with my ring finger rather than the little finger out of habit.
