{"id":39556,"date":"2018-08-10T18:07:02","date_gmt":"2018-08-10T23:07:02","guid":{"rendered":"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=39556"},"modified":"2018-08-10T18:07:02","modified_gmt":"2018-08-10T23:07:02","slug":"hot-words","status":"publish","type":"post","link":"https:\/\/languagelog.ldc.upenn.edu\/nll\/?p=39556","title":{"rendered":"Hot words"},"content":{"rendered":"<p>It is my solemn duty to call the attention of Language Log readers to a seriously deficient BBC article:<\/p>\n<p style=\"padding-left: 30px;\">\"<a href=\"http:\/\/www.bbc.com\/capital\/story\/20180809-chinas-rebel-generation-and-the-rise-of-hot-words\">China's rebel generation and the rise of 'hot words'<\/a>\", by <span class=\"index-body\">Kerry Allen with additional reporting from Stuart Lau\u00a0<\/span><span class=\"publication-date index-body\">(8\/10\/18).\u00a0 <\/span><span class=\"publication-date index-body\"><em><br \/>\n<\/em><\/span><\/p>\n<p style=\"padding-left: 30px;\"><span class=\"publication-date index-body\"><em>Language Matters is a new column from BBC Capital exploring how evolving language will influence the way we work and live.<\/em><\/span><\/p>\n<p>Even though the article annoyed me greatly, I probably wouldn't have written a post about it on the basis of the flimsy substance of the last 23 paragraphs were it not for the outrageous first paragraph, which really requires refutation.<\/p>\n<p><!--more--><\/p>\n<p>Before I dissect the first paragraph, however, I need to point out the erroneous premises built into the capsule description of this new series following the title of the article.<\/p>\n<p>All right, \"Language Matters\" is cutesy, what with the dual nounal and verbal meanings of the second word, but it's too closely modeled on some current politically sensitive slogans for comfort.\u00a0 Then I'm troubled by the future tense of \"will influence\".\u00a0 Neither the present article nor any other article about \"evolving language\" that I can imagine 
will be able to predict the way we live and work in the future.\u00a0 It's hard enough just to figure out how the current stage of a language reflects the way we are living and working in the present.<\/p>\n<p>Now, moving on to the disastrous first paragraph:<\/p>\n<p style=\"padding-left: 30px;\">Mandarin Chinese is one of the most complex languages in the world. Opening a Chinese dictionary, you find around 370,000 words. That's more than double the number of words in the Oxford English dictionary, and almost three times those in French and Russian dictionaries.<\/p>\n<p>The initial sentence is incredibly lame. It says nothing.\u00a0 Flunk.<\/p>\n<p>All languages are complex in one way or another:\u00a0 phonology, morphology, grammar, syntax&#8230; &#8212; you name it.\u00a0 If it's a real language that people rely on for all of their needs and transactions, it's bound to be as complicated as life itself.\u00a0 To tell the truth, I've always felt that Mandarin is one of the simplest and easiest languages I've ever learned.\u00a0 See, inter alia, \"<a title=\"Permanent link to Difficult languages and easy languages\" href=\"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=31341\" rel=\"bookmark\">Difficult languages and easy languages<\/a>\" (3\/4\/17) &#8212; I still owe Language Log readers the results of the survey taken in that post; I have all the data, just need to type them up.<\/p>\n<p>The second sentence is worse.\u00a0 The number of words in a language is no index of its complexity.\u00a0 The active spoken vocabulary of most people is not going to exceed much more than about 5,000 words and they might use twice that amount in their writing.\u00a0 Unless their name is William F. Buckley, Jr. 
or they are someone extremely rare like him, even highly educated people are unlikely to have more than 20,000 words in their spoken vocabulary and 40,000 or so words in their written vocabulary.\u00a0 <a href=\"https:\/\/kottke.org\/10\/04\/how-many-words-did-shakespeare-know\">Shakespeare knew and used 31,534 words<\/a>, though he probably knew another 35,000 or so words that he didn't include in his published works, making a total of around 65,000 words.<\/p>\n<p>Even if \"Chinese\" really did have 370,000 words, that wouldn't tell us anything about vocabulary size for individuals.\u00a0 But \"Chinese\" doesn't have 370,000 words.\u00a0 <span lang=\"zh-Latn-pinyin\"><span lang=\"zh-Hans\">The authors must have gotten their fantastical figure of 370,000 from the number of entries in <\/span><\/span><span lang=\"zh-Latn-pinyin\"><span lang=\"zh-Hans\"><a href=\"https:\/\/en.wikipedia.org\/wiki\/Hanyu_Da_Cidian\">H\u00e0ny\u01d4 D\u00e0 C\u00eddi\u01cen \u6f22\u8a9e\u5927\u8a5e\u5178<\/a> (Unabridged Dictionary of Sinitic) (1986-1994), but that is a dictionary based on historical principles, and most of its entries are no longer current.\u00a0 It's hugely misleading to casually speak of \"Opening a Chinese dictionary\" in this instance, since <\/span><\/span><span lang=\"zh-Latn-pinyin\"><span lang=\"zh-Hans\">H\u00e0ny\u01d4 D\u00e0 C\u00eddi\u01cen \u6f22\u8a9e\u5927\u8a5e\u5178 (Unabridged Dictionary of Sinitic) isn't just any old, typical \"Chinese\" dictionary.\u00a0 For Sinitic, it's the closest thing to an equivalent of the OED, requiring the mobilization of more than a thousand researchers over a period of nearly two decades for its compilation.<br \/>\n<\/span><\/span><\/p>\n<p>The 7th edition (2016) of the<i> <\/i><span lang=\"zh-Latn-pinyin\"><a href=\"https:\/\/en.wikipedia.org\/wiki\/Xiandai_Hanyu_Cidian\">Xi\u00e0nd\u00e0i H\u00e0ny\u01d4 C\u00eddi\u01cen<\/a><\/span><i> <\/i><span 
lang=\"zh-Hans\"><a href=\"https:\/\/en.wikipedia.org\/wiki\/Xiandai_Hanyu_Cidian\">\u73b0\u4ee3\u6c49\u8bed\u8bcd\u5178<\/a> (Dictionary of Contemporary Sinitic), the standard and most authoritative dictionary of Modern Standard Mandarin (MSM), has around 70,000 entries.\u00a0 That fits comfortably in the realm of vocabulary size for highly educated individuals that I described above.<br \/>\n<\/span><\/span><\/p>\n<p><span lang=\"zh-Latn-pinyin\"><span lang=\"zh-Hans\">Now, to assert that \"Chinese\" has<\/span><\/span><span lang=\"zh-Latn-pinyin\"><span lang=\"zh-Hans\"> \"more than double the number of words in the Oxford English dictionary\" is both bad mathematics and contrary to fact &#8212; even if we accept the fictitious claim that MSM has 370,000 words, which it most certainly does not.<\/span><\/span><\/p>\n<p>\"How many words are there in the English language?\" from the <a href=\"https:\/\/en.oxforddictionaries.com\/explore\/how-many-words-are-there-in-the-english-language\/\">Oxford Dictionaries website<\/a> (see especially the last paragraph):<\/p>\n<p style=\"padding-left: 30px;\">There is no single sensible answer to this question. It's impossible to count the number of words in a language, because it's so hard to decide what actually counts as a word. Is <em>dog<\/em> one word, or two (a noun meaning 'a kind of animal', and a verb meaning 'to follow persistently')? If we count it as two, then do we count inflections separately too (e.g. <em>dogs<\/em> = plural noun, <em>dogs<\/em> = present tense of the verb). Is <em>dog-tired<\/em> a word, or just two other words joined together? Is <em>hot dog<\/em> really two words, since it might also be written as <em>hot-dog<\/em> or even <em>hotdog<\/em>?<\/p>\n<p style=\"padding-left: 30px;\">It's also difficult to decide what counts as 'English'. What about medical and scientific terms? 
Latin words used in law, French words used in cooking, German words used in academic writing, Japanese words used in martial arts? Do you count Scots dialect? Teenage slang? Abbreviations?<\/p>\n<p style=\"padding-left: 30px;\">The Second Edition of the 20-volume <em>Oxford English Dictionary<\/em> contains full entries for 171,476 words in current use, and 47,156 obsolete words. To this may be added around 9,500 derivative words included as subentries. Over half of these words are nouns, about a quarter adjectives, and about a seventh verbs; the rest is made up of exclamations, conjunctions, prepositions, suffixes, etc. And these figures don't take account of entries with senses for different word classes (such as noun and adjective).<\/p>\n<p style=\"padding-left: 30px;\">This suggests that there are, at the very least, a quarter of a million distinct English words, excluding inflections, and words from technical and regional vocabulary not covered by the <em>OED<\/em>, or words not yet added to the published dictionary, of which perhaps 20 per cent are no longer in current use. 
If distinct senses were counted, the total would probably approach three quarters of a million.<\/p>\n<p>Enough said on that score.\u00a0 Now what about the other 23 paragraphs of the article, which is what it's really about &#8212; r\u00e8 c\u00ed \u70ed\u8bcd (\"hot words\")?\u00a0 In a word, it's all very confused and confusing, muddled at best.\u00a0 I pity anyone unfamiliar with Chinese who slogs through it, because they will be flooded with misinformation, imprecision, and obfuscation about what the language is today and how it works.<\/p>\n<p>The article is a veritable mess.<\/p>\n<p>The authors do offer a fair number of more or less clever paraphrase translations like \"freedamn\" for Zh\u014dnggu\u00f3 t\u00e8s\u00e8 z\u00ecy\u00f3u \u4e2d\u56fd\u7279\u8272\u81ea\u7531 (\"freedom with Chinese characteristics\") and \"smilence\" for xi\u00e0o \u00e9r b\u00f9 y\u01d4 \u7b11\u800c\u4e0d\u8bed (\"laugh without speaking\"), though some of these are flops, and one doesn't always know where they come from.\u00a0 Furthermore, they instance a lot of \"hot words\" &#8212; some of which (such as \"<a href=\"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=2858#comment-98896\">niubility<\/a>\") are decidedly cool by now &#8212; without translating them or giving an idea of what they mean.<\/p>\n<p>Jonathan Smith, in calling this article to my attention, notes:<\/p>\n<p style=\"padding-left: 30px;\">&#8230;[T]he weird thing is the screwed-up non-translations&#8230;.<\/p>\n<p style=\"padding-left: 30px;\">An important point that is not made clear is whether these \"hot words\" are Chinese with English glosses, English with Chinese glosses or some combination&#8230;. 
I suppose \"Chinsumers (z\u00e0iw\u00e0i f\u0113ngku\u00e1ng g\u00f2uw\u00f9 de Zh\u014dnggu\u00f3 r\u00e9n \u5728\u5916\u75af\u72c2\u8d2d\u7269\u7684\u4e2d\u56fd\u4eba)\", \"departyment (zh\u00e8ngf\u01d4 b\u00f9m\u00e9n \u653f\u5e9c\u90e8\u95e8)\", \"innernet (Zh\u014dnggu\u00f3 h\u00f9li\u00e1nw\u01ceng \u4e2d\u56fd\u4e92\u8054\u7f51)\", etc., are the latter but with no attempt whatsoever at marking the puns within the Chinese translations&#8230; whereas \"harmany (Zh\u014dnggu\u00f3 t\u00e8s\u00e8 h\u00e9xi\u00e9 \u4e2d\u56fd\u7279\u8272\u548c\u8c10)\" is the former with a (bad) attempt at marking the funny in English&#8230; etc.<\/p>\n<p>The most valuable aspect of the article is its account of how netizens use r\u00e8 c\u00ed \u70ed\u8bcd (\"hot words\") to circumvent China's ubiquitous internet censors.\u00a0 But I'm afraid that this point will be lost in the welter of bewilderment that suffuses the entire piece, from beginning to end.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It is my solemn duty to call the attention of Language Log readers to a seriously deficient BBC article: \"China's rebel generation and the rise of 'hot words'\", by Kerry Allen with additional reporting from Stuart Lau\u00a0(8\/10\/18).\u00a0 Language Matters is a new column from BBC Capital exploring how evolving language will influence the way we 
[&hellip;]<\/p>\n","protected":false},"author":13,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[16,190,208],"tags":[],"class_list":["post-39556","post","type-post","status-publish","format-standard","hentry","category-language-and-politics","category-neologisms","category-puns"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/posts\/39556","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/users\/13"}],"replies":[{"embeddable":true,"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=39556"}],"version-history":[{"count":4,"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/posts\/39556\/revisions"}],"predecessor-version":[{"id":39592,"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/posts\/39556\/revisions\/39592"}],"wp:attachment":[{"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=39556"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=%
2Fwp%2Fv2%2Fcategories&post=39556"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=39556"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}