{"id":24596,"date":"2016-03-11T16:45:18","date_gmt":"2016-03-11T21:45:18","guid":{"rendered":"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=24596"},"modified":"2016-03-12T06:53:14","modified_gmt":"2016-03-12T11:53:14","slug":"the-most-kasichoid-cruzian-trumpish-and-rubiositous-words","status":"publish","type":"post","link":"https:\/\/languagelog.ldc.upenn.edu\/nll\/?p=24596","title":{"rendered":"The most Kasichoid, Cruzian, Trumpish, and Rubiositous words"},"content":{"rendered":"<p>I didn't watch <a href=\"http:\/\/www.cnn.com\/2016\/03\/10\/politics\/republican-debate-transcript-full-text\/index.html\" target=\"_blank\">last night's Republican debate in Miami<\/a>. Apparently it was a relatively sober affair &#8212; there were no penis comparisons, no one called anyone else a liar or a fraud or a con-man, there was hardly even any\u00a0shouting or interrupting.<\/p>\n<p>But several people have asked for a reprise of the type of analysis that I did back in September to compare Donald Trump's lexicon with Jeb Bush's (\"<a href=\"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=21068\" target=\"_blank\">The most Trumpish (and Bushish) words<\/a>\", 9\/5\/2015). So here it is, just for the words used in that 3\/10\/2016 debate.<\/p>\n<p><!--more--><\/p>\n<p>First, the overall word counts:<\/p>\n<table border=\"1\" cellpadding=\"2\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\">Kasich<\/td>\n<td style=\"text-align: center;\">3,172<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Cruz<\/td>\n<td>3,677<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Trump<\/td>\n<td>5,114<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">Rubio<\/td>\n<td style=\"text-align: center;\">4,701<\/td>\n<\/tr>\n<tr>\n<td>All (including moderators)<\/td>\n<td>21,117<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>For each of the four candidates,\u00a0I calculated the \"weighted log-odds-ratio, informative Dirichlet prior\", using the algorithm described on p. 
387-8 of Monroe, Colaresi &amp; Quinn \"<a href=\"http:\/\/languagelog.ldc.upenn.edu\/myl\/Monroe.pdf\" target=\"_blank\">Fightin' Words: Lexical Feature Selection and Evaluation for Identifying the Content of Political Conflict<\/a>\", <em>Political Analysis\u00a0<\/em>2009. In each case, I used the overall word counts from the debate as the prior, and compared the selected candidate's counts to the counts of the other three candidates, given the weighting prescribed by Monroe et al.'s algorithm:<\/p>\n<p>Thus the 20 most Trumpish words were:<\/p>\n<div style=\"font-size: 10pt;\">\n<pre>i        196 (38326.2) 217 (18787.9) 470 (22256.9) 3.546\r\nvery      41 (8017.21) 17 (1471.86) 65 (3078.09) 3.248\r\ndeal      33 (6452.87) 19 (1645.02) 54 (2557.18) 2.495\r\ndon't     43 (8408.29) 33 (2857.14) 83 (3930.48) 2.358\r\nthink     31 (6061.79) 19 (1645.02) 56 (2651.89) 2.341\r\nwant      34 (6648.42) 24 (2077.92) 87 (4119.9) 2.335\r\never      14 (2737.58) 3 (259.74) 17 (805.039) 2.295\r\nnobody    10 (1955.42) 0 (0) 13 (615.618) 2.278\r\nted       11 (2150.96) 1 (86.5801) 13 (615.618) 2.257\r\ntremendous 9 (1759.87) 0 (0) 9 (426.197) 2.236\r\nlove      11 (2150.96) 2 (173.16) 13 (615.618) 2.096\r\nsomething 11 (2150.96) 2 (173.16) 17 (805.039) 2.044\r\nbid        7 (1368.79) 0 (0) 7 (331.486) 1.972\r\nmean      14 (2737.58) 6 (519.481) 24 (1136.53) 1.868\r\nit       119 (23269.5) 164 (14199.1) 318 (15059) 1.867\r\nmany      17 (3324.21) 10 (865.801) 31 (1468.01) 1.776\r\nstrong     9 (1759.87) 2 (173.16) 16 (757.683) 1.774\r\nnever     12 (2346.5) 5 (432.9) 17 (805.039) 1.755\r\nbeat       8 (1564.33) 2 (173.16) 10 (473.552) 1.678\r\ngood      17 (3324.21) 11 (952.381) 30 (1420.66) 1.668<\/pre>\n<\/div>\n<p>Each line has the form<\/p>\n<div style=\"font-size: 9pt;\">\n<pre>WORD \u00a0count1 (permillion1) count2 (permillion2) count3 (permillion3) WLO<\/pre>\n<\/div>\n<p>where<\/p>\n<ul>\n<li><strong>count1<\/strong> is the number of times the word was used by the 
selected\u00a0candidate<\/li>\n<li><strong>(permillion1)<\/strong> is count1 expressed as frequency per million words<\/li>\n<li><strong>count2<\/strong>\u00a0and (permillion2)\u00a0are\u00a0the\u00a0same things for the other three candidates<\/li>\n<li><strong>count3<\/strong> &amp; <strong>(permillion3)<\/strong> are the same things for the debate transcript as a whole<\/li>\n<li><strong>WLO<\/strong> is the \"weighted log odds\" as per the Monroe et al. algorithm<\/li>\n<\/ul>\n<p>So last night, the 20 <a href=\"http:\/\/languagelog.ldc.upenn.edu\/myl\/KASICH_RepublicanDebate031016.txt\">most Kasichoid words<\/a> were<\/p>\n<p class=\"p1\" style=\"padding-left: 30px;\"><span class=\"s1\">ohio you worried balanced standards ought the thank senator state mr budget secondly college kids positive trump them veteran ourselves<\/span><\/p>\n<p class=\"p1\">And the 20 least Kasichoid words were<\/p>\n<p class=\"p1\" style=\"padding-left: 30px;\"><span class=\"s1\">'s is it deal very good going never millions he bad example am love made now are think nothing tax<\/span><\/p>\n<p class=\"p1\">The 20 <a href=\"http:\/\/languagelog.ldc.upenn.edu\/myl\/CRUZ_RepublicanDebate031016.txt\" target=\"_blank\">most Cruzian words<\/a> were<\/p>\n<p class=\"p1\" style=\"padding-left: 30px;\"><span class=\"s1\">donald you clinton washington who need obama hillary defend solution hard he immigration terrorist murder ayatollah nuclear billions tax point <\/span><\/p>\n<p class=\"p1\">And the 20 least Cruzian words were<\/p>\n<p class=\"p1\" style=\"padding-left: 30px;\"><span class=\"s1\">they i it don't deal all way make no going good there than great mean get things love maybe very <\/span><\/p>\n<p class=\"p1\">The 20 <a href=\"http:\/\/languagelog.ldc.upenn.edu\/myl\/TRUMP_RepublicanDebate031016.txt\" target=\"_blank\">most Trumpish words<\/a> were<\/p>\n<p class=\"p1\" style=\"padding-left: 30px;\"><span class=\"s1\"> i very deal don't think want ever nobody ted tremendous love 
something bid mean it many strong never beat good<\/span><\/p>\n<p class=\"p1\">And the 20 least Trumpish words were<\/p>\n<p class=\"p1\" style=\"padding-left: 30px;\"><span class=\"s1\">need when in my the who working america american with on to century kids law debt generation retire veteran us <\/span><\/p>\n<p class=\"p1\">The 20 <a href=\"http:\/\/languagelog.ldc.upenn.edu\/myl\/RUBIO_RepublicanDebate031016.txt\" target=\"_blank\">most Rubiositous words<\/a> were<\/p>\n<p class=\"p1\" style=\"padding-left: 30px;\"><span class=\"s1\">senator in mr trump thank governor florida v.a program issue bipartisan my retire issues cruz debt century budget law miami <\/span><\/p>\n<p class=\"p1\">And the 20 least Rubiositous words were<\/p>\n<p class=\"p1\" style=\"padding-left: 30px;\"><span class=\"s1\">i very we and think tax ever hillary iran tremendous care don't beat was bid ted many donald bad deal <\/span><\/p>\n<p class=\"p1\">\n","protected":false},"excerpt":{"rendered":"<p>I didn't watch last night's Republican debate in Miami. Apparently it was a relatively sober affair &#8212; there were no penis comparisons, no one called anyone else a liar or a fraud or a con-man, there was hardly even any\u00a0shouting or interrupting. 
But several people have asked for a reprise of the type of analysis [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[16],"tags":[],"class_list":["post-24596","post","type-post","status-publish","format-standard","hentry","category-language-and-politics"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/posts\/24596","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=24596"}],"version-history":[{"count":10,"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/posts\/24596\/revisions"}],"predecessor-version":[{"id":24607,"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/posts\/24596\/revisions\/24607"}],"wp:attachment":[{"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=24596"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/ind
ex.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=24596"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=24596"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}