{"id":1942,"date":"2009-12-06T17:04:53","date_gmt":"2009-12-06T21:04:53","guid":{"rendered":"http:\/\/languagelog.ldc.upenn.edu\/nll\/?p=1942"},"modified":"2009-12-07T09:31:12","modified_gmt":"2009-12-07T13:31:12","slug":"the-tiger-woods-index-one-more-time","status":"publish","type":"post","link":"https:\/\/languagelog.ldc.upenn.edu\/nll\/?p=1942","title":{"rendered":"The Tiger Woods Index, one more time"},"content":{"rendered":"<p>James Delingpole, \"<a href=\"http:\/\/blogs.telegraph.co.uk\/news\/jamesdelingpole\/100018847\/climategate-goes-uber-viral-gore-flees-leaving-evil-henchmen-to-defend-crumbling-citadel\/\/\">Climategate goes uber-viral, Gore flees leaving evil henchmen to defend crumbling citadel<\/a>\", The Telegraph, 12\/4\/2009:<\/p>\n<p style=\"padding-left: 30px;\"><span style=\"color: #ff0000;\">Climategate is now huge. Way, way bigger than the Mainstream Media (MSM) is admitting it is \u2013 as Richard North demonstrates <a href=\"http:\/\/eureferendum.blogspot.com\/2009\/12\/tiger-woods-index.html\">in this fascinating analysis<\/a>. Using what he calls a Tiger Woods Index (TWI), he compares the amount of interest being shown by internet users (as shown by the number of general web pages on Google) and compares it with the number of news reports recorded. The ratio indicates what people are really interested in, as opposed to what the MSM thinks they ought to be interested in.<\/span><\/p>\n<p><!--more--><br \/>\nRichard North's idea is to take the ratio of Google web hits to Google News hits; he gets 22.5 million web hits vs. 46,025 news hits for Tiger Woods (a ratio of 489), and compares some other topics (some of them a bit UK-centric):<\/p>\n<p>1. Climategate: 28,400,000 \u2013 2,930 = 9693<br \/>\n2. Afghanistan: 143,000,000 \u2013 154,145 = 928<br \/>\n3. Obama: 202,000,000 \u2013 252,583 = 800<br \/>\n4. Tiger Woods: 22,500,000 \u2013 46,025 = 489<br \/>\n5. Gordon Brown: 12,300,000 \u2013 37,021 = 332<br \/>\n6. 
Climate change: 22,200,000 \u2013 68,419 = 324<br \/>\n7. Sally Bercow: 25,000 \u2013 86 = 290<br \/>\n8. David Cameron: 545,000 \u2013 4837 = 113<br \/>\n9. Meredith Kercher: 261,000 \u2013 3,471 = 75<br \/>\n10. Chilcot Inquiry: 125,000 \u2013 4,350 = 29<\/p>\n<p>And the National Review Online has a <a href=\"http:\/\/planetgore.nationalreview.com\/post\/?q=MGM5OTAwMWY3YWYyYWY0ZWRmZDczNTdjYjY4OTc0YjI=%0A\">pie chart<\/a>.<\/p>\n<p>I'm glad to see this evidence of widening interest in empirical punditry. I'm a <a href=\"http:\/\/itre.cis.upenn.edu\/~myl\/languagelog\/archives\/000043.html\">long-time enthusiast<\/a> for web-search counts as a proxy for various cultural and political metrics &#8212; but I've also been warning for years that such metrics are <a href=\"http:\/\/itre.cis.upenn.edu\/~myl\/languagelog\/archives\/001837.html\">tricky<\/a> <a href=\"http:\/\/itre.cis.upenn.edu\/~myl\/languagelog\/archives\/001840.html\">and<\/a> <a href=\"http:\/\/itre.cis.upenn.edu\/~myl\/languagelog\/archives\/002511.html\">not<\/a> <a href=\"http:\/\/itre.cis.upenn.edu\/~myl\/languagelog\/archives\/002549.html\">always<\/a> <a href=\"http:\/\/itre.cis.upenn.edu\/~myl\/languagelog\/archives\/000194.html\">reliable<\/a>, and I always try to find confirmation from alternative approaches.<\/p>\n<p>In this case, an obvious thing to try is Google Trends, which gives numbers not from web pages but from web searches. This is a more direct indication of interest levels in the general population, many more of whom search the web than write for it; and I believe that the counts are more or less veridical (at least for the N highest-ranked searches) rather than estimated by complex and fallible algorithms.<\/p>\n<p>Here's the graphical display of four relevant searches over the past 30 days. 
(Click on the image for a larger version; look <a href=\"http:\/\/www.google.com\/intl\/en\/trends\/about.html#7\">here<\/a> for information about the meaning of the numbers.)<\/p>\n<p><a href=\"http:\/\/languagelog.ldc.upenn.edu\/myl\/GoogleTrends1.png\"><img decoding=\"async\" title=\"Click to embiggen\" src=\"http:\/\/languagelog.ldc.upenn.edu\/myl\/GoogleTrends1.png\" alt=\"\" width=\"475\" \/><\/a><\/p>\n<p>Google Trends also allows you to download the data in spreadsheet form, which is helpful because the overall numbers for the past 30 days, relative to the number of searches for climategate (climategate 1, \"Tiger Woods\" 33, Afghanistan 7, Obama 32) are a bit unfair to climategate, since it had non-zero (recorded) searches only for Nov. 23 to Dec. 3 (the last day for which the numbers are now available), while the other search terms had non-zero values for all the surveyed days.<\/p>\n<p>So here are the numbers from downloading a Google Trends CSV file with fixed scaling, and pulling out just those 11 days:<\/p>\n<table style=\"text-align: center;\" border=\"1\">\n<tbody>\n<tr>\n<td><\/td>\n<td>climategate<\/td>\n<td>Tiger 
Woods<\/td>\n<td>Afghanistan<\/td>\n<td>Obama<\/td>\n<\/tr>\n<tr>\n<td>11\/23<\/td>\n<td>2<\/td>\n<td>4<\/td>\n<td>12<\/td>\n<td>40<\/td>\n<\/tr>\n<tr>\n<td>11\/24<\/td>\n<td>4<\/td>\n<td>4<\/td>\n<td>12<\/td>\n<td>64<\/td>\n<\/tr>\n<tr>\n<td>11\/25<\/td>\n<td>4<\/td>\n<td>4<\/td>\n<td>10<\/td>\n<td>282<\/td>\n<\/tr>\n<tr>\n<td>11\/26<\/td>\n<td>2<\/td>\n<td>4<\/td>\n<td>10<\/td>\n<td>118<\/td>\n<\/tr>\n<tr>\n<td>11\/27<\/td>\n<td>4<\/td>\n<td>166<\/td>\n<td>8<\/td>\n<td>66<\/td>\n<\/tr>\n<tr>\n<td>11\/28<\/td>\n<td>4<\/td>\n<td>196<\/td>\n<td>8<\/td>\n<td>50<\/td>\n<\/tr>\n<tr>\n<td>11\/29<\/td>\n<td>6<\/td>\n<td>184<\/td>\n<td>10<\/td>\n<td>44<\/td>\n<\/tr>\n<tr>\n<td>11\/30<\/td>\n<td>6<\/td>\n<td>200<\/td>\n<td>14<\/td>\n<td>44<\/td>\n<\/tr>\n<tr>\n<td>12\/01<\/td>\n<td>6<\/td>\n<td>160<\/td>\n<td>18<\/td>\n<td>52<\/td>\n<\/tr>\n<tr>\n<td>12\/02<\/td>\n<td>6<\/td>\n<td>302<\/td>\n<td>32<\/td>\n<td>88<\/td>\n<\/tr>\n<tr>\n<td>12\/03<\/td>\n<td>6<\/td>\n<td>322<\/td>\n<td>20<\/td>\n<td>50<\/td>\n<\/tr>\n<tr>\n<td>Total<\/td>\n<td>50<\/td>\n<td>1546<\/td>\n<td>154<\/td>\n<td>898<\/td>\n<\/tr>\n<tr>\n<td>Ratio to \"climategate\"<\/td>\n<td>1<\/td>\n<td>30.9<\/td>\n<td>3.1<\/td>\n<td>18.0<\/td>\n<\/tr>\n<tr>\n<td>Ratio to \"Tiger Woods\"<\/td>\n<td>0.03<\/td>\n<td>1<\/td>\n<td>0.10<\/td>\n<td>0.58<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>So let's compare North's Google News counts (and current Google News counts, just for grins) to the Google Trends figures:<\/p>\n<table style=\"text-align: center;\" border=\"1\">\n<tbody>\n<tr>\n<td><\/td>\n<td>News (North)<\/td>\n<td>News (Now)<\/td>\n<td>Trends<\/td>\n<td>North\/Trends<\/td>\n<td>Now\/Trends<\/td>\n<\/tr>\n<tr>\n<td>climategate<\/td>\n<td>2,930<\/td>\n<td>5,197<\/td>\n<td>50<\/td>\n<td>58.6<\/td>\n<td>103.9<\/td>\n<\/tr>\n<tr>\n<td>Tiger 
Woods<\/td>\n<td>46,025<\/td>\n<td>52,833<\/td>\n<td>1,546<\/td>\n<td>29.8<\/td>\n<td>34.2<\/td>\n<\/tr>\n<tr>\n<td>Afghanistan<\/td>\n<td>154,145<\/td>\n<td>166,063<\/td>\n<td>154<\/td>\n<td>1,001<\/td>\n<td>1,078<\/td>\n<\/tr>\n<tr>\n<td>Obama<\/td>\n<td>252,583<\/td>\n<td>288,210<\/td>\n<td>898<\/td>\n<td>281<\/td>\n<td>321<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>If we re-run the \"Tiger Woods Index\" in terms of the ratio of news stories to Google Trends numbers, we see that climategate is getting between two and three times more press than Tiger is, relative to public interest (58.6\/29.8 = 1.966; 103.9\/34.2 = 3.038).\u00a0 Of course, Afghanistan racks up 1078\/34.2 = 31.5 on the TWI, and Obama's at 321\/34.2 = 9.39.<\/p>\n<p>In order to make the number comparable to North's, we need to take the ratio of the proxy for level of public attention (here Google Trends numbers) to the proxy for MSM attention (here Google News numbers).\u00a0 In order to get the numbers into the same general range, I'll re-scale North's index as the ratio of Google web hits in millions to Google News hits in thousands; and I'll scale my index as the ratio of fixed-scaling Google Trends sums to Google News hits in thousands:<\/p>\n<table style=\"text-align: center;\" border=\"1\">\n<tbody>\n<tr>\n<td><\/td>\n<td>Web (M)<\/td>\n<td>News (K)<\/td>\n<td>Trends sum<\/td>\n<td>North<\/td>\n<td>MYL<\/td>\n<\/tr>\n<tr>\n<td>climategate<\/td>\n<td>30<\/td>\n<td>5.2<\/td>\n<td>50<\/td>\n<td>5.8<\/td>\n<td>9.6<\/td>\n<\/tr>\n<tr>\n<td>Tiger Woods<\/td>\n<td>12.4<\/td>\n<td>53.0<\/td>\n<td>1546<\/td>\n<td>0.2<\/td>\n<td>29.2<\/td>\n<\/tr>\n<tr>\n<td>Afghanistan<\/td>\n<td>28.1<\/td>\n<td>169.9<\/td>\n<td>154<\/td>\n<td>0.1<\/td>\n<td>0.9<\/td>\n<\/tr>\n<tr>\n<td>Obama<\/td>\n<td>41.1<\/td>\n<td>273.2<\/td>\n<td>898<\/td>\n<td>0.2<\/td>\n<td>3.3<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>What does all that mean? 
My index says that major geopolitical events get a lot of press relative to the rate of web search; celebrity scandals, contrary to what you might think, not so much. And as for \"climategate\", it's kind of in between Tiger Woods and Obama, in terms of the ratio of web searches to news stories. \u00a0That seems about right to me.<\/p>\n<p>Why do Mr. North's index numbers look so different? In the first place, for things that have been in the news for years (Tiger Woods, Afghanistan, Obama), the total number of web pages out there is not a very good index of the current level of short-term public interest. A term like \"climategate\", which didn't exist until a few days ago, is a different matter. And it's certainly impressive for it to have racked up tens of millions of web hits so quickly.<\/p>\n<p>Caveat: Google's hit counts are extrapolated from samples by means of an algorithm that might over-estimate the total number of pages for a term that has increased very rapidly in the recent past. (Only 716 of the \"30,000,000\" hits are actually available in the index.) But if we take the counts at face value, then apparently there are a lot of people generating a lot of pages about climategate, but not all that many people trying to find out about it.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>James Delingpole, \"Climategate goes uber-viral, Gore flees leaving evil henchmen to defend crumbling citadel\", The Telegraph, 12\/4\/2009: Climategate is now huge. Way, way bigger than the Mainstream Media (MSM) is admitting it is \u2013 as Richard North demonstrates in this fascinating analysis. 
Using what he calls a Tiger Woods Index (TWI), he compares the amount [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[16,26],"tags":[],"class_list":["post-1942","post","type-post","status-publish","format-standard","hentry","category-language-and-politics","category-language-and-the-media"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/posts\/1942","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1942"}],"version-history":[{"count":0,"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=\/wp\/v2\/posts\/1942\/revisions"}],"wp:attachment":[{"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1942"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/languagelog.ldc.upenn.edu\/nll\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1942"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/lang
uagelog.ldc.upenn.edu\/nll\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1942"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}