Sirte, Texas


According to Ben Zimmer, I'm writing from the front lines. But it's pretty quiet here, sitting at home in Texas, looking at tweets that have come out of Libya in the last couple of weeks. And somehow I don't think I'll be the first twitterologist to suffer from combat fatigue. Maybe that's because my students Joey Frazee and Chris Brown, together with our collaborator Xiong Liu, have been the ones doing computational battle in our little research team. That and the fact that nobody is firing mortars around here.

Yet quiet as it is where I'm sitting, it's a startling fact that today it's easy to hear far-off clamor, to listen to the online noise made by thousands of ordinary people. Ordinary people in war zones. What are those people thinking?

That's the basic question. What are all those people thinking? One can't help wondering. And yet the wondering doesn't lead anywhere. All the social media chatter is tantalizing, but if you're anything like me, plugging into it just gives you a headache. Lots and lots of voices, no easy way to tell who or what matters, no easy way to pick out a tune above the cacophony. No easy way to even understand the languages of most of those people directly affected by wars that are raging.

It's against that background that a group of us have been studying ways to make sense of large amounts of language data generated by people on the ground in Libya. If you're interested in details, I've made a very preliminary report available here. We've been using a straightforward three-step process: (i) get lots of tweets in Arabic from roughly the right area, (ii) machine translate them, and (iii) analyze the results by counting various types of words that occur in the translations. A crude process, but it has the obvious advantage of using only tools that exist right now. And it gets results.
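For readers who want a feel for step (iii), here is a minimal sketch in Python. It assumes the tweets have already been collected and machine translated into English, and it uses a tiny invented word list (`TOY_LEXICON`) rather than the lexicon behind the report. The function names and the net score (positive minus negative hits, divided by total words) are illustrative assumptions, not a description of the team's actual code.

```python
import re
from collections import Counter

# Toy word lists invented for illustration; NOT the lexicon used in the report.
TOY_LEXICON = {
    "positive": {"happy", "joy", "free", "victory", "thank"},
    "negative": {"sad", "fear", "kill", "dead", "angry"},
}


def tokenize(text):
    """Lowercase the text and split it into simple alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_counts(text, lexicon=TOY_LEXICON):
    """Count how many tokens of `text` fall into each lexicon category."""
    counts = Counter()
    for token in tokenize(text):
        for category, words in lexicon.items():
            if token in words:
                counts[category] += 1
    return counts


def net_sentiment(texts, lexicon=TOY_LEXICON):
    """Return (positive hits - negative hits) / total tokens for a batch of texts."""
    total_tokens = 0
    counts = Counter()
    for text in texts:
        total_tokens += len(tokenize(text))
        counts += category_counts(text, lexicon)
    if total_tokens == 0:
        return 0.0  # no words at all, so no signal either way
    return (counts["positive"] - counts["negative"]) / total_tokens


if __name__ == "__main__":
    # Hypothetical translated tweets, made up for this example.
    batch = ["Thank God, we are free", "I fear they will kill us"]
    print(net_sentiment(batch))
```

Scoring one batch per time window (say, all tweets from a given hour) would give a series of numbers that could be plotted against time. Every word counts equally here, so the sketch inherits the crudeness of simple word counting.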

To see what I mean, look at the following graph. It shows how the use of positive and negative emotion words in about 5000 tweets changed in Libya just over a week ago: the higher the peak, the more positive the overall sentiment. Without me telling you any more, you can see from the graph exactly when something remarkable happened:

Libya sentiment graph

You won't be surprised when I tell you what event coincides with the most obvious peak in positive emotion as well as in volume of tweet traffic: Gaddafi's death. Specifically, the vertical dashed black line marks the time when news of Gaddafi's capture and death was first made public. And I'm just realizing how much I like the way that Joey, who made this graph, marked the high volume periods of tweets in red. The deposed leader's blood seems to drip from the sharp tip of the Libyan people's joyous cries. (For caveats, more facts, and less fervid embellishment, see the report!)


  1. AntC said,

    October 31, 2011 @ 3:31 am

    (ii) machine translate them
    errm, why not analyse the texts in Arabic? (The keywords will have the same frequencies.) Doesn't the machine translation risk confounding the data?

    We have three Arabic speakers working with us. One is a coauthor on the report I pointed readers to, and has been checking and interpreting the results.

    As noted in the report, we already have tools for doing some analyses directly in Arabic, but not for the analyses we performed here. We analyzed the data on multiple dimensions (positive and negative emotion, religion, anger, death) with lexicons that do not yet have counterparts in Arabic. We would like to develop those Arabic counterparts. But for rapid analysis of very recent events, that wasn't an option.

    And yes, going into this we had no idea to what extent machine translation would confound this data set. Which is why we have Arabic speakers to check the results. Results of checking so far: I'd give us a B+. (Or maybe an encouraging A- for effort.) That is to say, and as noted in the report, the methodology certainly introduces noise, but not so much as to hide the signal.
    – dib

  2. D said,

    October 31, 2011 @ 4:23 am

    Yeah… Aren't there any Arabic-speaking linguists/linguistics students who could've helped you here? Sorry to say, but this approach feels slightly dubious, and a tad arrogant.

    I'd be disappointed if I found myself busy with research that was not slightly dubious and a tad arrogant, so I thank you for your kind support.

    As for your biased question: it is based on a faulty assumption, as you would have found out if you had read the report. One coauthor is an Arabic speaker who's been evaluating the results, and we're working with other Arabic speakers. The report also contains discussion of multiple reasons to be dubious, though unfortunately no direct proof of arrogance. Perhaps readers will supply that.
    – dib

  3. The Ridger said,

    October 31, 2011 @ 4:55 am

    He did say it was "crude" – and I'm willing to allow that they looked for Arabic speakers. Maybe they'll even get some now that they're doing their first reporting.

    Or maybe we already have some. -dib

  4. GeorgeW said,

    October 31, 2011 @ 5:49 am

    Thanks for the interesting post.

    At the risk of piling on, I question if the English translations would have the same emotional force as the Arabic words and answer the basic question, "What are all those people thinking?"

    I would also be interested to know if any weight was given to the words in each category as some words have more emotional force than others.

    It's undoubtedly the case that emotional force of words in Arabic is often very different to the force of those words in their translations. However, our preliminary results show that the emotional force is often comparable, which is why the results are still usable.

    Re. your weighting question: no, our approach doesn't have different weighting for different words. It would certainly be good to have such weightings, if they were known. Note that a great deal of sentiment analysis work, like ours, is based on simple word counts. However, sentiment classifiers derived using machine learning techniques commonly do have implicit weightings on words and other features. Such techniques were not available to us since we did not have a relevant large set of examples with gold standard sentiment annotations. -dib

  5. Eric P Smith said,

    October 31, 2011 @ 6:09 am

    If the vertical dashed black line marks the time when news of Gadaffi's capture and death were first made public, then why does the graph rise sharply just before that time?

    We've been looking at the timing, and so far haven't found anything very significant. Around the time of the APR announcement we used for that black line, sentiment was already changing, but the APR announcement wasn't the first announcement of Gadaffi's capture. We still need to look closely at the content of the pre-announcement positive-sentiment tweets. -dib

  6. jo said,

    October 31, 2011 @ 7:46 am

    @Eric P Smith

    The explanation of the vertical dashed black line in the post above ("the time when news of Gadaffi's capture and death were first made public") is subtly different from that in the report itself, where it is described as "the time Gadaffi's death became official" (p.2). Considering the latter, presumably the peak before the black line is mostly due to a flood of speculation and early reports about his death or capture that preceded the official confirmation.

  7. Trimegistus said,

    October 31, 2011 @ 7:47 am

    Eric: It's either the spacing between data points, or Libyan telepaths.

  8. Lauren said,

    October 31, 2011 @ 3:59 pm

    Thanks for sharing this data so soon after the events in question – it's great to see people not only using the internet as a data source, but also as a way of sharing research!

    My only concern would be that people don't necessarily always use "positive" words when they're celebrating "positive" events – especially when those events involve death like this one. I guess that those tweets that are particularly negative "death to the evil war-merchant" etc. get buried by the general response.

    That is an acute comment. We have in fact observed (and this is discussed briefly in our report) that some of what is automatically classified negative because it involves death-related issues would be taken by a human reader to represent positive emotion. My own feeling is that the whole idea of classifying simply as positive or negative is too coarse-grained for many purposes, though it still could have its uses.

    Also, thanks David for one of the most entertaining comments sections I've seen in a blog post for a long time!

    Geoff Pullum's comment threads are infinitely more entertaining, but you need to use your imagination. -dib

  9. Chad Nilep said,

    October 31, 2011 @ 7:23 pm

    I assume that the tweets had to be machine translated into English because LIWC, the database used to categorize words, uses English. This was, however, not entirely clear from my very quick scan of the preliminary report.

    That's basically right. My UT colleagues recently released Arabic LIWC, but it doesn't yet cover the categories we were interested in for this study. -dib

  10. Janice Byer said,

    November 6, 2011 @ 5:14 pm

    The CIA is tasked with, as you probably know, among other duties, monitoring the political temper of foreign populations for the purpose of compiling intelligence for federal agencies tasked with keeping our foreign policy on target (not a pun!)

    Bet dollars to doughnuts they'll be stealing Professor Beaver's idea.
