Computational linguistics and literary scholarship


Email from Dan Garrette:

I am a Computer Science PhD student at UT-Austin working with Jason Baldridge, but I've recently been collaborating with my colleague Hannah Alpert-Abrams in the Comparative Literature department here at UT.  We've been talking about the intersection of NLP and literary study and we are interested in looking at ways in which researchers can collaborate to do work that is valid scholarship in both fields.

There has been a flurry of writing recently about the relationship between the sciences and the humanities (see: Ted Underwood, Steven Pinker, Ross Douthat's response to Steven Pinker, etc), and a particularly interesting paper at ACL (David Bamman, Brendan O’Connor, & Noah A. Smith, "Learning Latent Personas of Film Characters") that attempts to use modern NLP techniques to answer questions in literary theory.  Unfortunately, much of this discussion has failed to actually understand or recognize the scholarship that is really happening in the humanities, and, instead, seems to assume that people in the sciences are able to simply walk in and provide answers for another field.

We would like to see truly interdisciplinary work that combines contemporary ideas from both fields, and we see the ACL paper as the perfect point of entry for a public conversation about this kind of work. Because Language Log attracts readers from many different disciplines, and because computational linguistics has played an important part in the developing field of 'digital humanities,' we thought it might be a good forum for this conversation.

We have written a short response to the ACL paper which we think might make an interesting Language Log post, and Jason suggested I send it to you to see if you were interested.  We'd be very interested to hear your thoughts and the thoughts of the greater Language Log readership. Perhaps it could even spark a conversation.

Their response:   Hannah Alpert-Abrams with Dan Garrette, "Some thoughts on the relationship between computational linguistics and literary scholarship". Click the link to see the pdf version, or read my (perhaps faulty) html rendering below.

Some thoughts on the relationship between computational linguistics and literary scholarship

Hannah Alpert-Abrams with Dan Garrette

At the Association for Computational Linguistics conference in Bulgaria last month, researchers from CMU presented a model for cinematic archetypes: “Learning Latent Personas of Film Characters.” The model uses the descriptive language of Wikipedia entries along with personal data of actors in films to automatically induce a set of character personas: the traitor, the flirt.

As a literary scholar who studies both novels and films, I read the paper with interest, curious to see how NLP researchers could advance my field. The paper opens with a gloss of Aristotle: the debate over the supremacy of plot versus characters. It goes on to situate itself within what literary scholars call archetypal theory, referencing the anthropologist Joseph Campbell and the psychotherapist Carl Jung. I was pleased to see this nod towards several millennia of thought on the nature of literature.

I was disappointed, however, to notice that the literature review failed to mention any of the work done in the latter half of the twentieth century. This is a problem for the utility of the model from the perspective of literary scholarship. The paper is clearly exemplary NLP work, as evidenced by the fact that it was published at the most prestigious conference in the field. But to be truly interdisciplinary, the project must be cutting edge in the field of literary scholarship too.

A brief history of archetypal theory: I don’t work in archetypes. That’s because no one works in archetypes anymore. Jung and Campbell developed theories that attempted to find narratives and personas that were shared by all cultures. Although their work remains historically significant, the general consensus today is that Jung’s collective unconscious and Campbell’s monomyth are overly generalized concepts. The pursuit of a universal theory that could encompass the entirety of human experience tended to force all cultures into a Western European worldview. This has since been replaced by the recognition that human experience is culturally specific.

Northrop Frye, the scholar who most famously applied archetypal theories to literature, is sadly absent from this paper. But his work can help us understand why even a smaller, more culturally specific form of archetypal theory — what Bamman et al. refer to colloquially as stereotypes — has similarly fallen out of favor in literary study. Frye’s project was modeling the meta-structures of Western literature to define basic literary tropes. Bamman et al.’s model is able to automatically induce the kinds of categories that interested Frye. The direct contribution of their model to archetypal theory is that it allows us to test our preconceived notions of stereotypical personas. Did we get it right? Did Northrop Frye?

But Frye’s work, like that of Jung and Campbell, is only rarely cited in literary study today. This is because he was interested in building textual models, but did not consider the social and historical context in which these texts were produced. More interesting to contemporary scholars than the archetypes or stereotypes themselves is the way that they reflect systems of power and oppression.

When we look at Wikipedia entries about film, for example, we would not expect to find universal, latent character personas. This is because Wikipedia entries are not simply transcripts of films: they are written by a community that talks about film in a specific way. The authors are typically male, young, white, and educated; their descriptive language is informed by their cultural context. In fact, to generalize from the language of this community is to make the same mistake as Campbell and Jung by treating the worldview of an empowered elite as representative of the world at large.

To build a model based on Wikipedia entries, then, is to build a model that reveals not how films work, but how a specific subcategory of the population talks about film.

By focusing on cinematic archetypes, Bamman et al.’s research misses the really exciting potential of their data. Studying Wikipedia entries gives us access into the ways that people talk about film, exploring both general patterns of discourse and points of unexpected divergence. We might wonder, for example, whether there are cases where a statistical model misidentifies a character, and why that misidentification occurred. If, for example, the model frequently mislabels female villains as flirts, it might suggest something about the way that Wikipedia writers think about women. More generally, we might wonder whether the race or gender of an actor skews the way that the character is represented on Wikipedia: a female hero might be described differently from a male hero, or a black hero from a white hero.
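The kind of skew described above is straightforward to probe once a model's labels are in hand. Here is a minimal sketch in Python of comparing label rates across actor gender; the persona labels, data format, and numbers are entirely hypothetical, for illustration only, and are not taken from the paper:

```python
from collections import Counter

def label_rates(records):
    """Per-gender proportion of each persona label, computed from
    (gender, label) pairs -- a first pass at spotting labeling skew."""
    by_gender = Counter(g for g, _ in records)
    by_pair = Counter(records)
    return {
        (g, lab): count / by_gender[g]
        for (g, lab), count in by_pair.items()
    }

# Hypothetical toy data: (actor_gender, model_assigned_persona)
records = [
    ("F", "flirt"), ("F", "flirt"), ("F", "villain"), ("F", "hero"),
    ("M", "villain"), ("M", "hero"), ("M", "hero"), ("M", "flirt"),
]
rates = label_rates(records)
print(rates[("F", "flirt")])  # 0.5
print(rates[("M", "flirt")])  # 0.25
```

A real analysis would of course follow such raw rates with a significance test (e.g. a chi-square test on the full persona-by-gender contingency table) before drawing conclusions about how Wikipedia writers describe women.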

These are interesting questions, and as a literary scholar I find it exciting that there are technologies we can use to approach them. Sitting down to read 42,000 Wikipedia entries is prohibitively inefficient, and computational linguists are absolutely correct to think that their tools will be useful for the study of literature.

But literary scholarship can also inform computational linguistics, if it is fully incorporated into project development and analysis. The lack of regard for the work being done in the humanities was seen this summer in several high-profile articles on the relationship between science and the humanities, including Steven Pinker’s “Science Is Not Your Enemy” and Lee Siegel’s “Who Ruined the Humanities?” Siegel reduces literary study to the reading of great novels, while Pinker proposes a one-sided model through which science is the salvation of the humanities.

If computational linguistics is going to ‘save’ the humanities, it will do so by engaging with contemporary work already being done in the field. Had Bamman et al.’s paper been more thoroughly informed by literary theory, the authors would have started with more appropriate questions about their data and provided results that fit better into academic discourse about film. This kind of research could open new realms of possibility in both literary studies and computational linguistics.

The great thing is, to address these problems computational linguists don’t have to tackle the Norton Anthologies on their own. There are already people who have built their careers on this kind of research. Just take one of us out to coffee.


Above is a guest post by Hannah Alpert-Abrams with Dan Garrette

Let me (myl) be the first to make the obvious comment that this is very much like the debate about "universal grammar" vs. "analyze each language in its own terms, without eurocentric (or other) preconceptions". It appears that the pendulums of linguistics and literary studies have been somewhat out of phase on this point over the past few decades. I'll also add the personal opinion that I don't find either field's version of the debate very interesting, in itself — neither extreme position is likely to be adequate, and partisans of both positions are capable of providing valuable insights as well as astonishing foolishness. But I do agree that those who want to cross disciplinary boundaries need to be aware of the presuppositions of the audience on the other side of the line.



30 Comments

  1. Ted Underwood said,

    September 12, 2013 @ 2:57 pm

    As a literary historian who loves the work coming out of Carnegie Mellon's School of Computer Science lately, I have mixed feelings here. Mainly, I'm glad to see this conversation taking place. It's great for humanists to push back a bit and say "we would use this tool differently." That's useful. Also, frankly, this is an impressive conversation to see happening between groups of people who are I think mostly graduate students. It gives me a good feeling about the future.

    On the other hand, I feel there may be a slight misunderstanding here about the nature of interdisciplinary exchange. Say literary scholars in Philadelphia would like to go to Seattle. An, uh, aviation scientist builds us a plane, flies it to Palm Springs to prove it can make the trip, then flies it back to our doorstep and says "Hey, you can use this to go to Palm Springs." And we reply "Palm Springs is so 1965. I want to go to Seattle!"

    Well, perhaps technically he did misunderstand us. And yes, that could have been avoided if he had asked us out for coffee first. But instead of feeling too bad about it, I think I'm going to say a hasty "thanks!", get in that plane, and see if I can pilot it to Seattle. He can come if he likes. When we get there, we can have coffee.

  2. D-AW said,

    September 12, 2013 @ 4:09 pm

The question is, I think, a basic one about methodology and its epistemological underpinnings, not a practical one about how to properly use or adapt a "tool." And it applies somewhat differently, I think, to different fields of literary study. Literary history, literary theory, and literary criticism, to name three, each have their own different methodology and epistemology (and in some cases, several competing or diverging ones), which may be more or less suited to the various DH contraptions currently being built for them. I think the authors here are calling for a little more attention to what it is humanists think they're doing when they're doing their humanities stuff, and perhaps a little less focus on all the neat text processing that can be done, on the principle that not all possible things are desirable.

  3. Ray Girvan said,

    September 12, 2013 @ 5:30 pm

    It may be worth commenting that NLP here means Natural Language Processing, not Neuro-Linguistic Programming. Recalling the Agatha Christie Code pseudo-research a while back, I was about to leave the article "laying in the same position" until I realised this.

[(myl) You're certainly right about the possible Natural-Language-Processing-vs.-Neuro-Linguistic-Programming confusion, and you're also right that this is about NLP in the Natural Language Processing sense.

    But I'm pretty sure that the Agatha Christie Code episode, though definitely louche, was also free of associations with Neuro-Linguistic Programming. (See here and here for discussion.) It's true that the BBC story originally led with the assertion that "A neurolinguistic study of more than 80 of her novels concluded that her phrases triggered a pleasure response", but I believe that this was just confusion on the part of the characteristically clueless BBC news department — in any case, it's since been corrected to read simply "A linguistic study of more than 80 of her novels …"]

  4. David Golumbia said,

    September 12, 2013 @ 6:05 pm

I frankly do not think the critique is strong enough. First, it's a study of films, not literature, and so the apposite discipline is film studies and not literary studies. Second, it's not as if the authors take an outdated scholarly model of narrative (if we use that general term to encompass both film and literature): rather, they create their own completely ad-hoc model based on their own intuition and then ground it (very loosely) in figures irrelevant to the scholarly conversation. I can't agree at all with Ted that this is a useful tool not applied quite right, because the problem the tool is fashioned to address is not a problem found in the scholarly literature–at all. Sitting (mostly) on the literary side of things, I am fairly impatient with literary scholarship that takes on bits of linguistic work without being responsible toward the underlying disciplinary procedures; I am no less impatient with the relationship going the other way. Reading this paper you'd have no idea there was a field called film studies, and it does not address anything close to a live research question in the study of film. Failing to understand (or to admit) that other fields exist is bad research form, no matter which direction it comes from.

  5. David Golumbia said,

    September 12, 2013 @ 6:24 pm

I raise the film point because Northrop Frye is not typically read in film studies, even if one grants the tenuous connection between Frye and archetype theory. No doubt there are connections between narrative in film and literature, but this too is a point of research inquiry that has no settled answer. My point is that referencing Frye makes it seem as if the paper *might* be grounded in a research perspective it does not name; even that concession is just not correct when seen from the relevant discipline of film studies.

  6. Ray Girvan said,

    September 12, 2013 @ 6:44 pm

    @myl: was also free of associations with Neuro-Linguistic Programming

    The actual computational text aspects were, but the documentary did call on Neuro-Linguistic Programming exponents to frame and explain the results.

    [(myl) Wow. Is this marvel preserved on the internet somewhere?]

  7. elessorn said,

    September 12, 2013 @ 9:31 pm

    Let me (myl) be the first to make the obvious comment that this is very much like the debate about "universal grammar" vs. "analyze each language in its own terms, without eurocentric (or other) preconceptions".

It may indeed be structurally similar, especially since the concerned audiences are contemporary, colocated, and partially overlap. But isn't there a fundamental difference involved here? The range of practical constraints limiting the diversity of possible human language mechanics is surely a far more severe (and more quantifiable) determinant than anything involved in the production and development of character archetypes. At least the degree of cross-resemblance among human languages, as opposed to that among human cultures, seems to dovetail with an intuition that the latter enjoy quite a lot more freedom for individual differences.

A quibble? Perhaps, but then category errors are at the heart of the dispute, and color even the careful take of the article above. The point about the problem of ignoring prior scholarship is, I think, spot on, but I'm not sure it will convince anyone whose hand already reaches for the computational hammer to solve any problem in the first place. The real epistemological bottleneck here seems to me a much more fundamental type of scientism that the article (unconsciously?) shares–the idea that terms anchor dependably to things, that, just as mercury is mercury in Kansas or on Mars, "villain" or "flirt" are likewise sort of "out there," synchronically between cultures but also diachronically within a culture across time. Only such a (demonstrably shaky) category fusion can explain such methodological confidence, I think. It boils down to a basic lack of healthy respect for the complexity of human systems–not ignorance thereof, which human researchers can hardly plead with any seriousness.

  8. Garrett Wollman said,

    September 12, 2013 @ 10:55 pm

    (Disclaimer: I work with a bunch of NLP people. I have no idea what their politics is, but given that I work for a major US research university one can probably guess.)

I wonder if the disconnect between the humanists and the NLP people shown above is anything more than the fact that the NLP people, trained as scientists and engineers, are socialized to believe that there is some manner of discoverable, objective fact "out there" in the data, and thus aren't particularly interested in whether their research supports the political agenda ascribed above to modern humanities researchers. The lack of interest among humanists in the proposition the scientists wish to consider may have no bearing on its interest as a falsifiable hypothesis about the world we live in.

  9. Ted Underwood said,

    September 12, 2013 @ 11:16 pm

    I confess the criticisms here mystify me a little. It's true that the ACL paper describes character types as if they were universals rather than historical phenomena. And it's true that using Wikipedia poses a problem.

But what I see there is a huge opportunity. The moment I get a couple of months free, I want to use this technique to explore the historical variation of character patterns in fiction. In short, it's a huge research opportunity that Bamman et al. are handing us.

    I've seen bad quantitative work on literary history, and when I see it I call it out. But this is work with real potential. Complaining that computer scientists haven't used this technique the way we would use it is … I don't know … like complaining that the Wright Brothers didn't invent FedEx.

  10. Bill Benzon said,

    September 13, 2013 @ 5:15 am

    Um, err, a couple of quick remarks.

    1. FWIW, here's how David Hays and I saw "Computational Linguistics and the Humanist" back in the ancient days of 1976. For those who don't know, Hays was one of the founders of computational linguistics. Back then the "systems of power and oppression" approach to literary studies was still new and exciting. It's now a bit old and tired.

2. Northrop Frye is best-known for his Anatomy of Criticism which, despite its title, was really an anatomy of literature. It was published in 1957, the same year as Syntactic Structures. Here's a quick and crude chronology that runs cognitive science in parallel with literary theory from the 1950s up through the 1990s or so.

    3. Narratology would be a better reference point for NLP than current literary criticism. Try, for example, David Herman, Story Logic: Problems and Possibilities of Narrative (2002). Patrick Colm Hogan, The Mind and Its Stories (2003) is worth a quick spin, though its methodology is, shall we say, rather loose for the claims it makes.

    4. As for the Wikipedia, let me just repeat a remark I made at Ted Underwood's joint:

    You’ve already raised one issue about working from Wikipedia plot summaries. And, having spent some time reading at Wikipedia plot summaries (mostly film and TV) for various purposes, I’d say that there are problems of completeness and accuracy, which are, in any case, fuzzy notions. There may also be problems of consistency, different editors using different terms for the same thing, though some of that is likely to be, or could be, compensated-for by suitable techniques.

    The thing is, those plot summaries were not, for the most part, written by trained professionals. They were written by ‘civilians’ interested in movies. And the same goes for the entries at TV Tropes.

    Now, I could go on to complain about this and insist that they work only from materials prepared by trained professionals. But that’s not where I’m going. For one thing, it will be a cold day in Hell before academic film critics do that sort of thing.

    There is such a thing as citizen science, in which non-professionals with particular interests engage in collaborations with professionals. Large chunks of biology have been built on the work of amateur naturalists, who continue to do important (largely observational) work. Bird watchers are the most obvious example. I’d think that amateurs are doing important work in tracking the problems with bees, and, for that matter, bats. Amateur astronomers are also important.

    All those Wikipedia plot summaries and all the stuff at TV Tropes, that’s all, in effect, citizen cultural criticism. How can we make use of all that free and interested intellectual labor? It seems to me that THAT question gets pretty near the core of what’s going on, or what could be going on, in college and university education and online courses and digital humanities.

    At least some of the students who come through undergraduate courses in literature and film are the sorts of people who make Wikipedia plot summaries and who contribute to TV Tropes and they’ll be doing that even when they’ve graduated. Is there anything we can teach them that will help them to do a better job? For that matter, do we know anything ourselves about doing that kind of work well? How many of us have done plot summaries?

  11. Bill Benzon said,

    September 13, 2013 @ 5:19 am

    Oh, and here's my take on the current Pinker discussion.

  12. Julian Brooke said,

    September 13, 2013 @ 12:03 pm

    Just a heads up to those who are interested, there is actually a workshop at the North American version of the ACL conference (NAACL) on computational linguistics and literature, which has happened for the last two years:

    https://sites.google.com/site/clfl2013/

    I'm a computational linguist and we (me, my supervisor, and a colleague in the English department at our university) had a paper at the most recent iteration talking about exactly this problem (the gap between computational linguistics and the humanities). We've been doing some recent joint projects which we've been presenting at both computational linguistics and literary theory/digital humanities venues:

    http://www.hedothepolice.org/
    http://www.brownstocking.org

Sorry for the blatant self promotion, but it's important to note that the recent ACL paper, though interesting, is not exactly the first intersection of these two fields; it's been in the works for a while.

  13. Peter Buchanan said,

    September 13, 2013 @ 12:51 pm

    I'm not quite sure why David Golumbia says that Northrop Frye's connection to archetypal theory is tenuous. Frye uses the term frequently and often in discussing his theory that "certain themes, situations, and character types, in comedy let us say, have persisted with very little change from Aristophanes to our own time." However, Frye was explicitly not applying Jung's specialized use of the term archetype, nor did he subscribe to the idea of the collective unconscious. His idea that these themes, situations, and character types are the building blocks of narrative is certainly the assumption that the authors of the computational paper make, even if they do so without an awareness of Frye's work.

As for the statement that the relevant discipline is film studies and not English, I would refer him to the opening page of Bamman et al., "We present a complementary perspective that addresses the importance of character in defining a story. Our testbed is film." Given that they are using Wikipedia plot summaries as the raw data, I don't really see why the exact same study could not be replicated with reference to novels or plays, and the authors explicitly state this at the end of the article. The only difference would be that they had different tables at the end. They are interested in story, regardless of medium, and I suspect they chose film because Wikipedia editors have written more plot summaries for films than novels or plays. The solution here seems not to be to reject literature people and claim this issue for media studies, but to recognize that lit, film, drama, and media studies people, all of whom have discrete and overlapping interests, have a stake in this discussion.

    As for Alpert-Abrams and Garrette's contribution, I have a few thoughts.

    1) Given that it is a piece arguing for more interdisciplinary work, I found myself wishing that there had been more from Garrette. It read like something written wholly by Alpert-Abrams after she took Garrette out for coffee, down to the singular pronoun she uses in identifying herself as a literary scholar. Why not a "we" on a jointly written piece?

2) There needed to be more engagement with Bamman et al.'s results. There's some fascinating stuff there that would fit right in with the kind of critique Alpert-Abrams says is missing. For example, COMMAND, DEFEAT, CAPTURE gives the characters Zoe Neville (I Am Legend), Ursula (The Little Mermaid), and Joker (Batman), along with the feature male. Or look at FLIRT, FLIRT, TESTIFY, where two out of the three characters are men (Mark Darcy and Jerry Maguire), with the term Female. Same thing with REPLY, TALK, FLIRT and Graham from The Holiday. The travesty of the article is that Bamman et al. spend pages explaining their methodology, and then don't interpret the results at all. What do these discrepancies say about how Wikipedia editors gender their language or about how certain film genres gender character types? The possibility of this analysis is only hinted at in the last sentence of the article.

    3) I think Alpert-Abrams is right that the piece would benefit from a more serious engagement with current issues in the humanities, but like others I wish she had identified the field of narratology as something that is both very hot right now and also closely aligned with the article's interests. Honestly, I'd be curious to know what the status of TV Tropes is in the narratology community. I feel like it's also important to recognize the disconnect between some elements of what's current in literary studies against what regular readers are interested in, and recognize that a focus on the latter does not necessarily have to disadvantage the former. Normal readers love all sorts of things like biography, archetypal/tropaic criticism, and discussions of the genius of particular authors, in spite of the fact that all of these are often viewed as out of date by many literary scholars.

4) I think Alpert-Abrams's ending note that these problems could be resolved by taking a humanities scholar for coffee is a bit unfortunate given that the acknowledgments thank a student at the CMU School for Drama.

  14. Response on our movie personas paper | AI and Social Science – Brendan O'Connor said,

    September 13, 2013 @ 4:58 pm

    […] interesting critique of my, David Bamman's and Noah Smith's ACL paper on movie personas has appeared on the Language Log, a guest post by Hannah Alpert-Abrams and Dan Garrette. I posted the following as a comment on LL. Thanks everyone for the interesting comments. […]

  15. Brendan O'Connor said,

    September 13, 2013 @ 5:01 pm

    Hi — (I cross-posted the following response on my blog too),

    Thanks everyone for the interesting comments. Scholarship is an ongoing conversation, and we hope our work might contribute to it. Responding to the concerns about our paper,

    We did not try to make a contribution to contemporary literary theory. Rather, we focus on developing a computational linguistic research method of analyzing characters in stories. We hope there is a place for both the development of new research methods, as well as actual new substantive findings. If you think about the tremendous possibilities for computer science and humanities collaboration, there is far too much to do and we have to tackle pieces of the puzzle to move forward. Clearly, our work falls more into the first category — it was published at a computational linguistics conference, and we did a lot of work focusing on linguistic, statistical, and computational issues like:

    * how to derive useful semantic relations from current syntactic parsing and coreference technologies,
    * how to design an appropriate probabilistic model on top of this,
    * how to design a Bayesian inference algorithm for the model,

    and of course, all the amazing work that David did in assembling a large and novel dataset — which we have released freely for anyone else to conduct research on, as noted in the paper. All the comments above show there are a wealth of interesting questions to further investigate. Please do!
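(For readers curious what those bullet points amount to in practice, here is a deliberately toy sketch of the general idea: treat each character as a bag of verbs extracted from plot summaries, then cluster characters into K latent "personas" with a collapsed Gibbs sampler over a simple mixture of multinomials. This is a pedagogical simplification written for this post, not the Dirichlet persona model from the paper, and all of the data below is invented.)

```python
import random
from collections import defaultdict

def gibbs_personas(char_verbs, K, iters=200, alpha=1.0, beta=0.5, seed=0):
    """Cluster characters into K latent 'personas' from the bags of verbs
    attached to them: collapsed Gibbs sampling for a toy mixture of
    multinomials (a simplification of the latent-persona idea)."""
    rng = random.Random(seed)
    V = len({v for verbs in char_verbs for v in verbs})  # verb vocabulary size
    z = [rng.randrange(K) for _ in char_verbs]           # persona of each character
    n_k = [0] * K                                        # characters per persona
    n_kv = [defaultdict(int) for _ in range(K)]          # verb counts per persona
    n_kw = [0] * K                                       # verb tokens per persona
    for i, verbs in enumerate(char_verbs):
        n_k[z[i]] += 1
        n_kw[z[i]] += len(verbs)
        for v in verbs:
            n_kv[z[i]][v] += 1
    for _ in range(iters):
        for i, verbs in enumerate(char_verbs):
            k = z[i]                      # remove character i's counts
            n_k[k] -= 1
            n_kw[k] -= len(verbs)
            for v in verbs:
                n_kv[k][v] -= 1
            weights = []
            for j in range(K):            # posterior predictive weight of persona j
                w = n_k[j] + alpha
                seen = defaultdict(int)
                for t, v in enumerate(verbs):
                    w *= (n_kv[j][v] + seen[v] + beta) / (n_kw[j] + t + beta * V)
                    seen[v] += 1
                weights.append(w)
            r = rng.random() * sum(weights)
            for j, w in enumerate(weights):
                r -= w
                if r <= 0:
                    break
            z[i] = j                      # reassign and restore counts
            n_k[j] += 1
            n_kw[j] += len(verbs)
            for v in verbs:
                n_kv[j][v] += 1
    return z

# Invented toy data: each character as a bag of verbs from plot summaries
chars = [
    ["fight", "defeat", "capture", "fight"],
    ["defeat", "fight", "command"],
    ["flirt", "kiss", "charm", "flirt"],
    ["charm", "flirt", "kiss"],
]
print(gibbs_personas(chars, K=2))
```

The real model is considerably richer (it ties personas to syntactic roles, coreference chains, and actor metadata), but the sketch shows the shape of the inference problem the bullet points describe.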

    We find that, in these multidisciplinary projects, it's most useful to publish part of the work early and get scholarly feedback, instead of waiting for years before trying to write a "perfect" paper. Our colleagues Noah Smith, Tae Yano, and John Wilkerson did this in their research on Congressional voting; Brendan did this with Noah and Brandon Stewart on international relations events analysis; there’s great forthcoming work from Yanchuan Sim, Noah, Brice Acree and Justin Gross on analyzing political candidates’ ideologies; and at the Digital Humanities conference earlier this year, David presented his joint work with the Assyriologist Adam Anderson on analyzing social networks induced from Old Assyrian cuneiform texts. (And David’s co-teaching a cool digital humanities seminar with Christopher Warren in the English department this semester — I’m sure there will be great cross-fertilization of ideas coming out of there!)

    For example, we've had useful feedback here already — besides comments from the computational linguistics community through the ACL paper, just in the discussion on LL there have been many interesting theories and references presented. We've also been in conversation with other humanists — as we stated in our acknowledgments (noted by one commenter) — though apparently not the same humanists that Alpert-Abrams and Garrett would rather we had talked to. This is why it's better to publish early and participate in the scholarly conversation.

    For what it's worth, some of these high-level debates on whether it's appropriate to focus on progress in quantitative methods, versus directly on substantive findings, have been playing out for decades in the social sciences. (I'm thinking specifically about economics and political science, both of which are far more quantitative today than they were just 50 years ago.) And as several commenters have noted, and as we tried to in our references, there’s certainly been plenty of computational work in literary/cultural analysis before. But I do think the quantitative approach still tends to be seen as novel in the humanities, and as the original response notes, there have been some problematic proclamations in this area recently. I just hope there’s room to try to advance things without being everyone’s punching bag for whether or not they liked the latest Steven Pinker essay.

  16. Bill Benzon said,

    September 14, 2013 @ 5:16 am

    That seems about right to me, Brendan. Interdisciplinary work has to be, well, you know, interdisciplinary. Division of intellectual labor. Each one takes a chunk of the elephant and goes to work on it, doing what he or she does best. The trick, of course, is to reassemble the elephant at the end.

    Some time ago, in connection with curriculum design, I took a look at the human sciences and concluded that there were roughly three broad conceptual styles: 1) qualitative: interpretive/hermeneutic and narrative, 2) behavioral or social scientific, with an emphasis on statistically controlled observations, and 3) structural/constructive: linguistics, cognitive science, where the idea is to construct a grammar or machine that generates the observed behavior. The humanities concentrate on the first, and almost all interdisciplinary work in the humanities is confined to qualitative disciplines. Thus literature and psychology is mostly qualitative psychology. Classically, I mean, that's Freud or Jung. More recently there's been interest in cognitive science, but almost entirely in the qualitative side of cognitive metaphor, conceptual blending, other minds, and such. All of the interdisciplinary work associated with Theory, so-called, is qualitative.

    But NLP is in the other two camps. The data gathering, preparation, and analysis is statistical; but there is, I believe, an underlying motivation in the structural/constructive camp. Working across the boundary between the qualitative methods of traditional humanities and the more mechanistic methods in the other two camps is much harder. That will require cooperation between researchers, some of whom have internalized qualitative methods, while others have internalized mechanistic methods. Getting that discourse up and running, that's where the real excitement and deep intellectual potential lies.

    Also, I would urge you to read that 1976 review that Hays and I wrote, mostly just to see how very different things were back then. The really hot things in computational linguistics (as it was called) were the ARPA (now DARPA) Speech Understanding Project and the plans/scripts/story work coming out of Roger Schank's lab at Yale (and similar work elsewhere). The statistical techniques that dominate NLP these days didn't exist back then, nor did the computational horsepower and data sets on which they depend. But those hot research areas? They bottomed out within a decade. In the mid-1980s there was talk of AI winter (which I mention in that chronology).

    The way I see it, the big CL task for the next decade or two is to integrate the statistical methods of NLP with the older symbolic computation methods of the 60s and 70s.

  17. Lauren said,

    September 14, 2013 @ 7:25 am

    I found this very interesting as someone who has some interest and experience in the intersection of the hard sciences with the humanities. Certainly the central thesis that one needs to truly understand both in order to make the most useful connections, rather than pass ideas from one into the other, is true. I think in part this problem comes from the societal systems that are currently in place, however, which very much tend towards specialization. Even a computer science major in college with a minor in film (or linguistics, for that matter) is unlikely to be versed enough in both disciplines to engage with the current work being done in the two fields. It's difficult enough on its own even under optimal circumstances, but I feel that the way we tend both to educate and to assign jobs does not tend to produce people who are capable of doing what the authors of this piece intend.

    I was also reminded by the appeal to the antiquity of Campbell and Jung that these ideas haven't really fallen out of favour so much as they've transformed. You can see a significant example of this in the form of TV Tropes, which immediately comes to mind. Of course, that community has its own issues and many of the same biases cited with respect to Wikipedia, but it's producing discourse today on a massive scale similar in direction to the Campbell/Jung archetype theory that the authors here feel is outdated. It would, of course, be easy to see this merely as an example of ideas filtering downward out of academic discourse into popular culture and its offshoots over time, but I think this would be misleading. Certainly, the idea that academia is somehow more on the "cutting edge" than TV Tropes betrays its own kinds of biases.

  18. Chris said,

    September 14, 2013 @ 9:29 am

    I am sympathetic to both sides (having lived in both worlds), but this really strikes me as a clash between publishing cultures. Engineering has a tradition of publishing proof-of-concept papers; the humanities do not. O'Connor addresses this above, but offers no solution except saying essentially that "our tradition is better" (a biased paraphrase of "it's most useful to publish part of the work early and get scholarly feedback, instead of waiting for years before trying to write a 'perfect' paper").

    As I tried to explain in my own response, showing such blatant disregard for the differing goals and culture of the very humanities scholars they're trying to develop a method for will not win them many friends in English and comparative literature departments.

    Ted Underwood (above) is an exception. He is already sympathetic to NLP and digital humanities. He needs no convincing.

  19. Ted Underwood said,

    September 14, 2013 @ 10:35 am

    I want to nod to what Lauren said above about a version of archetypal criticism persisting in TV Tropes; I think she's quite right. When we humanists say confidently "we no longer do it that way," you have to take our confidence with a grain of salt. In the humanities, old approaches are rarely totally discarded.

    In response to Chris, I would just say that I'm not sure we can ever expect papers in one discipline to immediately persuade many scholars in another. E.g. David Blei's work on topic modeling was not immediately greeted by acclaim in the humanities. It's rare to get any response at all, frankly, because disciplines rarely pay attention to each other, let alone understand each other.

    You've got a case here where a paper in computer science is getting attention from humanists. Some (like me) are enthusiastic about the potential of the underlying method, and intend to explore it. Some think it's a basic epistemological mistake. Some (like Alpert-Abrams) are intrigued by the method, but stress that it should be used differently. If a paper of mine stirred up that kind of controversy in another discipline, I'd be thrilled.

  20. Dan Garrette said,

    September 14, 2013 @ 1:57 pm

    I'm really excited to see the thoughtful responses to our piece. This is exactly the kind of discussion we were hoping for.

    Peter Buchanan pointed out that the piece has two names on it, but only one voice. This was intentional because, in a conversation that seems to be dominated by scientists, we felt that the response should feature the perspective of the literary scholar. But he asked about my thoughts, and so I'd like to share a few things.

    When I first mentioned the Bamman et al. article to Hannah, we spent a lot of time talking about how our respective fields view scholarship. In particular, it is not unusual for NLP researchers to work incrementally and iteratively: first we build something cool, then we publish and get critical feedback, which we use to improve our models, and eventually we try to figure out how to adapt them to address larger problems. This is something that David Bamman brought up when we discussed our ideas directly with him and it's something that I, as someone who works in NLP, am sympathetic to. Sometimes we just want to explore new ideas and push research boundaries, and we don't always know where our methods will take us.

    But our concern was never with the research methods employed. Instead, what caught our attention was what we viewed as an attempt to place the work in the realm of literature. Both David Bamman and Brendan O'Connor have stated that their work was not intended as a contribution to literary study. My question, then, is: Why not? Given that they are interested in designing models to analyze literature, why not seek out the questions of contemporary literary (or film) study, and attempt to address those? I hope that this discussion doesn’t come off as me saying "you should have run experiment X instead of Y". Our critique is really that when we begin a research project, we have to decide which questions we would like to address. When we’re working in the domain of another discipline, picking a question from an area of active scholarship gives us the chance to enter into a mutually beneficial collaboration.

    This isn't to say that computer scientists can't ask their own questions, and when it comes to designing models, we are the ones who best understand the capabilities of our research. I also appreciate Ted Underwood's argument that humanists should be excited about the development of new methods that they can run with. But, I would like to emphasize that whether it's with literary scholars, linguists, or anyone else, it is less effective to do research based on our own intuitions, to hand it over, and expect them to find a use. We have to recognize that these fields existed long before we arrived, and that our collaborators know their fields better than we ever will — a discussion that Ben Zimmer commented on with regard to physicists attempting linguistic research. We are not here to lay the groundwork for other fields. They have already laid the groundwork, and now is the time for us to work together to build upward.

  21. Peli Grietzer said,

    September 14, 2013 @ 2:17 pm

    The response's specific analysis of the ACL paper's methodology is spot-on, but it's painfully ironic that the authors cite trends in the last 20 years of literary study in Anglo-American universities as proof of what literary scholars have 'discovered' and of which questions were transcended and replaced with more important ones. If the authors ever had coffee with literary scholars outside English and Romance Languages departments in the United States and the UK, they would be aware that many, many literary scholars all over the world are extremely interested in building textual models. And, regardless of your interests, you can only 'move on' from being interested in building textual models to being interested only in how they reflect power structures if you've done such a tremendously reliable job of building a method for the production of textual models that it's become perfunctory.

    Also, it's really unfortunate that a paper warning about cultural blindness equates 'rarely cited today' with 'rarely cited in the authors' immediate academic social circle':

    http://scholar.google.com/scholar?as_ylo=2009&hl=en&as_sdt=5,33&sciodt=0,33&cites=11268294370177128744&scipsc=

  22. Hannah Alpert-Abrams said,

    September 14, 2013 @ 6:13 pm

    Thanks to everyone for a lively conversation. I couldn’t be more pleased.

    In particular, thanks to those humanists who drew attention to “film studies” and “narratology” as potential sites of research. I also respectfully concede Peli Grietzer’s point that lots of literary scholars love and use Northrop Frye. I apologize for trying to speak for all scholars everywhere, and falling short.

    In response to Ted Underwood, let me just say that I respect your work immensely and am proud to be on the opposite side of a debate from you.

    Without attempting to defend myself, I would like to briefly address one point of criticism. Chris argues that computer scientists and humanists have been speaking at cross-purposes in this conversation, and points out that computer scientists have a different model of publication. Brendan O'Connor makes this same point. They are correct, and I am actively working to become more fluent in the discourse of computational linguistics.

    When I imagine what interdisciplinary work looks like, however, I envision something that can be read and engaged with across disciplines. Humanists who utilize machine learning have already incorporated certain aspects of the discourse into their writing, by providing detailed accounts of their methods and results. It is not inappropriate to expect that contact with the humanities would have a similar contaminating effect on computer science. In fact, that kind of hybridization might even be desirable.

    The humanist’s discomfort (I should say, my discomfort) with the separation of data and model is particularly fruitful here, I think. Our data shape our models in profound ways. Whether we use survivor testimonies to do part-of-speech tagging, bibles to do translation studies, or Wikipedia to build character models, the nature of the corpus has a determining effect on the kinds of models that are produced and the conclusions we can draw from those models. For this reason, it is useful and often imperative that we take our sources into account from the beginning.
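    [Editorial aside: the claim that the same modeling procedure yields different models from different corpora can be made concrete with a toy sketch. The corpora and function names below are invented for illustration; this is not anyone's actual method from the thread.]

    ```python
    # A minimal sketch of "our data shape our models": an identical
    # procedure -- here, a toy bigram model -- trained on two different
    # corpora encodes different "knowledge" about the same word.
    from collections import Counter, defaultdict

    def train_bigrams(text):
        """Count word bigrams in a lowercased, whitespace-tokenized corpus."""
        tokens = text.lower().split()
        model = defaultdict(Counter)
        for w1, w2 in zip(tokens, tokens[1:]):
            model[w1][w2] += 1
        return model

    def most_likely_next(model, word):
        """Return the most frequent continuation of `word`, or None if unseen."""
        if word not in model:
            return None
        return model[word].most_common(1)[0][0]

    # Two tiny, hypothetical corpora standing in for very different sources.
    scripture_like = "the lord said unto moses the lord is my shepherd"
    wiki_like = "the film stars a hero and the film features a villain"

    m1 = train_bigrams(scripture_like)
    m2 = train_bigrams(wiki_like)

    # Same code, divergent models: what follows "the" is entirely a
    # function of the training data.
    print(most_likely_next(m1, "the"))  # prints "lord"
    print(most_likely_next(m2, "the"))  # prints "film"
    ```

    Scaled up, the same dependence holds for part-of-speech taggers trained on testimonies, translation systems trained on bibles, or persona models trained on Wikipedia plot summaries.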

  23. A comment on "Computational linguistics and literary scholarship" | Will.Whim said,

    September 14, 2013 @ 7:29 pm

    […] is a controversy over "Computational linguistics and literary scholarship." A paper published at the Association for Computational Linguistics David Bamman, Brendan […]

  24. Brendan O'Connor said,

    September 15, 2013 @ 12:06 am

    Dan –

    Both David Bamman and Brendan O'Connor have stated that their work was not intended as a contribution to literary study. My question, then, is: Why not?

    Why is it not intended as a full substantive contribution?

    Because research is hard, and it's harder to do two things at once than one. I think it's better to either (1) focus on methods while being somewhat informed by substantive issues, or (2) focus on substantive questions while utilizing well-proven methods.

    When I was younger and more naive, I thought that great interdisciplinary scholarship should be cutting-edge on both sides at once — as you and Hannah Alpert-Abrams seem to demand. I now think that is extremely hard and unrealistic to expect in all cases. Maybe you two will accomplish this. That would be great! Best of luck.

  25. Rubrick said,

    September 15, 2013 @ 1:32 pm

    @Hannah Alpert-Abrams: "Our data shape our models in profound ways."

    I think this bald assertion epitomizes the difference between humanities scholarship and scientific research (at least, scientific research done right). When I read this, I immediately think: "Is this actually true? Can you prove this? How would one design a study exploring the relationship between data sets and the models that emerge from them? What direction does the causality actually run?" In other words, don't just tell me that this is true; show me that this is true.

    To be perfectly clear: I believe that this is true. But belief isn't enough. I think the impatience of many scientists with non-science scholars stems from the perception that they present a hypothesis, perhaps supported with some cherry-picked examples, and then move on, leaving out the really hard work of trying to show that the hypothesis actually holds.

  26. Ted said,

    September 15, 2013 @ 3:45 pm

    This has been interesting. They say "don't read the comments," but in this instance I'm glad I did.

    I think Hannah Alpert-Abrams is right to say that "our data shape our models in profound ways." That's often true, and it's a weak point in the metaphors I was using earlier that suggested humanists could just extract a method from a paper and "fly" it someplace new. I think that will be possible in this case, but it's not always possible — so Alpert-Abrams and Garrette have a valid point about interdisciplinarity.

    On the other hand, I absolutely understand why the authors of the ACL paper didn't start with a full-scale question in film history or literary history. As someone who's spent several years trying to extract historically representative corpora of fiction, poetry, etc., I have a painfully vivid sense of how much time it takes just to set up the data for that kind of question. And the NLP problems in novels are going to be much trickier than in plot summaries (… which, on the other hand, feeds back into Alpert-Abrams' point).

    So, I don't know! Interesting conversation. These are hard problems.

  27. Editors’ Choice: Computational Linguistics and Literary Scholarship : Digital Humanities Now said,

    September 17, 2013 @ 11:30 am

    […] Read full conversation here. Category: Editors' Choice […]

  28. David Bamman said,

    September 17, 2013 @ 4:56 pm

    (Just for the sake of archival continuity, my own response can be read in the separate guest post here: http://languagelog.ldc.upenn.edu/nll/?p=7094)

  29. What else does a data scientist need? « Bad Hessian said,

    September 26, 2013 @ 12:30 pm

    […] I began thinking along this front in part because of a discussion that's been happening over at Language Log on computational linguistics and literary scholarship. […]

  30. SDL said,

    October 8, 2013 @ 9:51 am

    Film school grad and NLP student here:

    Film and filmic narrative are distinctly Western phenomena. Cultures that have adopted film have been totally influenced by Western ways of filmic storytelling. I suggest that Ms. Alpert-Abrams start by watching, say, some Kurosawa, then re-watching the Star Wars trilogy. World film is Western film, and vice versa. The character types you'll find in one are the types you'll find in the other. The "don't enforce the white man's way of knowing" critique doesn't work terribly well when it comes to film.
