A test kitchen for stylistic recipes


This morning, from the airport in Brussels, I want to follow up on our discussion of discourse anaphora ("Why are some summatives labeled 'vague'?", 5/21/2008; "More theory trumping practice", 5/22/2008; "Poor pitiful which", 5/23/2008; "Clarity, choice, and evidence", 5/23/2008), in the spirit of Friday's post about "Prescriptivist science".

There's been quite a bit of research on the phenomena in question. One strand, in computational linguistics, goes back at least to Bonnie Webber's "Structure and ostension in the interpretation of discourse deixis", Language and Cognitive Processes, 6(2):107–13, 1992. A recent addition to this literature is Ron Artstein and Massimo Poesio, "Identifying reference to abstract objects in dialogue", brandial 2006. Their abstract begins:

In two experiments, many annotators marked antecedents for discourse deixis as unconstrained regions of text. The experiments show that annotators do converge on the identity of these text regions, though much of what they do can be captured by a simple model. Demonstrative pronouns are more likely than definite descriptions to be marked with discourse antecedents.

Following Webber, they define "discourse deixis" as "an anaphoric relation … where the reference of an anaphoric expression is present in the preceding text but not in the form of an explicit antecedent". The example that they give is from the TRAINS-91 corpus, involving the demonstrative pronoun that:

7.3 : so we ship one
7.4 : boxcar
7.5 : of oranges to Elmira
7.6 : and that takes another 2 hours

and they observe

The reference of that clearly depends on the preceding text, and in this sense the pronoun is an anaphor. The meaning of that in this context can perhaps be expressed with a nominalization such as the shipping of one boxcar of oranges to Elmira. Such a nominalization is not present in the text—but something very close to it is. …

The references of the anaphors in question are often abstract, and do not necessarily correspond to any particular phrase or clause in the text.

Note that "discourse deixis" covers anaphors of all types, including not only demonstrative and relative pronouns (this, that, which) but also non-pronominal noun phrases like "this idea". Artstein & Poesio observe that this phenomenon is very common — in their study of anaphora in dialogue, about a fifth of all potentially anaphoric expressions are of this kind.

Although I haven't seen any quantitative comparisons, I believe that such discourse anaphora are substantially more frequent in speech than in writing. This is not just a fact about contemporary English; thus George Hinge writes about classical Latin ("Epistolary deixis in the correspondence of Fronto and Marcus Aurelius"):

The demonstrative pronouns … are more frequent in some genres than in others, the extremes being historiography with 0.9% and comedy with 4.7%, i.e. more than five times as many. The frequent use of demonstrative pronouns is evidently characteristic of the colloquial language… It renders the style more vivid, but at the same time also more context-bound and therefore less elevated. In English, too, the pronouns this and that are much more frequent in the colloquial language than in written prose, and I suspect that has similar stylistic implications.

Searching the NYT archive for strings like "this is" or "this may be", I get the impression (without trying to count) that a large fraction of the hits are in quotations, or in articles written in a conversational style. For example: Ron Lieber, "Negotiating for a House? Start With ‘Dear Seller’", NYT, 5/31/2008:

So buyers have options right now. A lot of them. I’m no different. Your home is great, but it isn’t unique. Few homes are. I know this may be hard to hear, since you’ve spent years creating memories here. But you may be waiting a long time if you hope to find a buyer with the same emotional connection that you have. [emphasis added]

It's important to note that discourse anaphors can be regular pronouns like it, or full noun phrases like "that sort of discussion" — Lieber's article continues:

Has your real estate agent laid any of this out for you? Maybe so, and you didn’t want to believe it. But it’s also possible that your agent, afraid of offending you and losing the listing, simply doesn’t want to initiate that sort of discussion.
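The informal archive search described above (looking for strings like "this is" and noticing how many hits fall inside quotations) could be turned into an actual count. What follows is only a minimal sketch under stated assumptions: the function name `count_hits` and the pattern list are hypothetical, the text is assumed to be already in hand as a plain string, and "inside a quotation" is approximated crudely by alternating double-quote marks, which will miscount unbalanced or nested quotes.

```python
import re

# Hypothetical search strings, taken from the examples mentioned in the post.
PATTERNS = ["this is", "this may be"]


def count_hits(text, patterns=PATTERNS):
    """Count pattern hits inside vs. outside double-quoted spans.

    Assumption: quotation marks are balanced, so that splitting on them
    yields pieces that alternate between unquoted and quoted text.
    """
    # Split on straight or curly double quotes; odd-numbered pieces
    # are treated as quoted, even-numbered pieces as unquoted.
    pieces = re.split(r'["“”]', text)
    counts = {"quoted": 0, "unquoted": 0}
    for index, piece in enumerate(pieces):
        key = "quoted" if index % 2 == 1 else "unquoted"
        for pattern in patterns:
            # Word boundaries keep "this is" from matching "this isn't".
            regex = r"\b" + re.escape(pattern) + r"\b"
            counts[key] += len(re.findall(regex, piece, flags=re.IGNORECASE))
    return counts
```

Calling `count_hits(article_text)` on each article in a sample would give quoted and unquoted tallies that can be compared across genres. It says nothing about whether a given "this" is discourse-anaphoric; that judgment would still need annotation of the kind Artstein and Poesio describe.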

And it's also important to note that discourse anaphora is not by any means limited to speech or to informal writing, and even discourse-anaphoric demonstrative pronouns are often found in formal texts. I previously linked to Bertrand Russell's "Problems of Philosophy", whose second paragraph features several examples:

In daily life, we assume as certain many things which, on a closer scrutiny, are found to be so full of apparent contradictions that only a great amount of thought enables us to know what it is that we really may believe. In the search for certainty, it is natural to begin with our present experiences, and in some sense, no doubt, knowledge is to be derived from them. But any statement as to what it is that our immediate experiences make us know is very likely to be wrong. It seems to me that I am now sitting in a chair, at a table of a certain shape, on which I see sheets of paper with writing or print. By turning my head I see out of the window buildings and clouds and the sun. I believe that the sun is about ninety-three million miles from the earth; that it is a hot globe many times bigger than the earth; that, owing to the earth's rotation, it rises every morning, and will continue to do so for an indefinite time in the future. I believe that, if any other normal person comes into my room, he will see the same chairs and tables and books and papers as I see, and that the table which I see is the same as the table which I feel pressing against my arm. All this seems to be so evident as to be hardly worth stating, except in answer to a man who doubts whether I know anything. Yet all this may be reasonably doubted, and all of it requires much careful discussion before we can be sure that we have stated it in a form that is wholly true.

In my opinion, only someone who is being deliberately obtuse could claim not to understand the various examples of discourse deixis presented above.

But in Chapter IV of The Elements of Style ("Words and expressions commonly misused") Strunk & White wrote:

This. The pronoun this, referring to the complete sense of a preceding sentence or clause, can't always carry the load and so may produce an imprecise statement.

Left: Visiting dignitaries watched yesterday as ground was broken for the new high-energy physics laboratory with a blowout safety wall. This is the first visible evidence of the university's plans for modernization and expansion.

Right: Visiting dignitaries watched yesterday as ground was broken for the new high-energy physics laboratory with a blowout safety wall. The ceremony afforded the first visible evidence of the university's plans for modernization and expansion.

In the lefthand example above, this does not immediately make clear what the first visible evidence is.

It's true that the left-hand example doesn't specify whether the first visible evidence is the new laboratory, or its safety wall, or the ground-breaking ceremony, or perhaps the abstract process that starts with the ceremony and ends with the safety-walled laboratory. Still, these meanings are all so closely intertwined that the vagueness is arguably no barrier to understanding; and the right-hand example arguably violates Strunk's meta-maxim "Omit needless words".

Let's imagine that another usage guide (Strunk & Black?) uses the same examples to illustrate the opposite conclusion: we should avoid pompous and wordy phrasing like "the ceremony afforded", preferring plain and direct language like "This is". How should we decide whose advice to follow?

The traditional approach is to resolve such conflicts by the techniques of the playground. "It's unclear!" "No it isn't!" "Is!" "Isn't!" — "It's pompous and distracting!" "It's not!" "Is!" "Not!"

This is fine if the only issue is whose ego is stronger. But if we really care about the truth of the matter — the distribution of alternative choices in good and bad writing, and the effect of these choices on readers or listeners — there are several techniques for exploring such questions empirically.

And in fact, a systematic effort of this general kind is underway, though the application to stylistic advice is so far only an indirect one.

Artstein and Poesio's work is part of a project on "Anaphora Resolution and Underspecification" (ARRAU) at the Universities of Essex and Glasgow in the UK. The research on "singular they" by Anthony Sanford and Ruth Filik that I discussed earlier also comes out of this project ("'They' as a gender-unspecified singular pronoun: eye tracking reveals a processing cost", Quarterly Journal of Experimental Psychology, 60(2):171–178, 2007).

The ARRAU project describes itself as follows:

  • Objective
    • To study the cases of anaphoric reference most problematic for current anaphora resolution systems, in particular:
      • Reference to plurals
      • Reference to abstract objects such as events and plans
      • Ambiguous anaphoric expressions
  • Methods
    • Corpus analysis and annotation (University of Essex)
      • Identify questions to be studied through psychological studies
      • Produce an annotated corpus
    • Psychological experiments (University of Glasgow)
      • Provide empirical evidence to be used to modify our existing anaphora resolution system which will be evaluated using the annotated corpus
  • Theoretical goals
    • Revise current models of the interpretation of plurals and events
    • Develop a model of reference to plans
    • Evaluate hypotheses about underspecification in reference to structured objects
    • Yield a better understanding of the phenomenon of (semantic) underspecification

Thus ARRAU's practical goals involve "anaphora resolution systems", which are computer algorithms that try to assign co-reference in texts and transcripts, and its theoretical goals involve the psychology of language and communication. The development of advice for writers and speakers is not foregrounded, or indeed mentioned at all on this page. Stylistic prescriptions could be tested using the techniques and models developed in research of this type, but so far, this is not happening to any significant extent.

I shouldn't try to speak for the people involved, but I suspect that the ARRAU project's perspective is equally due to the interests of research funders and those of the research community. Funders (and researchers too) want better computer text "understanding"; researchers (and also funders) want to understand how people create and interpret language. Neither group is especially interested in what Plato called the "cookery of the soul". (Well, to be more exact, he has Socrates speak disdainfully of "rhetoric, which is, in relation to the soul, what cookery is to the body".)

And no doubt the nutritional content is more important than the presentation, in the end. But in our current culture, we've left the "cookery of the soul" in the hands of self-appointed experts whose recipes are wholly notional, never tested in any real kitchen. There's no longer any reason for this.

In 1918, when Willie Strunk published his little book, it would have been difficult at best to measure readers' reaction times, and locating eye fixations accurately in space and time was essentially impossible. Even in 1959, when E.B. White added the entry for this quoted above, these techniques were not commonly available. Today, any computer can present speech or text to a subject and measure reaction times; dozens (hundreds?) of psycholinguistic laboratories have easy-to-use eye-tracking systems; and other techniques are widely available as well, including ERP and fMRI. Just as important, there are reasonably good ideas about how to use such measures to study resource constraints in the comprehension of speech and text.
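To make the idea of testing a stylistic recipe with reaction times a little more concrete, here is a minimal sketch of one way two versions of a passage might be compared. Everything in it is an assumption for illustration: the function name `permutation_test` is hypothetical, each list is assumed to hold one reading time (say, in milliseconds) per reader, and a simple two-sided permutation test on the difference in means stands in for whatever analysis a real psycholinguistic study would use. It ignores item effects, repeated measures, and comprehension accuracy.

```python
import random
from statistics import mean


def permutation_test(times_a, times_b, n_resamples=10000, seed=0):
    """Two-sided permutation test on the difference in mean times.

    Returns (observed difference, estimated p-value). The p-value is the
    share of random relabelings whose absolute mean difference is at
    least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    observed = mean(times_a) - mean(times_b)
    pooled = list(times_a) + list(times_b)
    n_a = len(times_a)
    extreme = 0
    for _ in range(n_resamples):
        # Shuffle the pooled times and split them into two new groups
        # of the original sizes, as if the labels were arbitrary.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_resamples
```

With reading times collected for, say, the "This is" and "The ceremony afforded" versions of the Strunk & White example, `permutation_test(times_this, times_ceremony)` would report how large the difference is and how easily chance alone could produce it. A small p-value would show only that the two versions are read at different speeds, not which one is better writing.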



    5 Comments

    1. Mike said,

      June 1, 2008 @ 3:32 am

      How does one use a blowout safety wall to break ground? Strunk doesn't explain that!

    2. John Cowan said,

      June 1, 2008 @ 3:54 am

      Strunk is innocent this time, as he often (but not always) is; the section quoted above is White's work entirely.

    3. Peter said,

      June 1, 2008 @ 6:35 am

      Bertrand Russell wrote in his Autobiography that he normally wrote in longhand, quickly and fluently, and only rarely revising his words, merely re-reading them to delete nearby duplicate terms or expressions. That may be why much of his prose reads as if it had been spoken.

    4. Tim Silverman said,

      June 1, 2008 @ 10:22 am

      I see two basic problems with this approach.

      The first is that different readers' requirements are different. What one reader sees as turning from an incomprehensible bog of adjectives to a passage that focuses properly on the main action, another reader will see as turning from a vivid, richly descriptive passage to a skimpy, thin, poorly realised one. What one reader sees as turning from a passage rendered literally unintelligible by its extreme indirectness and allusiveness to a passage that clearly states what it means in simple, plain language, another reader will see as turning from a transparent, though subtle, passage to one that stands on your neck while it repeatedly hammers the blindingly obvious into your head with a large mallet.

      The second problem is that constructions like summative demonstratives do not exist in isolation; each instance of a construction sits within a web of other constructions realising other stylistic choices. Whether the referent of "this" is obvious depends heavily on what that referent actually is and how it is expressed. Comparing the comprehensibility of "this" with "this thing" is meaningless without controlling for the type of referent; but at this stage, as far as I know, we aren't in possession of a classification of referent types which would enable us to build such a control.

      Furthermore, I'm not convinced that measuring reaction times is going to be more useful for studying this particular problem than simply asking people whether they liked and/or understood a passage of writing. Readers aren't, as a rule, very good at judging why they disliked a piece of writing; they're much better at judging whether they disliked it.

      And, as a writer, what one is interested in is whether a reader read your book through avidly from cover to cover, or hurled it across the room in disgust halfway down page 6; reaction times and eye movements are only a proxy for that fundamental datum anyway.

      In short, I think it would be better to try to understand what stylistic features (of writing) and reactions (of readers) we are trying to measure, before dashing into trying to construct proxies by which to measure them.

      Not that I'm actually familiar with the psycholinguistic literature on this subject …

    5. Alex Price said,

      June 1, 2008 @ 4:48 pm

      I second Tim Silverman’s comments, but would go a bit further. To me the idea of tracking eyeballs and counting reaction times to justify stylistic choices is so manifestly loony that I have trouble believing Mark Liberman doesn’t have his tongue firmly in cheek. Why? Because stylistic decisions are not generalizable. Isn’t that one of the problems with prescriptive “rules” to begin with? It is ironic that an opponent of prescriptivism would show the same desire to reduce the complexity of written communication and eliminate the vagaries of human judgment. It is reminiscent of the search for an objectively “perfect language” that Umberto Eco documents (in his book The Search for the Perfect Language).

      The relative ambiguity of discourse anaphors is dependent not just on context but also on audience. And a reader may be, for a variety of reasons, more or less tolerant of ambiguity. Consider the example Strunk & White provide for this. Some readers may note the anaphoric ambiguity and others may not. Some readers may read carefully because the information presented really matters to them; others could care less. The eyeballs of readers who don’t care much about what they’re reading may glide happily past ambiguities that would be unacceptable to other readers. Measuring how a passage is read doesn’t indicate what is retained from it. And measuring what is retained, assuming one could find a reliable method to do so, doesn’t take into account factors like reader motivation and interest.

      Mark Liberman judges the Strunk & White example acceptably ambiguous, but notes that others may disagree and deplores “the techniques of the playground” that constitute the “traditional approach” for settling disagreements of this sort. He writes that “if we really care about the truth of the matter,” we’ll undertake the empirical research needed to find it out and notes that the tools for doing so are now available. But the truth of the matter is that there is no truth of the matter. Or rather there are an endless number of particular truths. One would need to separately investigate every use of this as it is received by every reader: this is like something out of a story by Borges. (Perhaps empirical investigation has a place in some cases; I found the singular-they investigations described in an earlier post a bit more persuasive.)

      Rhetoric is the area of study that traditionally deals with questions of style and usage, and I find it interesting that Mark Liberman includes a disparaging reference to rhetoricians in this post. Like Plato, he is bothered by the lack of foundation for rhetorical judgments. But reliance on the judgment of rhetoricians is not reducible to “the techniques of the playground.” Some people’s judgments really are worth more. Some people are experts, whether self-appointed or not. But their authority is not, and cannot be, entirely based on fact and is always contestable. As is the case with, for example, Strunk & White.
