## Coherence of sentence sequences

Here are two successive sentences from The Wizard of Oz, presented in two different orders:

Blue:

1. "How strange it all is! But, comrades, what shall we do now?"
2. "We must journey on until we find the road of yellow brick again," said Dorothy, "and then we can keep on to the Emerald City."

Red:

1. "We must journey on until we find the road of yellow brick again," said Dorothy, "and then we can keep on to the Emerald City."
2. "How strange it all is! But, comrades, what shall we do now?"

The first order (in blue) is easier to construe as a coherent sequence, because in that order, sentence 2 answers a question posed by sentence 1. The version in red could be rescued by a more complicated set of contextual assumptions or a more complicated theory of the interaction — but in fact it's the blue version that's the original.

Some of us have been working on the problem of how to quantify "coherence" (scare quotes intentional). There are lots of potential applications, from writing assistance to clinical diagnosis, but there's an obvious problem: there isn't much data that reliably represents "coherent" vs. "not coherent" language. As one possible solution, we've been looking at adjacent sentence-pairs chosen at random from available texts, on the theory that the sentence-pairs should be more "coherent" in their original order than in a reversed order. (In some sense of "coherent", at least…)

And this turns out in general to be true. I wrote a simple program to divide a text into sentences, another program to choose an adjacent pair of sentences at random, and another one to format the chosen sentences in two different orders as above, deciding randomly whether the original order is the blue or the red version.
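A minimal sketch of that three-step pipeline might look like the following in Python. This is not the program actually used: the function names (`split_sentences`, `pick_adjacent_pair`, `make_item`, `format_item`) are made up for illustration, and the regex-based sentence splitter is a deliberately naive assumption that will mis-split abbreviations such as "Mr." or "e.g.".

```python
import random
import re


def split_sentences(text):
    """Naively split text after ., ! or ? (optionally followed by a
    closing quote) when whitespace follows. Assumption: good enough
    for a demo, not a real sentence segmenter."""
    parts = re.split(r"(?<=[.!?])\s+|(?<=[.!?][\"'])\s+", text.strip())
    return [p for p in parts if p]


def pick_adjacent_pair(sentences, rng):
    """Choose two adjacent sentences uniformly at random."""
    if len(sentences) < 2:
        raise ValueError("need at least two sentences")
    i = rng.randrange(len(sentences) - 1)
    return sentences[i], sentences[i + 1]


def make_item(first, second, rng):
    """Build one test item, deciding at random whether the original
    order is shown as the blue or the red version."""
    original = (first, second)
    flipped = (second, first)
    original_is_blue = rng.random() < 0.5
    blue, red = (original, flipped) if original_is_blue else (flipped, original)
    return {"blue": blue, "red": red,
            "answer": "blue" if original_is_blue else "red"}


def format_item(item):
    """Render both orders as labeled, numbered lists."""
    lines = []
    for label in ("blue", "red"):
        lines.append(f"[{label}]")
        for n, sentence in enumerate(item[label], start=1):
            lines.append(f"{n}. {sentence}")
    return "\n".join(lines)


if __name__ == "__main__":
    rng = random.Random()  # pass a seed for reproducible items
    text = ("How strange it all is! But, comrades, what shall we do now? "
            "We must journey on until we find the road of yellow brick again.")
    a, b = pick_adjacent_pair(split_sentences(text), rng)
    item = make_item(a, b, rng)
    print(format_item(item))
    print("original order:", item["answer"])
```

Keeping the answer key in the same record as the two displayed orders makes it easy to score responses later without re-deriving which version was the original.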

Most of the time, it's pretty obvious what the original order was. Here's a random selection from To the Lighthouse:

Blue:

1. She reduced them to a frenzy of indecision by this interference in their cosmogony.
2. She raised a little mountain for the ants to climb over.

Red:

1. She raised a little mountain for the ants to climb over.
2. She reduced them to a frenzy of indecision by this interference in their cosmogony.

In that case, I think it's easier to construe the red version as a coherent sequence. This time it's because of chains of reference: "them" and "their" in the second sentence can refer to the ants mentioned in the first sentence, and the phrase "this interference in their cosmogony" in the second sentence makes sense as a reference to the mountain-raising described in the first sentence.

That's the sort of thing described in a series of papers on "Centering", starting with Grosz, Joshi, & Weinstein, "Providing a unified account of definite noun phrases in discourse", ACL 1983:

Linguistic theories typically assign various linguistic phenomena to one of the categories, syntactic, semantic, or pragmatic, as if the phenomena in each category were relatively independent of those in the others. However, various phenomena in discourse do not seem to yield comfortably to any account that is strictly a syntactic or semantic or pragmatic one. This paper focuses on particular phenomena of this sort — the use of various referring expressions such as definite noun phrases and pronouns — and examines their interaction with mechanisms used to maintain discourse coherence.

and revised in Grosz, Weinstein & Joshi, "Centering: a framework for modeling the local coherence of discourse", Computational Linguistics 1995:

This paper concerns relationships among focus of attention, choice of referring expression, and perceived coherence of utterances within a discourse segment. It presents a framework and initial theory of centering intended to model the local component of attentional state. The paper examines interactions between local coherence and choice of referring expressions; it argues that differences in coherence correspond in part to the inference demands made by different types of referring expressions, given a particular attentional state. It demonstrates that the attentional state properties modeled by centering can account for these differences.

Sometimes the crucial clue seems to be provided not by the choice of referring expressions, but by a choice of conjunction or other linking expression, as in this random sentence-pair from The General Theory of Employment, Interest and Money:

Blue:

1. The classical school have tacitly assumed that this would involve no significant change in their theory.
2. But this is not so.

Red:

1. But this is not so.
2. The classical school have tacitly assumed that this would involve no significant change in their theory.

Again, both orders can be construed as part of a coherent discourse, but the first one (in blue) is easier. Here a simple consideration of explicit co-reference chains is not enough: each of the two sentences has an instance of the demonstrative "this", and so (out of context) one "this" must be unresolved no matter which order we choose. The blue order works better because its first sentence attributes an assumption to "the classical school", while its second sentence asserts that this assumption is false. In the red order, some more complicated contextual logic must be found.

Centering theory also partly explains the factors contributing to greater apparent "coherence" in this random selection from a 1996 NYT story:

Blue:

1. In their report, the officers said two men approached and berated them for ticketing their car.
2. The men were unsteady on their feet, had slurred speech and smelled of alcohol, the officers said.

Red:

1. The men were unsteady on their feet, had slurred speech and smelled of alcohol, the officers said.
2. In their report, the officers said two men approached and berated them for ticketing their car.

In the blue order, the officers are introduced first, as the subject of the main clause; in the second blue sentence, they slip back into an attributional tag. And it might also follow from centering theory that starting the narrative with a reference to "their report" (as in the blue sequence) is an appropriate way to frame what follows. But we need a more general understanding of how such interactions are likely to unfold, in order to grasp the narrative relationship between "the two men approached them" in the first blue sentence, and the catalog of intoxication-related characteristics in the second.

We've found that humans seem in general to be pretty good at choosing the original order in such random sentence-pairs. This is not always true, of course: some pairs seem to be completely ambiguous, and such ambiguities seem to be more common in some authors and genres than others. But some of the "coherence" evaluations suggested in the literature fail completely at this task, and the best automatic methods we've come up with seem to be only marginally better than chance.
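To make "better than chance" concrete, one option is to score each pair in both orders and count how often the original order gets the higher score, with 0.5 as the chance baseline. The sketch below is an illustrative assumption, not one of the methods we actually tried: `order_accuracy` is a hypothetical harness, and `pronoun_late_score` is a toy scorer that simply prefers the order whose second sentence contains more pronouns and demonstratives than its first.

```python
from typing import Callable, Iterable, Tuple

Pair = Tuple[str, str]  # (first sentence, second sentence) in original order

# Small, hand-picked word list; an assumption for this toy scorer only.
PRONOUNS = {"he", "she", "it", "they", "them", "their", "this", "that"}


def pronoun_late_score(first: str, second: str) -> float:
    """Toy scorer: higher when the second sentence has more
    pronouns/demonstratives than the first."""
    def count(sentence: str) -> int:
        words = (w.strip(".,;:!?\"'").lower() for w in sentence.split())
        return sum(w in PRONOUNS for w in words)
    return count(second) - count(first)


def order_accuracy(pairs: Iterable[Pair],
                   score: Callable[[str, str], float]) -> float:
    """Fraction of pairs where the original order outscores the
    reversed order. Ties count as half, i.e. as a coin flip."""
    correct = 0.0
    total = 0
    for first, second in pairs:
        forward = score(first, second)
        backward = score(second, first)
        if forward > backward:
            correct += 1.0
        elif forward == backward:
            correct += 0.5
        total += 1
    if total == 0:
        raise ValueError("no pairs to evaluate")
    return correct / total


if __name__ == "__main__":
    pairs = [
        ("She raised a little mountain for the ants to climb over.",
         "She reduced them to a frenzy of indecision by this interference "
         "in their cosmogony."),
        ("The classical school have tacitly assumed that this would involve "
         "no significant change in their theory.",
         "But this is not so."),
    ]
    print(order_accuracy(pairs, pronoun_late_score))  # prints 0.5
```

On these two pairs the toy scorer gets the Woolf example right and the Keynes example wrong, which is exactly chance. That matches the point made above about the Keynes pair: counting referring expressions alone doesn't settle the order there.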

So to get a broader sense of how humans do at this task, I've set up a dozen more-or-less random examples as a Qualtrics survey. Try it out, see what you think, and I'll post the answer key (and the results) tomorrow.

[I've tried to make the test a little more interesting by eliminating examples where obvious name/pronoun pairs ("Dorothy"/"she") or name/description pairs ("Bill Clinton"/"The president") make it too easy. And the survey uses BLACK and RED rather than BLUE and RED as the answer categories…]

This isn't a proper scientific experiment, of course, but it should give us some help in deciding whether the random-pairs paradigm is a useful direction to pursue, or whether we should modify it, e.g. making the task easier by presenting four sentences, with the order of the middle two as the question to be decided.

Update 4/18/2019 — results and discussion are here.

(Joint work with Reno Kriz, João Sedoc, & Mengdi Huang.)

1. ### Cervantes said,

April 17, 2019 @ 9:28 am

I found 2, 3, 5 and 12 undecidable. I might have had a slight preference for one or another but either seemed entirely plausible.

I would expect this task to be largely unamenable to machine learning as it generally requires semantic understanding.

2. ### Daniel Barkalow said,

April 17, 2019 @ 9:47 am

I think I was able to resolve all of those based on count of referents that are described appropriately. (1) Actions to be taken, question then answer; (2) as you said; (3) Both are referring to something in context, but "But this is not so" doesn't provide anything easy to refer to whereas the other sentence does; (4) "the officers" is okay either way, but "the men"/"two men" is much better one way than the other.

Of course, you have to understand the sentences to know that "two men" are likely the same men as "the men", rather than there being two interactions that happened at about the same time, but once you've worked out the cast, you don't have to understand the likely order of events to order the sentences; the first sentence is the one where one of the common participants is novel.

3. ### Ursa Major said,

April 17, 2019 @ 10:52 am

While some of the sentence pairs took a little more time to consider than others, Q8 is the only one I felt unsure of in the end.

I decided that "it" in both sentences referred to the same thing and "there" in the shorter sentence referred to the shawl mentioned in the longer sentence – the other way around requires two unknowns "it" and the thing "there" that "it" is hiding behind.

I noticed that, in most cases, I was putting the sentences in an order that could be loosely described as general->specific, e.g. Q1 statement of the swindle problem followed by example of one way it is done, Q3 view of many followed by view of one, Q5 statement that there are many contributors followed by example of one of them.

4. ### J.W. Brewer said,

April 17, 2019 @ 12:51 pm

The difficulty I have with the Woolf and Keynes examples is that the two sentences are presumably pulled from somewhere in the middle of a larger discourse, and don't necessarily (if truly selected at random) mark the first two sentences of a new section or line of thought within that larger discourse. The sentences in those two examples that, considered in isolation, need to come second in the interests of coherence are the ones that presuppose by their wording *some* sort of prior set-up (i.e. the set-up which would provide the antecedent for the "their" and "them" and the antecedent-or-equivalent-term for the "this" that isn't so). And the other sentence in those two examples provides a set-up that would make sense. But one can imagine some other sentence occurring just prior to the two-sentence excerpt that would be an equally-or-more-plausible alternative set-up. I imagine there are lots of other situations where an AB ordering of two linguistic elements seems more plausible or natural than a BA ordering when considered in isolation but there may be larger contexts within which a BA ordering makes perfect sense.

5. ### J.W. Brewer said,

April 17, 2019 @ 1:18 pm

In the quiz, Q7 is maybe a good example of what I was trying to describe. The "flushed" sentence is obviously a reaction to being told something positive. It is plausibly a reaction to the other sentence in the example, in which case it would naturally go second, but could also plausibly be a reaction to having been told something else positive just before the excerpt starts, in which case it would go first, with the other sentence in the example then being a follow-up to the earlier pre-excerpt positive sentence that accordingly comes second in the excerpt ("and now let me say something else positive about you, especially since you reacted positively to my first compliment …").

6. ### Andrew Usher said,

April 17, 2019 @ 6:37 pm

I took the quiz, of course, and like Ursa Major above, 8 is the only one about which any doubt remained after I answered. (I think I took the opposite choice as he on that one.)

It's strange that this would be considered a particularly interesting problem, though. Of course speech and writing are normally connected, and therefore our ability to decode meaning should allow us to put things in the right order – why is that surprising? That we sometimes fail due to insufficient context is no more so. I _think_ the problem that prompted this may have been computer language processing, which would better explain it.

k_over_hbarc at yahoo dot com

7. ### John Swindle said,

April 17, 2019 @ 10:00 pm

I had few doubts about Q8 but was in fact wrong both about what "it" referred to (I thought it to be her shadow) and about the order of the sentences, as evidenced by the readily available original. I may never be a famous 19th-century novelist.

8. ### Jon said,

April 18, 2019 @ 12:40 am

The sentence pair below (as near as I can remember it) is from the BBC radio programme 'I'm sorry I'll read that again' in the 1960s:
Yes, I can.
Can you see into the future?

You would need powerful AI to recognise that this is the correct order of sentences.

9. ### rosie said,

April 18, 2019 @ 2:36 am

Here's an example of a series of set-ups which show that, with some sentences, it can be hard to determine which order is the most likely: https://www.youtube.com/watch?v=y0C59pI_ypQ

10. ### jaap said,

April 18, 2019 @ 2:53 am

I found Q3 to be a subtle one. In the end it was just the word "simply" that determined it for me. Without that word it would have been ambiguous.

11. ### Breffni said,

April 18, 2019 @ 6:48 am

"[Broad claim], according to [Source]" is a common enough journalistic cliché for opening sentences that I wonder if its appearance twice in twelve examples might skew the results somewhat. Not a big issue for an informal exercise like this, of course.

Andrew Usher: what's interesting is not the mere observation that we can do it, but the much trickier question of precisely how we do it. And although that question can be sharpened by attempts to model the ability in software, that doesn't mean that the computational challenge is the only reason it's interesting.

Related matters previously on LLOG: Discourse: Branch or tangle?

12. ### Dan Faulkner said,

April 18, 2019 @ 3:20 pm

I'm reminded of the Crab Canon in Douglas Hofstadter's "Goedel, Escher, Bach." It starts out as a dialog between two characters: ABABAB… Halfway through, the conversation is reversed, …BABABA, using the identical lines spoken in the first half. The result is a sort of palindromic conversation that's semantically coherent in both directions.