Proportion of dialogue in novels
For reasons not strictly relevant to what follows, Yves Schabes and I have been analyzing the novels of Agatha Christie. (For the not-strictly-relevant background, see Xuan Le et al., "Longitudinal detection of dementia through lexical and syntactic changes in writing: a case study of three British novelists", Literary and Linguistic Computing 2011, and Graeme Hirst & Vanessa Feng, "Changes in Style in Authors with Alzheimer's Disease", English Studies 2012.)
It occurred to me to wonder whether the proportion of quoted dialogue might vary from text to text — and since the textual properties of dialogue are likely to be different from those of the narrative voice, this might influence the results of comparisons. So I ran a quick check on seven of Christie's novels, using as proxy the proportion of characters in the novels' texts in spans between quotation marks.
The results confirm that from the beginning, a substantial proportion of Christie's texts are dialogue, and that this proportion seems to have increased over the course of her career:
Novel | Date | Total chars | Quote chars | Percent |
The mysterious affair at Styles | 1920 | 320869 | 157781 | 49.17% |
The secret adversary | 1922 | 425692 | 228143 | 53.59% |
The murder of Roger Ackroyd | 1926 | 388173 | 224664 | 57.88% |
Murder at the vicarage | 1930 | 385778 | 228241 | 59.16% |
Nemesis | 1971 | 422657 | 275142 | 65.10% |
Elephants can remember | 1972 | 329490 | 261554 | 79.38% |
Postern of fate | 1973 | 418929 | 319150 | 76.18% |
A relatively high proportion of dialogue is a common characteristic of genre novels. Thus Arthur Conan Doyle's 1892 The Adventures of Sherlock Holmes has 46.95% of its characters within quoted strings; Bram Stoker's 1897 Dracula has 47.46%; G.K. Chesterton's 1908 The Man Who Was Thursday has 60.08%; and Josephine Tey's 1946 Miss Pym Disposes weighs in at 42.54%.
Not all novels are like this: Thus Virginia Woolf's To The Lighthouse has just 3.27% characters within quoted strings; her Jacob's Room has 10.17%; Ernest Hemingway's The Old Man And The Sea has 15.03%; his The Sun Also Rises has 29.40%; Margaret Atwood's Lady Oracle has 14.01%; Thomas Pynchon's The Crying of Lot 49 has 23.99%.
Some non-genre novels have higher proportions: The Great Gatsby has 51.32%; Martin Eden has 50.58%.
And some genre novels are more narrative-heavy: The Da Vinci Code has 29.59%.
Anyhow, it's clear that Agatha Christie's later works have a substantially higher proportion of dialogue than her earlier works. There could be many reasons for this, including an unexpected connection between cognitive decline and novelistic dialogue. But I suspect that the effect is a natural stylistic progression, perhaps driven by the experience of having many of her novels adapted for stage and screen.
As a point of comparison, I calculated the same proportions for four of Elmore Leonard's novels:
Novel | Date | Total chars | Quote chars | Percent |
The Law at Randado | 1954 | 313408 | 105701 | 33.73% |
Maximum Bob | 1991 | 404089 | 177282 | 43.87% |
Road Dogs | 2009 | 387458 | 226695 | 58.51% |
Raylan | 2012 | 338665 | 205562 | 60.70% |
There are a lot more data points to consider — Christie wrote more than 70 novels, and Leonard has written 60 or so. But this small sample suggests that a trend towards more dialogue over the course of an author's career is not implausible.
N.B. Where possible, I measured spans between Unicode “ (0x201C, "LEFT DOUBLE QUOTATION MARK") and ” (0x201D, "RIGHT DOUBLE QUOTATION MARK").
Where the texts available to me had only invariant double quotes, my program attempts to parse out the quoted segments from the rest by using even and odd ordinal counts, along with considerations of paragraph structure and so on.
In both cases, it's possible that errors in the texts or in the algorithms messed up the counts, but various spot checks suggest that the results are generally reliable.
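For readers who want to try this on their own texts, here is a minimal sketch of the kind of count described above. It is not the program used for the numbers in this post. The function names are made up for illustration, and several details are assumptions: the quotation marks themselves are not counted as quoted characters, the total is simply the length of the text, and the "paragraph structure" consideration is reduced to resetting the open/closed state at every blank line.

```python
LEFT, RIGHT = "\u201c", "\u201d"  # curly double quotation marks


def curly_quote_chars(text: str) -> int:
    """Count characters between a left and a right curly double quote.

    Assumption: the quotation marks themselves are not counted.
    """
    inside = False
    count = 0
    for ch in text:
        if ch == LEFT:
            inside = True
        elif ch == RIGHT:
            inside = False
        elif inside:
            count += 1
    return count


def straight_quote_chars(text: str) -> int:
    """Parity fallback for texts that only have the invariant '"' mark.

    Odd-numbered marks open a quoted span and even-numbered marks close it.
    Assumption: the state resets at each blank-line paragraph break, so one
    unbalanced mark cannot flip the parity for the rest of the book.
    """
    count = 0
    for paragraph in text.split("\n\n"):
        inside = False
        for ch in paragraph:
            if ch == '"':
                inside = not inside
            elif inside:
                count += 1
    return count


def quote_percent(text: str) -> float:
    """Percentage of characters in quoted spans, relative to len(text)."""
    if not text:
        return 0.0
    counter = curly_quote_chars if LEFT in text else straight_quote_chars
    return 100.0 * counter(text) / len(text)
```

On the 15-character string “Hi,” she said. this counts 3 quoted characters, so quote_percent gives 20.0. A real text would need more care than this, for example with nested single quotes and with multi-paragraph speeches that repeat the opening mark.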
Craig said,
December 29, 2017 @ 2:22 am
Your survey of Christie seems to have a huge gap between 1930 and 1971. It would be interesting to fill that in with a few novels of the '30s, '40s, '50s, and '60s to see how consistent the trend is.
[(myl) There are three reasons for that: (1) The publications whose work we're replicating and extending claim a possible effect of Alzheimer's-related cognitive decline on Christie's later work, and so I wanted to check early works against later ones; (2) The process of making Kindle editions of such books ready for analysis involves 5-10 minutes of work per book; (3) The Kindle editions cost around $8-10 each, and (misplaced?) habits of frugality made me reluctant to invest hundreds of dollars in what might not be an interesting exploration.]
I'd be interested to see what the percentage of dialogue is in George V. Higgins' classic crime novel "The Friends of Eddie Coyle". There is a LOT of dialogue in that book.
[(myl) 174,925 characters in quoted spans out of 249,306 characters overall, or 70.16%.]
Joyce's "Ulysses" would be interesting too, but since it doesn't use quotation marks (which Joyce considered ugly), your methodology would have to change.
Aylok said,
December 29, 2017 @ 2:37 am
Note that 'Nemesis' was written during WW2 although it was not published until the 70s.
[(myl) What's your evidence for this notion? There's no mention of it in Wikipedia's Agatha Christie bibliography, nor in the Wikipedia page for the novel Nemesis itself, nor in the published work itself. The Christie bibliography gives earlier dates of original composition (1940s and 1954) for the last three of her books to be published (Curtain, Sleeping Murder, Hercule Poirot and the Greenshore Folly), which is why I didn't use them.]
Athel Cornish-Bowden said,
December 29, 2017 @ 2:49 am
Then there is Ivy Compton-Burnett, whose novels consist of little else but dialogue.
[(myl) Her 1951 work Darkness and Day has 333,354 characters in quoted spans, out of 412,285 total characters, for a percentage of 80.86%.]
Adam Roberts said,
December 29, 2017 @ 6:37 am
It's petty of me, but "Akroyd" should be "Ackroyd".
Philip Roth's Deception (1990) is written entirely in dialogue: no descriptive passages, exposition etc at all, just the two main characters talking. It was a deliberate experiment by Roth, and isn't (I'd say) his most successful novel.
Leslie Katz said,
December 29, 2017 @ 7:52 am
"[Elmore] Leonard has written 60 [novels] so far."
Any new ones will be written by ouija board.
[(myl) We can always hope.]
dfan said,
December 29, 2017 @ 8:15 am
And then there are the novels of William Gaddis, which consist almost entirely of (unattributed!) dialogue. The first (The Recognitions) isn't like that, so I guess there's another case of percentage of dialogue increasing over time…
DaveK said,
December 29, 2017 @ 9:38 am
One immediate thought: composing convincing dialogue is one of the more difficult parts of writing and as a writer gains more experience, it's not surprising that they would feel more confident in their ability to write dialogue and use it more.
Martin Ball said,
December 29, 2017 @ 12:23 pm
@Aylok I think you're confusing this novel with Sleeping Murder which was the last Miss Marple to be published though written during the Second World War.
Bob said,
December 29, 2017 @ 1:57 pm
As I read the first paragraphs, I thought to myself, "I wonder how Elmore Leonard — the King of Dialog — would stack up?" And lo, you answered my question! Thanks for this very interesting post.
rpsms said,
December 29, 2017 @ 3:18 pm
Is there any indication that any of these authors used assistant writers (etc) as they became more successful?
maidhc said,
December 29, 2017 @ 3:23 pm
I remember The Crying of Lot 49 as having a lot of inner dialogue and reported speech. "Oedipa wondered whether…", "X. had told her that…" I don't have a copy available at the moment.
It could be an example of what DaveK mentions. I guess you'd have to do a study over Pynchon. It seems to me that his later works seem to have more prominent dialogue. I certainly have vivid memories of "'ass' is an intensifier!" and the 18th-century teenage girl who keeps sticking "as if" into her sentences.
[(myl) In the Pynchon texts that I happen to have immediately at hand, we get a replication of the spooky gradual increase of quote-span percentages over time:
Rubrick said,
December 29, 2017 @ 5:10 pm
"But does it mean anything?"
Liberman smiled. "Wrong question. What you meant to ask is Does it mean anything interesting."
Rubrick said,
December 29, 2017 @ 5:11 pm
See, I can't even get my punctuation right. No wonder all these authors wait until they have more practice.
ohwilleke said,
December 29, 2017 @ 6:02 pm
Shorter OP: Good writing has lots of dialog, bad pedantic writing enjoyed only by English lit professors has little dialog.
Don Sample said,
December 29, 2017 @ 8:55 pm
What is the distinction between a genre novel, and a non-genre novel?
[(myl) According to Wikipedia:
Genre fiction, also known as popular fiction, is plot-driven fictional works written with the intent of fitting into a specific literary genre, in order to appeal to readers and fans already familiar with that genre. Genre fiction is generally distinguished from literary fiction. […]
The main genres are crime, fantasy, romance, science fiction, western, inspirational and horror.
]
Adrian Morgan said,
December 30, 2017 @ 7:44 am
Like many people, I regard the notion of genre vs non-genre fiction as absurd as the notion of accented vs non-accented speech, for similar reasons. But I can accept the idea that classifiability, like humour, is a quality that individual works possess to a greater or lesser extent. So perhaps the concept of genre fiction has merit as a prototype but not as a category.
I've been struck by the proportion of dialogue in many mystery novels, Christie's included, but have always interpreted it as a convenient device for presenting clues from the detective's perspective without prematurely revealing too much of their contemplation. (I don't personally read them.)
McLemore said,
December 30, 2017 @ 2:58 pm
@DaveK, I concur.
Back when we were assembling a collection of Rex Stout mysteries, I came across a couple of his earlier books. (The "two serialized murder mystery novellas that prefigured elements of the Wolfe stories" mentioned on Wikipedia, I suppose.) They were abysmal! Pretty much unreadable. And what I remember of them is that they were almost entirely dialogue.
But he definitely got better at it in subsequent decades. I wonder if he then followed the trend observed by Mark.
Andrew (not the same one) said,
December 30, 2017 @ 5:18 pm
I think it should be noted that while genre fiction is indeed often distinguished from literary fiction, this does not give an accurate picture of the situation; there is a large amount of fiction which is neither, e.g. the works of Jeffrey Archer or Jilly Cooper (who also writes romance, but her books about horses are just listed as 'fiction'). It is not sold under the heading of any particular genre, but neither is it taken seriously by critics in a way that would entitle it to the term 'literary'.
I'm a bit puzzled by 'inspirational'; what would be an example of that? On the other hand, one might think that historical fiction should be listed as a genre; although it's often shelved under 'fiction', it's a recognised sub-group within that category, and some people are primarily readers of it.
Bart said,
December 31, 2017 @ 1:54 am
If you were going to do a lot of detailed work analysing the texts of a number of novels, why would you make the assumption before you even started that there was an important distinction between so-called genre fiction and so-called literary fiction?
Seajn M said,
December 31, 2017 @ 2:55 am
Adrian Morgan: Trying to fit works from 1882, 1898, and 1908 into the literary/genre dichotomy of mid-20th-century bookshops does not seem very useful to me. After all, Sir Arthur Conan Doyle wanted to move away from his trashy detective stories to write something respectable … historical novels in a romantic middle ages!
But this idea is not really an important part of Mark Liberman's analysis!
Jerry Friedman said,
December 31, 2017 @ 10:40 am
Adrian Morgan: Thank you. I agree that "literary fiction" is also a genre, maybe with sub-genres.
Such things were discussed here in "Annals of overgeneralization" and its sequel, "'Literary' vs. 'popular' fiction again".
DWalker07 said,
January 3, 2018 @ 4:14 pm
Re Genre fiction: Ursula Le Guin interviewed here https://electricliterature.com/ursula-k-le-guin-talks-to-michael-cunningham-about-genres-gender-and-broadening-fiction-57d9c967b9c
Michael Cunningham: Could you talk about that? About the breaking-down of the barriers between “genre” books and the books that are generally piled on the front tables at Barnes & Noble? This is especially important to me, in that I’m always trying to talk readers into venturing into genre fiction, and still encounter a surprising degree of resistance. The line, “I don’t read science fiction” emanates from a surprising number of well-educated, erudite mouths.
Ursula K. Le Guin: Well, you’ve said much of what I’d have said, and I’m delighted to hear it said by a writer whose fame is not within a “genre” but in what is still called literary fiction.
And that, of course, is the lingering problem: The maintenance of an arbitrary division between “literature” and “genre,” the refusal to admit that every piece of fiction belongs to a genre, or several genres.