Parse depth in essays vs. novels


In "Trends" (3/27/2022) and "Embedding Depth" (11/28/2022), I noted that Ernest Hemingway's reputation for "little short sentences" is generally false to fact. I made the point by comparing the distribution of sentence lengths and embedding depths in his memoir A Moveable Feast to Ursula K. Le Guin's essay collection The Wave in the Mind.

In a comment on "Embedding Depth", Bloix complained that A Moveable Feast is probably not "a reliable example of the style that made [Hemingway] famous in the 1920s and 30s." In today's post, I'll explain again why I chose that work, amplify the point by comparing Hemingway's 1926 novel The Sun Also Rises to Le Guin's 1974 novel The Dispossessed, and wave my hand at broader generalizations about dialogue vs. exposition and fiction vs. essays.

Hemingway's reputation for short sentences goes back at least to James Thurber in 1927, and Ursula K. Le Guin made a striking use of the meme in her 1992 essay "Introducing Myself":

What it comes down to, I guess, is that I am just not manly. Like Ernest Hemingway was manly. The beard and the guns and the wives and the little short sentences. I do try. I have this sort of beardoid thing that keeps trying to grow, nine or ten hairs on my chin, sometimes even more; but what do I do with the hairs? I tweak them out. Would a man do that? Men don’t tweak. Men shave. Anyhow white men shave, being hairy, and I have even less choice about being white or not than I do about being a man or not. I am white whether I like being white or not. The doctors can do nothing for me. But I do my best not to be white, I guess, under the circumstances, since I don’t shave. I tweak. But it doesn’t mean anything because I don’t really have a real beard that amounts to anything. And I don’t have a gun and I don’t have even one wife and my sentences tend to go on and on and on, with all this syntax in them. Ernest Hemingway would have died rather than have syntax. Or semicolons. I use a whole lot of half-assed semicolons; there was one of them just now; that was a semicolon after “semicolons,” and another one after “now.”

And another thing. Ernest Hemingway would have died rather than get old. And he did. He shot himself. A short sentence. Anything rather than a long sentence, a life sentence. Death sentences are short and very, very manly. Life sentences aren’t. They go on and on, all full of syntax and qualifying clauses and confusing references and getting old. And that brings up the real proof of what a mess I have made of being a man: I am not even young. Just about the time they finally started inventing women, I started getting old. And I went right on doing it. Shamelessly. I have allowed myself to get old and haven’t done one single thing about it, with a gun or anything.

And as part of an exploration of historical trends in English (published writing, anyhow), I had noticed that the sentences in Hemingway's novels are actually a bit on the long side for their time, on average. So I thought it would be amusing to compare his sentence-length distributions to Le Guin's. I wanted to use her "Introducing Myself" essay as a point of comparison, and therefore I chose Hemingway's memoir rather than his novels, since quoted dialog in fiction generally has shorter sentences than expositional writing does. But in response to Bloix's comment, let's compare Hemingway's 1926 novel The Sun Also Rises to Le Guin's 1974 novel The Dispossessed, in terms of the distribution of sentence lengths:

Obviously these distributions involve shorter sentences than those from the previously-graphed memoir and essay collections — and equally obviously, not all novels have the same distribution of sentence lengths, as we can see by adding the quantiles for Jane Austen's Pride and Prejudice and Cormac McCarthy's All the Pretty Horses:
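For readers who want to try this kind of comparison themselves, here is a minimal sketch of how sentence-length quantiles might be computed. It is not the code behind the plots: the regex sentence splitter is a deliberately naive assumption (it will stumble on abbreviations and on quoted dialogue), and the function names `sentence_lengths` and `length_quantiles` are made up for illustration.

```python
import re
from statistics import quantiles


def sentence_lengths(text):
    """Return the number of whitespace-separated words in each sentence.

    Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    A real study would use a proper sentence segmenter.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [len(s.split()) for s in sentences if s.split()]


def length_quantiles(lengths, n=10):
    """Return the n-1 cut points that divide the lengths into n equal groups."""
    return quantiles(lengths, n=n, method="inclusive")


if __name__ == "__main__":
    sample = (
        "He shot himself. A short sentence. "
        "Anything rather than a long sentence, a life sentence."
    )
    lengths = sentence_lengths(sample)
    print(lengths)
    print(length_quantiles(lengths, n=4))
```

Running the same two functions over the full text of each book and plotting the cut points against the quantile gives curves that can be laid side by side, one per work.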

Nor should it be a surprise that parse depths tend to be substantially shallower in each author's novels, compared to their expository writings. Here's a plot for the Hemingway and Le Guin works we've been discussing:

And a similar result for the distribution of clause depths:
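To make "parse depth" and "clause depth" concrete, here is a rough sketch that measures both on a single Penn-Treebank-style bracketed parse. Several things here are assumptions rather than a description of how the plots were made: that the parser emits bracketed strings, that depth counts every open node including part-of-speech nodes, and that the clause labels are S, SBAR, SINV, SQ and SBARQ. The name `depths` is hypothetical.

```python
import re

# Assumption: these Penn Treebank labels are the ones that count as clauses.
CLAUSE_LABELS = {"S", "SBAR", "SINV", "SQ", "SBARQ"}


def depths(parse):
    """Return (max node depth, max clause depth) for one bracketed parse."""
    stack = []  # one bool per currently open node: is it a clause node?
    max_node = max_clause = 0
    # Each match is either an opening bracket with its label or a closing bracket.
    for label, close in re.findall(r"\(\s*([^\s()]*)|(\))", parse):
        if close:
            if stack:
                stack.pop()
        else:
            # Drop function tags such as the "-TPC" in "S-TPC".
            base = label.split("-")[0]
            stack.append(base in CLAUSE_LABELS)
            max_node = max(max_node, len(stack))
            max_clause = max(max_clause, sum(stack))
    return max_node, max_clause


if __name__ == "__main__":
    short = "(S (NP (PRP He)) (VP (VBD shot) (NP (PRP himself))))"
    longer = ("(S (NP (PRP I)) (VP (VBP think) "
              "(SBAR (S (NP (PRP he)) (VP (VBD left))))))")
    print(depths(short))
    print(depths(longer))
```

Collecting these two numbers for every sentence in a work yields the per-sentence values whose distributions can then be compared across works.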


  1. KeithB said,

    December 1, 2022 @ 4:10 pm

    I wonder why "tweeze" is not a verb, though it appears that "tweaser" is.

  2. GeorgeW said,

    December 1, 2022 @ 4:56 pm

    @KeithB: It should be, it's shorter.

  3. Matt Juge said,

    December 1, 2022 @ 6:44 pm

    Tweeze certainly is a verb. The earliest OED example is from 1932 (with tweezer rather than tweezers.

  4. Matt Juge said,

    December 1, 2022 @ 6:45 pm

    Here's the missing parenthesis: ).

  5. S Frankel said,

    December 1, 2022 @ 8:01 pm

  6. Itüpflreiter said,

    December 3, 2022 @ 6:13 am

    @MarkLiberman: Would you mind checking the labels of the axes of the first two diagrams?

    [(myl) The x-axis is labelled "Quantile"; the y-axis is labelled "Sentence Length (words)". Those labels are correct — why do you ask?]

  7. RfP said,

    December 3, 2022 @ 3:15 pm

    People who are interested in the artistic dimension of this might want to read “What We Talk About When We Talk About Flow,” an essay by David Jauss in his book On Writing Fiction (Also published as Alone with All That Could Happen).

    Jauss maintains that:

    Hemingway’s simplicity is far more a matter of diction than of syntax. Like [D.H.] Lawrence, Hemingway knew how to vary sentence structure so his paragraphs flow. If you look at random paragraphs from his work, you’ll notice how the simplicity of his diction exists within the context of complex syntax. The opening paragraph of “A Clean, Well-Lighted Place” is a good example.


    There’s nothing wrong with simplicity, in short, if it’s only apparent, not actual. The best simple writing is, at its deepest level, the level of structure, complex.

    In addition to citing the Virginia Woolf quote from which Ursula K. Le Guin drew the title of The Wave in the Mind, Jauss also mentions the work of Virginia Tufte, whose book Artful Sentences: Syntax as Style “presents — and comments on — more than a thousand excellent sentences chosen from the works of authors in the twentieth and twenty-first centuries. The sentences come from an extensive search to identify some of the ways professional writers use the generous resources of the English language.” (From her son’s website,

  8. JPL said,

    December 3, 2022 @ 8:14 pm


    On going to Wikipedia to check the spelling of Hemingway's given name (a question that did not arise in the earlier post), I came across in the second sentence a reference to the "iceberg theory", a term H. used to refer to the system behind his "economical and understated style". Apparently, whereas authors such as James and Lawrence aimed to describe explicitly the dark undercurrents and the dependency structure of the tacit background of personal purposeful action, Hemingway tried to evoke this implicit background of thought below the surface of conscious awareness, and its context of the core purposes and values of his characters, by omitting any explicit reference to it, focusing instead on the overt acts of the character and its concrete context, and allowing the reader to "fill in the gaps": through the understanding of the character as an (imagined) real person built up in the previous (and following) text, the reader would be allowed to imagine (perhaps also not explicitly express) the implied values, reasons, intentions, etc. that made the character make sense. It's more of a methodology than a "theory", but H. seemed to think that it had a lot to do with his distinctive "style". I had never heard of this, but I thought this was an interesting idea, and it made me want to read a Hemingway book to see just how this was done. I always regard the interpretation of a given sentence, especially a unique and problematic one, as an open-ended problem, with the notion of "what is expressed" by the sentence extending beyond what is related directly to the overt elements of the sentence, such as "core propositional content". So thank you for the reference to Virginia Tufte's book "celebrating the sentence", which I plan to check out.

  9. RfP said,

    December 3, 2022 @ 11:11 pm


    You’re very welcome!

    I think a lot about Hemingway’s iceberg theory (also sometimes referred to as the theory of omission), and I believe the following quote from Ursula K. Le Guin speaks to some of the ways in which this “hidden” content affects what’s “above the water.” Just as painters have their brushstrokes and paints and canvas—and directors have their lighting and composition and film stock (or facsimiles thereof, these days)—writers use their own tools to create feelings and impressions that aren’t necessarily expressed in an overt manner:

    The artist deals with what cannot be said in words.

    The artist whose medium is fiction does this in words. The novelist says in words what cannot be said in words.

    Words can be used thus paradoxically because they have, along with a semiotic usage, a symbolic or metaphoric usage. (They also have a sound—a fact the linguistic positivists take no interest in. A sentence or paragraph is like a chord or harmonic sequence in music: its meaning may be more clearly understood by the attentive ear, even though it is read in silence, than by the attentive intellect.)

    – “Introduction to The Left Hand of Darkness,” in The Language of the Night: Essays on Fantasy and Science Fiction, 158-159.

  10. RfP said,

    December 3, 2022 @ 11:16 pm

    …And I’d like to add a quick word about pacing.

    Will Eisner, the pioneering comics artist, once said:

    …watching film establishes a rhythm of acquisition. It is a direct challenge to static print. Accustomed to the pace of film, readers grow impatient with long text passages because they have become used to acquiring stories, ideas and information quickly and with little effort.

    – Will Eisner, Graphic Storytelling and Visual Narrative, 17
    (First published in 1996 as Graphic Storytelling)

  11. Taylor, Philip said,

    December 4, 2022 @ 7:59 am

    Not all readers — I, for one, whenever using Google to learn how to accomplish a new task, invariably ignore all links to Youtube, preferring instead to read the very "long text passages" that Will Eisner asserts readers now eschew …

  12. RfP said,

    December 4, 2022 @ 1:26 pm

    Thanks for your comment, Philip.

    Perhaps I should have given more context for why I posted the Eisner quote.

    Although he is indeed making a case for the combination of text and images in “static print,” as becomes clear in the rest of the paragraph from which I have drawn this excerpt, I feel one can also infer that this quote provides yet one more reason for authors to make their case with, shall we say, salients rather than by means of a lengthy siege.

  13. Taylor, Philip said,

    December 4, 2022 @ 5:24 pm

    "one more reason for authors to make their case with, shall we say, salients rather than by means of a lengthy siege" — a suggestion which could usefully also be heeded by those making video presentations. Although I prefer to learn by reading prose, I do from time to time also seek to learn by watching video recordings, and I am frequently annoyed by the fact that, rather than focussing on the technique which the author is setting out to teach, he/she allows him/herself to digress into various totally peripheral techniques with which the viewer is almost certainly already quite familiar. For example, in a number of video recordings which set out to teach how to create the appearance of metallic gold text in Adobe Photoshop, the authors waste considerable time telling the viewer how to select the font that he/she wishes to use, how to specify its size, weight and colour, etc. These are basics, a knowledge of which can be safely assumed, whereas the key elements of blending options, bevel & emboss, contour, etc., are relatively advanced techniques to which the viewer's attention should be drawn without prior digression.

  14. Bloix said,

    December 4, 2022 @ 7:09 pm

    I did not say that A Moveable Feast was not likely to have sentence lengths similar to those of Hemingway's style in the 1920s and 30s. I said that Hemingway's style of the 1920s and 30s is easily recognizable and that if the graphs do not show a difference between Hemingway and other authors then the graphs are missing something. (This is not a criticism; it's useful to disprove the conventional wisdom in order to clear the brush in anticipation of further research.) In response you asked what might be missing. I made three suggestions. I also observed that Hemingway's characteristic style is more pronounced in his short stories.

    You now respond to my "complaint" (I thought it was an observation) by providing more graphs that do not show a difference between Hemingway's style in long-form writing and that of other writers. (The graphs do not provide any information about the stories.) I had said that your graphs do not reveal anything notable about his style but perhaps you thought that the point needed further emphasis.

    I don't see any points of disagreement between us. You have shown, and I accept, that in novels and memoirs, Hemingway's sentences are not shorter than those of selected other writers. I have said, and you haven't disagreed, that Hemingway's style is nonetheless recognizable, which implies that it differs from other writers in ways that should be quantifiable, but are not revealed by these graphs.
