Rhetoric as music

From Jon Stewart's 1997 interview with George Carlin (starting at about 1:17.6):

well- well uh to- to go backward with the question,
don't forget, what we do is oratory.
It's rhetoric.
It's not just comedy, it's a form of rhetoric
and- and with rhetoric, you- you look and you listen for rhythms,
you- you look for ways
to sing at the same time you're talking, and to go
[scat-like phrases, based on rhythmic patterns of /d/-initial syllables…]

The English orthographic system doesn't offer a very good way to transcribe such non-syllable patterns. Here's the audio:

YouTube's transcription system is barely aware that there's an issue, sadly mis-rendering it this way:

I don't better doom did oh don't
don't don't
and don't don't don't don't don't don't
better doo doo doo doo doo

We could try using fake-orthographic pronunciation methods, as American English dictionaries generally still do. Adding capitalization for word stress, we might get something like this:

ahduh DOON dahduh DOON duhduh DOON DOON DOON,
and DOON duh DOON duh DOON DOON.
duh duh DOO, duh duh DOO,
duh duh duh DOO,
duh duh duh DOO.

Or we could use an IPA-ish rendering of (some theory of) English surface-phonological forms. I plan to set that as a homework problem in ling0001.

Meanwhile, there's another interesting aspect of the regular-English part of this short excerpt. My transcription has 56 "words", of which 5 are what I've called "repetition disfluencies" — and which I've argued should better be called (a kind of) "interpolations", since they're ubiquitous in fluent spontaneous speech, as discussed in Hong Zhang's 2020 thesis.

It should not be surprising that almost 10% of George Carlin's "words" are fluent initial repetitions of this kind — as I said, these events are ubiquitous in spontaneous speech, though they're essentially never found in fluent reading. (Which is probably why they've so rarely been studied, since linguists mostly study read speech when they study speech at all…)
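For anyone who wants to check the arithmetic, here is a minimal sketch of the count. It assumes that the transcript above marks each initial repetition with a trailing hyphen on the first copy (as in "well- well"), and that "words" are whitespace-separated tokens; the function names are just illustrative, not part of any existing tool.

```python
# Regular-English part of the excerpt, copied from the transcript above.
TRANSCRIPT = """\
well- well uh to- to go backward with the question,
don't forget, what we do is oratory.
It's rhetoric.
It's not just comedy, it's a form of rhetoric
and- and with rhetoric, you- you look and you listen for rhythms,
you- you look for ways
to sing at the same time you're talking, and to go
"""


def tokenize(text):
    """Split on whitespace, lowercase, and strip commas/periods.

    Trailing hyphens are kept, since (by assumption) they mark
    the first copy of a repeated word.
    """
    tokens = []
    for raw in text.split():
        tok = raw.strip(",.").lower()
        if tok:
            tokens.append(tok)
    return tokens


def count_initial_repetitions(tokens):
    """Count hyphen-final tokens whose next token repeats the same word."""
    count = 0
    for cur, nxt in zip(tokens, tokens[1:]):
        if cur.endswith("-") and cur[:-1] == nxt:
            count += 1
    return count


tokens = tokenize(TRANSCRIPT)
n_rep = count_initial_repetitions(tokens)
print(len(tokens), n_rep, f"{100 * n_rep / len(tokens):.1f}%")
# prints: 56 5 8.9%
```

So 5 of 56 comes to roughly 8.9%, which is what "almost 10%" is rounding from. A different tokenization (say, splitting contractions) would shift the denominator a little.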

And he shows the same normal spontaneous-speech patterns in other interviews, for example in this 2004 conversation with Terry Gross on Fresh Air.

But if we look at George Carlin's stand-up comedy, "interpolations" (or whatever we choose to call them) are absent, at least in the examples I've skimmed. For example:

Presumably this means that he's performing prepared and memorized material, which makes it like reading — though I also have the impression that his different performances of the same routine are not transcriptionally identical.

And other comedians don't all show the same lack of disfluencies/interpolations in their performances.


  1. Ben Zimmer said,

    September 17, 2023 @ 11:07 am

    "Presumably this means that he's performing prepared and memorized material, which makes it like reading…"

    Carlin was noted for using memorization to prepare his stage act, even for very long bits. Judd Apatow and Jon Stewart talk about his process at ~21:00 here.

  2. Gregory Kusnick said,

    September 17, 2023 @ 1:33 pm

    I don't better doom did oh don't

    Channeling James Joyce?

  3. AntC said,

    September 17, 2023 @ 7:53 pm

    I wonder if the choice (in English) of active vs passive structure, or relative positioning of prepositional phrases also has an explanation in rhythm: more short, unstressed words gives the chance to 'patter' then hit the semantic payload at the cadence. duh duh duh DOON DOON.

  4. Jerry Packard said,

    September 18, 2023 @ 8:12 am

    In my favorite Carlin bit – the one on ‘time’ – he doesn’t drop a beat.

  5. DaveK said,

    September 18, 2023 @ 9:46 am

    It’s always struck me that a skillful fiction writer will need to work and re-work a passage of dialogue to make it sound realistic: in other words, to make it sound like what someone with no particular knack for language might utter without thinking about it.

  6. Ross Presser said,

    September 18, 2023 @ 10:46 am

    In the recent past I have listened "with one ear" to a CNN broadcast with Wolf Blitzer, while I was also reading from my phone. And Blitzer has some really obvious recurrent "melodies" in his speech; sentences have a predictable rise and fall that is sort of unique to Blitzer and can be heard over and over, with occasional variations. Most of the other CNN correspondents that the channel cuts over to also have predictable patterns. It's probably endemic to journalists.

    It also brought my attention to a trick they use: on-air journalists really dislike interruptions from whomever they are interviewing. If they have to bring a sentence to an end, they do it on a quick downward melodic phrase followed very quickly by the next sentence which starts the rise again.

