Or at least another pattern of its usage.

According to Herbert Clark and Jean Fox Tree, "Using uh and um in spontaneous speaking" (Cognition 2002),

"[S]peakers use uh and um to announce that they are initiating what they expect to be a minor (uh), or major (um), delay in speaking. Speakers can use these announcements in turn to implicate, for example, that they are searching for a word, are deciding what to say next, want to keep the floor, or want to cede the floor."

As they note, the actual patterns of pause and filler durations are somewhat complicated — a larger-scale empirical survey can be found here. Extending many LLOG posts, Wieling et al. (2016) document a historical change in relative UM/UH frequencies across various Germanic languages, associated with sociolinguistic dimensions of gender, age, education, and so on. And there are well-established individual patterns of usage, as well as evidence for conversational accommodation.

But listening to a recent YouTube interview, I noticed a somewhat different pattern. An extremely fluent speaker uses a very brief "uh" as the first syllable in many of his prosodic phrases, following a brief inter-phrase silence, with no post-UH silence. There's no indication that he is "searching for a word, deciding what to say next, wants to keep the floor, or wants to cede the floor", and I noticed no other filled pauses on his side of the interview. So for this speaker, phrase-initial UH seems to have become something of a habit. It's unclear what his UH-or-not choice signals, if anything.

The source is "This Is What A U.S. War With Iran Might Look Like: FDD Expert", Forbes TV 2/18/2026, in which Brittany Lewis interviews Behnam Ben Taleblu.

Here's the audio for Taleblu's first answer:

There are 10 instances of UH, with a mean duration of 150 milliseconds. In comparison, his mean syllable duration is 229 milliseconds. The 10 pre-UH silences have a mean duration of 422 milliseconds. There are 7 inter-phrase silences not followed by UH, with a mean duration of 412 milliseconds. All 17 silent pauses, with or without following UH, occur at significant syntactic boundaries, as would be the case in fluent read speech (and is definitely not usually the case in fluent spontaneous speech).
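To put those figures side by side, here is a minimal Python sketch that uses only the counts and means quoted above. The variable names are mine, and the pooled silence figure is just a count-weighted average of the two reported means, not something measured separately.

```python
# Figures quoted in the paragraph above (durations in milliseconds).
uh_count = 10                 # silences followed by UH
no_uh_count = 7               # inter-phrase silences not followed by UH
uh_mean_ms = 150              # mean UH duration
syllable_mean_ms = 229        # mean syllable duration
pre_uh_silence_ms = 422       # mean silence before UH
no_uh_silence_ms = 412        # mean silence with no following UH

total_pauses = uh_count + no_uh_count

# Share of silent pauses that are followed by UH.
uh_share = uh_count / total_pauses

# How long a typical UH is relative to a typical syllable.
uh_to_syllable = uh_mean_ms / syllable_mean_ms

# Count-weighted mean of the two reported silence means (an approximation).
pooled_silence_ms = (
    uh_count * pre_uh_silence_ms + no_uh_count * no_uh_silence_ms
) / total_pauses

# Difference between the two silence means.
silence_gap_ms = pre_uh_silence_ms - no_uh_silence_ms

print(f"pauses followed by UH: {uh_share:.0%} of {total_pauses}")
print(f"UH / syllable duration: {uh_to_syllable:.2f}")
print(f"pooled silence mean: {pooled_silence_ms:.0f} ms")
print(f"pre-UH minus no-UH silence: {silence_gap_ms} ms")
```

On these numbers, UH follows a bit more than half of the silent pauses, a typical UH is about two-thirds the length of a typical syllable, and the two kinds of silence differ by only 10 milliseconds on average.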

For a partial comparison, here's a table showing the distributions of UM and UH relative to preceding and following silences, from the Switchboard corpus (taken from this 2014 LLOG post):

Context                  Count    % of all
all UM                   21076
  SILENCE UM SILENCE      8251    39%
  SPEECH  UM SILENCE      7358    35%
  SILENCE UM SPEECH       2938    14%
  SPEECH  UM SPEECH       2521    12%
all UH                   68991
  SILENCE UH SILENCE      9231    13%
  SPEECH  UH SILENCE     25150    36%
  SILENCE UH SPEECH      12681    18%
  SPEECH  UH SPEECH      21696    31%
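For readers who want to check the percentages, here is a small Python sketch that recomputes them from the Switchboard counts above. The dictionary layout and the function name `context_shares` are my own, not anything from the 2014 post, and the shares are taken against the listed totals, which is what the table appears to do.

```python
# Counts copied from the Switchboard table above.
COUNTS = {
    "UM": {
        "total": 21076,
        "contexts": {
            "SILENCE UM SILENCE": 8251,
            "SPEECH UM SILENCE": 7358,
            "SILENCE UM SPEECH": 2938,
            "SPEECH UM SPEECH": 2521,
        },
    },
    "UH": {
        "total": 68991,
        "contexts": {
            "SILENCE UH SILENCE": 9231,
            "SPEECH UH SILENCE": 25150,
            "SILENCE UH SPEECH": 12681,
            "SPEECH UH SPEECH": 21696,
        },
    },
}


def context_shares(total, contexts):
    """Return each context's share of the listed total, as a whole percent."""
    return {name: round(100 * n / total) for name, n in contexts.items()}


SHARES = {
    filler: context_shares(data["total"], data["contexts"])
    for filler, data in COUNTS.items()
}

for filler, shares in SHARES.items():
    print(f"all {filler}: {COUNTS[filler]['total']}")
    for name, pct in shares.items():
        print(f"  {name}: {pct}%")
    # The four context counts do not quite add up to the listed total.
    listed = COUNTS[filler]["total"]
    summed = sum(COUNTS[filler]["contexts"].values())
    print(f"  contexts sum to {summed}, {listed - summed} short of the total")
```

The row most relevant here is SILENCE UH SPEECH, the pattern Taleblu uses throughout: it accounts for 18% of Switchboard UH tokens.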

The same post has plots and parameters for the various durations involved.

Obviously there's individual as well as group variation in all dimensions of filled pause production, as this (apparently extreme) case illustrates. Despite many relevant publications, it remains unclear to what extent those individual differences correlate with personality, identity projection, etc., in production, or how they're modulated by context. And it's also unclear how consistently (or accurately) the many dimensions of filled pause production are interpreted by listeners.

For some comic relief, and a long list of related posts, see "The meaning of filled pauses?", 2/5/2022.

