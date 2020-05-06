

There are dozens of articles Out There on "Zoom fatigue", with a wide range of ideas about causes and cures.

Gianpiero Petriglieri offered the BBC a couple of hypotheses about why "Zoom calls drain your energy":

Being on a video call requires more focus than a face-to-face chat, says Petriglieri. Video chats mean we need to work harder to process non-verbal cues like facial expressions, the tone and pitch of the voice, and body language; paying more attention to these consumes a lot of energy. "Our minds are together when our bodies feel we're not. That dissonance, which causes people to have conflicting feelings, is exhausting. You cannot relax into the conversation naturally," he says.

Silence is another challenge, he adds. "Silence creates a natural rhythm in a real-life conversation. However, when it happens in a video call, you became anxious about the technology." It also makes people uncomfortable. One 2014 study by German academics showed that delays on phone or conferencing systems shaped our views of people negatively: even delays of 1.2 seconds made people perceive the responder as less friendly or focused.

Marissa Shuffler suggested that

if we are physically on camera, we are very aware of being watched. "When you're on a video conference, you know everybody's looking at you; you are on stage, so there comes the social pressure and feeling like you need to perform. Being performative is nerve-wracking and more stressful." It's also very hard for people not to look at their own face if they can see it on screen, or not to be conscious of how they behave in front of the camera.

There are lots of other hypotheses Out There as well — we need to work harder to process non-verbal cues; we feel anxious about our remote workspace; there aren't any "water-cooler catch-ups"; virtual meetings blur the work/life balance; it's easier to get distracted and peek at email or phone messages; different meetings are in different cultural contexts, but you're always in the same physical place; and so on.

The most interesting (or at least unexpected) idea, in my opinion, is that it's a sort of intimacy overdose. From Beckie Supiano, "Why is Zoom So Exhausting?", The Chronicle of Higher Education 4/23/2020:

Using Zoom — at least with the standard settings — means looking right into other people's faces at close range, says Jeremy Bailenson, a professor of communication at Stanford University and founding director of its Virtual Human Interaction Lab. That isn't what people do in a classroom, or a meeting, or most social situations. "In the real world," Bailenson says, "when someone gets that close up, we get aroused. There's probably some type of a conflict situation, from an evolutionary standpoint — or we're going to be intimate with them."

Professors, Bailenson says, can mitigate this dynamic by playing around with their settings, perhaps using an external camera or a second monitor. Or they might consider a different kind of platform, says Bailenson, who has provided consulting to or received academic grant funding from many of the big players in virtual-reality technology.

"The most important thing I can say," he says, "is think really hard: Does this conversation get augmented by having everyone see one another's faces?"

But the one aspect of this that has actually been studied, as far as I know, is the effect of coding and transmission latency. I first heard about this issue when I worked at AT&T Bell Laboratories after 1975, from the folks working on digital speech coding and transmission. Thus J. W. Forgie, "Speech transmission in packet switched store and forward networks", Proc. NCC 1975:

Although delay has no effect on the intelligibility or naturalness of a speech signal, when it is introduced into a conversational situation, it becomes readily detectable and can have disruptive effects on the conversation. With the anticipated use of stationary satellites for speech communication, experiments were undertaken to evaluate the effects of delays of the order of 0.6 seconds which would be expected in the round-trip time to such satellites. The results showed that the effects of delays of this amount or more were largely of a psychological nature. Telephone conversations normally involve frequent interaction between the participants even though one person may be doing most of the talking for an extended period. When the reinforcing feedback of an expected "yes", "really?", or whatever is delayed, the talker gets the feeling that the other party is not paying proper attention, and he tends to become irritated. Similarly, when the other party tries to interrupt the speaker, he becomes annoyed because the speaker appears to be ignoring his attempt to interrupt.
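For a rough sense of where a figure "of the order of 0.6 seconds" comes from, here is a back-of-the-envelope sketch of the propagation delay alone. The altitude and speed-of-light constants are standard physical values rather than anything from Forgie's paper, and the function names are made up for illustration. The calculation assumes both parties sit directly beneath the satellite, so it gives a lower bound; longer slant paths, terrestrial links, and processing would add to it.

```python
# Lower bound on the conversational round-trip delay via a geostationary
# satellite, counting only radio propagation time. Assumption: both parties
# are directly below the satellite, which is the shortest possible path.

SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum
GEO_ALTITUDE_KM = 35_786.0          # geostationary altitude above the equator


def one_way_delay_s(altitude_km=GEO_ALTITUDE_KM):
    """Time for speech to go ground -> satellite -> ground (one direction)."""
    return 2 * altitude_km / SPEED_OF_LIGHT_KM_S


def round_trip_delay_s(altitude_km=GEO_ALTITUDE_KM):
    """Time from saying something to hearing the other party's reaction."""
    return 2 * one_way_delay_s(altitude_km)


if __name__ == "__main__":
    print(f"one way:    {one_way_delay_s():.3f} s")
    print(f"round trip: {round_trip_delay_s():.3f} s")
```

Propagation alone accounts for nearly half a second of the round trip, which is why an expected "yes" or "really?" arrives noticeably late even on a perfect link.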

Packet-switched networks (like the internet) create the additional problem of glitches — with a trade-off between glitch probability and longer lag times:

The variability of the delay in a store-and-forward network also poses problems for speech communication. For example, the transmitter may chop the input speech into chunks of equal length and give the corresponding messages to the net at equal time intervals. When they arrive at the receiver, the time between messages is no longer likely to be uniform but will generally exhibit considerable variation, and the receiver must take appropriate action to compensate for this jitter. […]

When the receiver is reconstituting the speech from the message stream, and a message has been abnormally delayed in the net, a point may be reached where all the available messages have been used up. If this point corresponds to a pause in the input speech, all will be well. Otherwise a gap or "glitch" will be introduced into the output speech. […]

Unlike delay, which has a primarily psychological effect on a conversation, glitches can affect the intelligibility of the output speech. […]

In order to keep the glitch probability low, the receiver will have to introduce some additional delay in the speech stream to smooth the jitter in message arrival times. The magnitude of the smoothing delay required to achieve a glitch probability less than some given value will depend on the dispersion of network transit times.
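The trade-off Forgie describes can be sketched with a toy simulation. Everything here is an illustrative assumption rather than his model: transit times are drawn as a fixed base delay plus exponentially distributed jitter, the receiver is assumed to know the minimum transit time, and the function names and parameter values are invented for the example. A message counts as a glitch if it arrives after its scheduled playout time, that is, if its transit time exceeds the minimum plus the smoothing delay.

```python
import random


def simulate_transit_times(n, base_ms=50.0, mean_jitter_ms=20.0, seed=0):
    """Hypothetical network: fixed base delay plus exponential jitter."""
    rng = random.Random(seed)
    return [base_ms + rng.expovariate(1.0 / mean_jitter_ms) for _ in range(n)]


def glitch_probability(smoothing_delay_ms, transit_times_ms):
    """Fraction of messages that arrive too late to be played on schedule.

    Playout for each message is scheduled at (send time + minimum transit
    time + smoothing delay), so a message is late when its own transit time
    exceeds that deadline.
    """
    deadline = min(transit_times_ms) + smoothing_delay_ms
    late = sum(1 for t in transit_times_ms if t > deadline)
    return late / len(transit_times_ms)


def smallest_delay_for(target, transit_times_ms, step_ms=5.0):
    """Smallest smoothing delay (in steps of step_ms) meeting the target."""
    delay = 0.0
    while glitch_probability(delay, transit_times_ms) > target:
        delay += step_ms
    return delay


if __name__ == "__main__":
    transit = simulate_transit_times(10_000)
    for delay in (0, 20, 40, 80, 160):
        p = glitch_probability(delay, transit)
        print(f"smoothing delay {delay:3d} ms -> glitch probability {p:.3f}")
    print("delay needed for <= 1% glitches:",
          smallest_delay_for(0.01, transit), "ms")
```

Under these assumptions, widening the spread of transit times (a larger mean_jitter_ms) increases the smoothing delay needed for the same glitch probability, which is the dependence on "the dispersion of network transit times" that the passage points to.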

In the early 90s, Amit Shah et al., in "Multimedia over FDDI" (Proceedings 17th Conference on Local Computer Networks, 1992), "decided that the maximum end-to-end tolerable latency was 100 ms", summarizing their interpretation of the situation in this table:



Anyone who's taken part in distributed virtual meetings knows that delays and glitches, whatever their actual distribution, are well beyond "annoying".

See also Brid O'Conaill et al., "Conversations over video conferences: An evaluation of the spoken aspects of video-mediated communication." Human-computer interaction (1993) — and the intimidatingly many other papers linked at interruptions.net.

