Language Log

AI-assisted substitute vocal cords

March 21, 2024 @ 11:58 pm · Filed by Victor Mair under Acoustics, Artificial intelligence, Biology of language, Language and medicine, Language and technology

This is what the device looks like and how it is made:

Jun Chen Lab/UCLA
The two components — and five layers — of the device allow it to turn muscle
movement into electrical signals which, with the help of machine learning,
are ultimately converted into speech signals and audible vocal expression.

"Speaking without vocal cords, thanks to a new AI-assisted wearable device"

The adhesive neck patch is the latest advance by UCLA bioengineers in speech technology for people with disabilities

Christine Wei-li Lee, UCLA Newsroom (March 14, 2024)

Key takeaways

- Bioengineers at UCLA have invented a thin, flexible device that adheres to the neck and translates the muscle movements of the larynx into audible speech.
- The device is trained through machine learning to recognize which muscle movements correspond to which words.
- The self-powered technology could serve as a non-invasive tool for people who have lost the ability to speak due to vocal cord problems.

I recall once seeing people who had lost the function of their vocal cords hold to their throat a battery-powered device that made a buzzing sound (like a Jew's harp) while shaping their mouth to produce what resembled words.

People with voice disorders, including those with pathological vocal cord conditions or who are recovering from laryngeal cancer surgeries, can often find it difficult or impossible to speak. That may soon change.

A team of UCLA engineers has invented a soft, thin, stretchy device measuring just over 1 square inch that can be attached to the skin outside the throat to help people with dysfunctional vocal cords regain their voice function. Their advance is detailed this week in the journal Nature Communications.

The new bioelectric system, developed by Jun Chen, an assistant professor of bioengineering at the UCLA Samueli School of Engineering, and his colleagues, is able to detect movement in a person’s larynx muscles and translate those signals into audible speech with the assistance of machine-learning technology — with nearly 95% accuracy.

The breakthrough is the latest in Chen’s efforts to help those with disabilities. His team previously developed a wearable glove capable of translating American Sign Language into English speech in real time to help users of ASL communicate with those who don’t know how to sign.

The tiny new patch-like device is made up of two components. One, a self-powered sensing component, detects and converts signals generated by muscle movements into high-fidelity, analyzable electrical signals; these electrical signals are then translated into speech signals using a machine-learning algorithm. The other, an actuation component, turns those speech signals into the desired voice expression.
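The sense, translate, actuate flow described above can be illustrated with a toy example. This is only a sketch under stated assumptions: the feature vectors, the word list, and the helper names (`train`, `classify`, `actuate`) are invented for illustration, and a nearest-centroid classifier stands in for whatever machine-learning model the UCLA team actually used, which the press release does not specify.

```python
import math

# Hypothetical training data: each word maps to a few feature vectors
# (e.g. summary statistics of a sensor trace). These numbers are invented
# for illustration; they are not data from the paper.
TRAINING = {
    "hello": [[0.9, 0.1], [1.0, 0.2]],
    "thanks": [[0.1, 0.8], [0.2, 1.0]],
}


def centroid(vectors):
    """Return the component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def train(examples):
    """'Training' here just stores one centroid per word."""
    return {word: centroid(vecs) for word, vecs in examples.items()}


def classify(model, features):
    """Pick the word whose centroid is closest (Euclidean) to the features."""
    return min(model, key=lambda word: math.dist(model[word], features))


def actuate(word):
    """Stand-in for the actuation component: describe the sound to play."""
    return f"<play audio for '{word}'>"


model = train(TRAINING)
word = classify(model, [0.85, 0.15])  # a new, unseen sensor reading
print(actuate(word))
```

Note that the classifier can only choose among words it was trained on, so a system built this way recognizes a fixed vocabulary rather than open-ended speech.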

The two components each contain two layers: a layer of biocompatible silicone compound polydimethylsiloxane, or PDMS, with elastic properties, and a magnetic induction layer made of copper induction coils. Sandwiched between the two components is a fifth layer containing PDMS mixed with micromagnets, which generates a magnetic field.

Utilizing a soft magnetoelastic sensing mechanism developed by Chen’s team in 2021, the device is capable of detecting changes in the magnetic field when it is altered as a result of mechanical forces — in this case, the movement of laryngeal muscles. The embedded serpentine induction coils in the magnetoelastic layers help generate high-fidelity electrical signals for sensing purposes.
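The sensing principle here is electromagnetic induction: a changing magnetic flux through a coil induces a voltage (Faraday's law, EMF = -N dΦ/dt). The sketch below shows the idea numerically. Every number in it (turn count, frequency, flux swing, sampling step) is an illustrative assumption, not a measurement from the device, and the sinusoidal flux is a simplification of real muscle movement.

```python
import math


def induced_emf(flux, turns, dt):
    """Faraday's law, EMF = -N * dPhi/dt, via forward finite differences."""
    return [-turns * (flux[i + 1] - flux[i]) / dt for i in range(len(flux) - 1)]


# All values below are hypothetical, chosen only to make the example concrete.
TURNS = 20          # assumed number of coil turns
FREQ_HZ = 100.0     # assumed frequency of the mechanical deformation
DELTA_FLUX = 1e-7   # assumed flux swing through the coil, in webers
DT = 1e-5           # sampling step, in seconds

# 2000 samples at 10 microseconds each cover 20 ms, i.e. two full cycles.
times = [i * DT for i in range(2000)]
flux = [DELTA_FLUX * math.sin(2 * math.pi * FREQ_HZ * t) for t in times]
emf = induced_emf(flux, TURNS, DT)

# For sinusoidal flux the peak EMF has a closed form: N * dPhi * 2*pi*f.
peak_numeric = max(abs(v) for v in emf)
peak_analytic = TURNS * DELTA_FLUX * 2 * math.pi * FREQ_HZ
print(f"peak EMF: {peak_numeric:.3e} V (analytic {peak_analytic:.3e} V)")
```

The point of the sketch is that the signal comes from the motion itself: no external power is needed to generate the voltage, only to process it and drive a sound source.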

Measuring 1.2 inches on each side, the device weighs about 7 grams and is just 0.06 inch thick. With double-sided biocompatible tape, it can easily adhere to an individual’s throat near the location of the vocal cords and can be reused by reapplying tape as needed.

Voice disorders are prevalent across all ages and demographic groups; research has shown that nearly 30% of people will experience at least one such disorder in their lifetime. Yet with therapeutic approaches, such as surgical interventions and voice therapy, voice recovery can stretch from three months to a year, with some invasive techniques requiring a significant period of mandatory postoperative voice rest.

“Existing solutions such as handheld electro-larynx devices and tracheoesophageal-puncture procedures can be inconvenient, invasive or uncomfortable,” said Chen, who leads the Wearable Bioelectronics Research Group at UCLA and has been named one of the world’s most highly cited researchers five years in a row. “This new device presents a wearable, non-invasive option capable of assisting patients in communicating during the period before treatment and during the post-treatment recovery period for voice disorders.”

If it proves practicable, this revolutionary new device is sure to be a great boon for individuals who have speech production difficulties.

Selected readings

"Nasality" (8/18/23)
"Allergese" (4/30/15)

5 Comments

Jarek Weckwerth said,

March 22, 2024 @ 2:40 am

It's quite interesting how those press releases are prepared. Are they ever (proof)read by the original authors? As a phonetician, I still don't know how the thing operates. In particular, the release says nothing about how the "speech signals" are produced. (Not to mention the fact that the expression "speech signals" is so vague as to be 100% unhelpful.) If it's by substituting the physical output of the vocal folds, then some sort of "buzzer" or another sound source within the vocal tract is necessary anyway. But I would imagine the receiver device performs your usual modern Text-to-Speech, or even controls your phone to achieve that. Nothing about this in the release. I'll have to check out the original paper.

Jarek Weckwerth said,

March 22, 2024 @ 3:02 am

OK, after a very quick (!) read of the original, it looks like the device itself contains a speaker-like component that actually produces speech-like sound. On the one hand, very impressive. On the other, the original is very very terse on the linguistic side (to the extent of using words such as "semantical"; not at all impressed by this!); the paper is mainly materials science, talking about the properties of the wearable patch. The main weakness of a system like this is that it can't capture the actual articulatory movements in the mouth. That's what the machine learning part is for: to guess (!) the intended words from the laryngeal activity only (which has always been claimed to be insufficient for this). It will be interesting to see where this goes. But I would think putting the "acoustic actuator" within the lower vocal tract would achieve more robust results.

Benjamin Ernest Orsatti said,

March 22, 2024 @ 8:00 am

I share Mr. Weckwerth's concern about "burying the lede". I read both the source article and the paper (https://www.nature.com/articles/s41467-024-45915-7) and, even with a Penn education (!), I still find myself flummoxed on two issues:

(1) How is this not a perpetual motion machine? What's runnin' the thing? Does the 2nd Law of Thermodynamics have an Enforcement Officer I can call to report this violation?

(2) Is the "speaker" the outer two layers? How in the world can that produce audible sound, especially given the concomitant need for an energy source? Last time I looked into how speakers work (c. 1990), you needed some kind of vibrating membrane or whatnot.

Gregory Kusnick said,

March 22, 2024 @ 9:21 am

Benjamin: Moving magnets in proximity to coiled wires is sufficient to produce detectable electrical signals; no perpetual motion there. If those signals are to be analyzed by "a machine-learning algorithm", then presumably there's a cable connecting the patch to a wearable computing device, which presumably also provides power to the speaker portion of the patch.

Drew Pidkameny said,

March 25, 2024 @ 3:06 pm

How's that sign language glove going?

To be clear, the glove is at best a joke and at worst an insult to anyone who actually uses sign language, and does not inspire confidence in any of Chen's efforts.
