Alexa disguises her name?


"Alexa Loses Her Voice" won USA Today's Super Bowl Ad Meter:

I believe that this was also the first Super Bowl ad to raise a technical question about speech technology.

Brad Stone, "Here’s Why Alexa Won’t Light Up During Amazon Super Bowl Ad", Bloomberg 2/2/2018:

Amazon.com Inc. is advertising its Alexa-powered speakers in the big game on Sunday. It’s an amusing 90 seconds that features celebrities like Gordon Ramsay, Rebel Wilson, Anthony Hopkins, Cardi B and the world’s wealthiest man, Jeff Bezos himself.

The word “Alexa” is uttered 10 times during the Super Bowl spot, but thankfully, the Amazon Echo in your living room isn’t going to perk up and try to respond. An Amazon spokeswoman is guarded about explaining exactly why, saying only, “We do alter our Alexa advertisements … to minimize Echo devices falsely responding in customer’s homes.”

Bezos and company have evidently been thinking about this problem for a long time, before the Echo was even introduced. A September 2014 Amazon patent titled “Audible command filtering” describes techniques to prevent Alexa from waking up “as part of a broadcast watched by a large population (such as during a popular sporting event),” annoying customers and overloading Amazon’s servers with millions of simultaneous requests.

You can read the patent for yourself, but a more accessible source is a 2/2/2018 post on Amazon's dayone blog, "Teaching Alexa when not to respond". It describes, in somewhat vague terms, an "acoustic fingerprinting" technique that sends out "do not respond to this" notices in advance of planned events like the Super Bowl ad, and responds quickly to unplanned media instances of the wake-up signal:

Our advertising, engineering, and science teams are able to anticipate major events like the Super Bowl, but what happens when someone like Tonight Show host Jimmy Fallon does a comedy routine about Alexa, which the team couldn’t anticipate?

Manoj Sindhwani, director for Speech Recognition, explains that our teams build acoustic fingerprints on-the-fly within our AWS cloud. When multiple devices start waking up simultaneously from a broadcast event, similar audio is streaming to Alexa’s cloud services. An algorithm within Amazon’s cloud detects matching audio from distinct devices and prevents additional devices from responding. The dynamic fingerprinting isn’t perfect, but as many as 80 to 90 percent of devices won’t respond to these broadcasts thanks to the dynamic creation of the fingerprints.
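The blog post gives no implementation details, so the following is only a toy sketch of the general idea: if the same fingerprint arrives from several distinct devices within a short window, later devices are told not to respond. The names `fingerprint` and `BroadcastDetector`, the energy-based fingerprint, the exact-match lookup, and the thresholds are all assumptions made for illustration, not Amazon's method.

```python
from collections import defaultdict


def fingerprint(samples, frame=4):
    """Hypothetical fingerprint: one bit per frame boundary, set when
    the energy rises from one frame to the next. Insensitive to volume."""
    energies = [
        sum(x * x for x in samples[i:i + frame])
        for i in range(0, len(samples) - frame + 1, frame)
    ]
    return tuple(int(b > a) for a, b in zip(energies, energies[1:]))


class BroadcastDetector:
    """Toy cloud-side check: suppress a wake-up once the same fingerprint
    has been seen from `min_devices` distinct devices within `window_s`."""

    def __init__(self, min_devices=3, window_s=2.0):
        self.min_devices = min_devices
        self.window_s = window_s
        self.seen = defaultdict(dict)  # fingerprint -> {device_id: time}

    def should_respond(self, device_id, fp, now):
        recent = self.seen[fp]
        # Forget reports that are older than the time window.
        for dev in [d for d, t in recent.items() if now - t > self.window_s]:
            del recent[dev]
        recent[device_id] = now
        # The first few devices still respond; later ones are suppressed.
        return len(recent) < self.min_devices
```

In this sketch the earliest devices to hear a broadcast still wake up, which is at least consistent with the claim that only "as many as 80 to 90 percent" of devices are suppressed. A real system would need approximate rather than exact matching, since no two rooms deliver identical audio.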

The Bloomberg piece also describes a second technique:

About a year ago, a Reddit user calling himself Asphyhackr did a little more legwork and concluded that Amazon was creatively employing this second technique. By running Alexa commercials through digital audio editing software, Asphyhackr discovered that Alexa ads transmit weakened levels of sound in an upper portion of the audio spectrum, between 3,000 and 6,000 hertz, outside the most sensitive range of human hearing.

Asphyhackr speculated that Amazon could be tipping Alexa off to ignore certain commands if it detects artificial gaps or bumps in the spectrum. To test his theory, he recorded someone saying “Alexa” and used a so-called band-stop filter that reduced frequencies just in this high region of the spectrum. When he played back the recording, “My echo would not wake, even sitting right next to the speakers!” he wrote.
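For readers who want to try something similar, here is a minimal sketch of a band-stop filter in plain Python. It is not the Reddit user's procedure, which used audio editing software: it simply zeroes every frequency bin between 3,000 and 6,000 Hz, a harsher cut than the "reduced" levels described. The function names and the slow direct DFT are illustrative assumptions, suitable only for short clips.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform (O(n^2), fine for short clips)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)).real / n
        for t in range(n)
    ]


def band_stop(samples, rate, lo=3000.0, hi=6000.0):
    """Remove all energy between lo and hi Hz (brick-wall band-stop)."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bins above n/2 are the mirror-image negative frequencies.
        freq = min(k, n - k) * rate / n
        if lo <= freq <= hi:
            spectrum[k] = 0
    return idft(spectrum)
```

Applied to a mix of a 1,000 Hz tone and a 4,000 Hz tone sampled at 16,000 Hz, this leaves only the 1,000 Hz tone. Whether an Echo would ignore audio filtered this way is exactly what the Reddit experiment claimed to test; the sketch says nothing about that.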

Whatever the situation "about a year ago", the Amazon blog post suggests that this is not the technique used in the Super Bowl commercial, and a quick spectral analysis of the 10 instances of "Alexa" in that ad agrees.
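A check of this kind can be sketched as follows: measure what fraction of a clip's energy falls between 3,000 and 6,000 Hz, and compare an "Alexa" from the ad with an ordinary recording. This is an illustration under stated assumptions, not the analysis actually run for this post; the function names are hypothetical and the Goertzel algorithm is just one convenient way to get per-frequency power without extra libraries.

```python
import math


def goertzel_power(samples, rate, freq):
    """Squared DFT magnitude at the bin nearest `freq` (Goertzel algorithm)."""
    n = len(samples)
    k = round(freq * n / rate)
    coeff = 2 * math.cos(2 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def band_energy_fraction(samples, rate, lo=3000.0, hi=6000.0):
    """Share of the clip's energy (below rate/2) lying between lo and hi Hz."""
    n = len(samples)
    step = rate / n  # spacing between frequency bins, in Hz
    total = band = 0.0
    for k in range(1, n // 2):
        freq = k * step
        power = goertzel_power(samples, rate, freq)
        total += power
        if lo <= freq <= hi:
            band += power
    return band / total if total else 0.0
```

A clip with a deliberate notch would show a markedly lower fraction than comparable unprocessed speech. What counts as "markedly" depends on the speaker and the recording, so any threshold would have to be chosen empirically.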

Update — And then there's Southern Alexa, no doubt the first of many regional and ethno-cultural variants :-):

 



9 Comments

  1. Frans said,

    February 5, 2018 @ 10:48 am

    Apparently, "this video is not available in your country." Amazon's marketing department might be missing a few screws in their promotional acumen.

  2. Ben said,

    February 5, 2018 @ 11:53 am

    What struck me was Bronx native Cardi B's pronunciation of "awxygen." Where did that pseudo-New England vowel come from?

  3. Anna said,

    February 5, 2018 @ 1:38 pm

    This video is not available in my country either, Frans. I don't understand these restrictions because it's available to all and sundry on youtube:

    https://www.youtube.com/watch?v=ksnvi6c9sAk

  4. Mary Kuhner said,

    February 5, 2018 @ 6:51 pm

    "[O]ur teams build acoustic fingerprints on-the-fly within our AWS cloud. When multiple devices start waking up simultaneously from a broadcast event, similar audio is streaming to Alexa’s cloud services. An algorithm within Amazon’s cloud detects matching audio from distinct devices and prevents additional devices from responding."

    In other words, you are bugging your own environs by owning an Alexa-enabled device: any time anyone mentions Alexa, the sounds around you are now being analyzed in the cloud. Of course this could be put to various purposes other than avoiding overreacting to ads. Also of course, you could turn off the "mentions Alexa" part of the routine and have it bug all the time, though you would probably overload your cloud servers if you did this on a large scale.

    I think I'll pass on this technology.

    [(myl) In fairness to Amazon and Alexa, there's another interpretation that should actually be more efficient from their point of view. They monitor a large number of broadcast, cable, and other media feeds, along with the time stamps of Alexa wake-ups; and when they detect a statistically unexpected spike in Alexa responses, they look in the media feeds for the stimulus behind the response. If they find it, they use its signature to vaccinate Alexa units against future responses to that particular stimulus.

    I'm not clear how this can happen fast enough so that "as many as 80 to 90 percent of devices won’t respond to these broadcasts thanks to the dynamic creation of the fingerprints" — but listening to the content of the devices in everyone's homes doesn't help with that problem.]

  5. boiko said,

    February 6, 2018 @ 12:08 am

    @Mary Kuhner: https://xkcd.com/1807/

  6. Graeme said,

    February 6, 2018 @ 5:50 am

    Aside from assisting the disabled or infirm, what is the point of such gadgets? To render McMansion dwellers slowly infirm through lack of movement around the house?

  7. GH said,

    February 6, 2018 @ 11:35 am

    @Mary Kuhner:

    Also of course, you could turn off the "mentions Alexa" part of the routine and have it bug all the time

    Actually, you couldn't (with the current hardware). The way these assistants are built, when in "sleep mode" they can only listen for hardcoded "wake words" ("Google", "Alexa").

    Could they listen all the time? Sure, in theory. So could your phone.

  8. DWalker07 said,

    February 8, 2018 @ 3:15 pm

    "Aside from assisting the disabled or infirm, what is the point of such gadgets? To render McMansion dwellers slowly infirm through lack of movement around the house?"

    You mean "what's the point of having Alexa"? I don't think it tends to make people infirm. My sister uses it when she's walking around inside her house, perhaps about to go out, and she'll ask it what the weather forecast is for later today.

    Or she'll ask it to add an item to a grocery list. Or to wake her up in an hour if she's napping. Or to play some music when friends are over. A friend of mine asked it what a celebrity's age was. He could have typed that into a Web search, but he asked Alexa instead. Neither method of finding this out would contribute more, or less, to physical infirmity.

    This is not meant as an ad; I don't personally have one of these things (although I buy from the Amazon site all the time). Your question just struck me as odd.

  9. mg said,

    February 8, 2018 @ 5:16 pm

    Alexa is a seriously important accessibility aid for the blind.
    https://www.pcmag.com/news/358338/why-amazons-alexa-is-life-changing-for-the-blind
