The Silicon Graybeard: Hacking Alexa, Siri and Google Assistant With Hidden Voice Commands

A lot of people have written about how getting one of these voice-activated digital assistants is voluntarily bugging yourself. The reactions have widely varied, but for the software to recognize when you call it, ("OK, Google"...) it has to be listening at all times. It's a deliberate design feature, or decision. Most people who read here, at least, would be aware that they were installing a full time listening device in their homes. To some, they assume it's an invasion of privacy and don't want these things; to others, it's something they ignore for the perceived benefits of the digital assistant.

The New York Times tech blog reports that a group of researchers in a couple of institutions have been able to secretly activate the systems on smartphones and smart speakers, simply by playing music with sub-audible (to humans) sounds hidden in it over the radio.

A group of students from University of California, Berkeley, and Georgetown University showed in 2016 that they could hide commands in white noise played over loudspeakers and through YouTube videos to get smart devices to turn on airplane mode or open a website.

This month, some of those Berkeley researchers published a research paper that went further, saying they could embed commands directly into recordings of music or spoken text. So while a human listener hears someone talking or an orchestra playing, Amazon’s Echo speaker might hear an instruction to add something to your shopping list.

“We wanted to see if we could make it even more stealthy,” said Nicholas Carlini, a fifth-year Ph.D. student in computer security at U.C. Berkeley and one of the paper’s authors.

In a way, this isn't much of a surprise, right? They're taking advantage of the "always-on, always listening" nature and trying to see just what the algorithms can extract from the other sounds. I'd think the designers would do this. Further, hijacking these things is nothing new. Remember when Burger King grabbed headlines with an online ad that asked ‘O.K., Google, what is the Whopper burger?” It caused Android devices with voice-enabled search to read the Whopper’s Wikipedia page aloud. The ad was canceled after viewers started editing the Wikipedia page to make it more ... let's say comical. Not long after that, South Park followed up with an entire episode built around voice commands that caused viewers’ voice-recognition assistants to spew adolescent obscenities.

A research firm has said that devices like Alexa, Siri and Google Assistant will outnumber humans by 2021, and add that more than half of American homes will have a smart speaker by then, just three years away.

These security researchers aren't leaving bad enough alone.

Last year, researchers at Princeton University and China’s Zhejiang University demonstrated that voice-recognition systems could be activated by using frequencies inaudible to the human ear. The attack first muted the phone so the owner wouldn’t hear the system’s responses, either.

The technique, which the Chinese researchers called DolphinAttack, can instruct smart devices to visit malicious websites, initiate phone calls, take a picture or send text messages. While DolphinAttack has its limitations — the transmitter must be close to the receiving device — experts warned that more powerful ultrasonic systems were possible.

That warning was borne out in April, when researchers at the University of Illinois at Urbana-Champaign demonstrated ultrasound attacks from 25 feet away. While the commands couldn’t penetrate walls, they could control smart devices through open windows from outside a building.

This year, another group of Chinese and American researchers from China’s Academy of Sciences and other institutions, demonstrated they could control voice-activated devices with commands embedded in songs that can be broadcast over the radio or played on services like YouTube.

Security researchers have a habit of saying that releasing information like this isn't bad because they think the bad guys have either thought of it already, or they would think of it on their own. Maybe, although some times just knowing something is possible can keep the experimenter going during the inevitable times when things just don't seem to be working. The article does say these exploits haven't been found "in the wild", but as more people become aware of the possibility, I'd expect them to start showing up.

Hopefully, the research being revealed will get the companies selling this software to try to get ahead and make their devices more robust. My version: I have an older iPhone (6s) with Siri. It's possible to configure the phone to listen all the time, so that when you say, "Hey, Siri" it answers. I have that turned off, and have read Siri does not actually send data back when it's disabled.

I'm going to close with one of the last paragraphs in the article, because it contains the very best phrase in the whole piece.

“Companies have to ensure user-friendliness of their devices, because that’s their major selling point,” said Tavish Vaidya, a researcher at Georgetown. He wrote one of the first papers on audio attacks, which he titled “Cocaine Noodles” because devices interpreted the phrase “cocaine noodles” as “O.K., Google.”

For some reason, it reminds me of this meme:

9 comments:

drjimMay 16, 2018 at 9:04 PM
We received one of those devices for Christmas a few years ago. It's still in the box.

I also took great pains to disable the WiFi in the new TV we bought for the living room. Samsung has no business looking at what we watch.
AnonymousMay 16, 2018 at 10:50 PM
Question for Silicon Graybeard: Why wouldn't Alexa violate wire tapping laws?
AnonymousMay 17, 2018 at 6:15 AM
Security researchers have a habit of saying that releasing information like this isn't bad because they think the bad guys have either thought of it already, or they would think of it on their own.

I read about synthesis of audio frequencies from ultrasonic on a music speaker mailing list 20 years ago. In general, the bad guys already know about the holes.

http://www.mattblaze.org/hobbs.html
http://www.mattblaze.org/papers/kiss.html
AnonymousMay 17, 2018 at 6:29 PM
The people who sell these egregious voice controlled hazards are sociopaths, and the people who purchase them want to live in a fantasy world. These groups are codependent; don't get squished between them.

The Silicon Graybeard

Special Pages

Wednesday, May 16, 2018

Hacking Alexa, Siri and Google Assistant With Hidden Voice Commands

9 comments: