“Portal, how hot is it going to be today?” That has become the kick-off call in our household. Every morning, around 6:30, my children, aged 5, 6, and 8, take turns asking the Facebook smart device what the weather will be like, so they can get their backpacks ready.
This has turned into yet another pandemic-life routine, and a rather hilarious one, I must say. Just try to picture the sleepy kids struggling to make themselves understood by the AI-powered screen.
Innocuous as it seems, this ‘outsmarting’ tendency raises some privacy and security questions about the conversations and information searches happening in the house, especially those conducted by our kids.
Something similar happens when my husband and I are chatting and casually browsing for information on the most random topics using our smartphones. We both own Samsung mobile phones, which constantly try to get us to sign up for their smart assistant, Bixby. Or when we’re talking about our upcoming trip to Morocco and the next thing I see advertised in my Google search results are offers to fly to some of the most touristic Moroccan cities. These situations have sparked numerous debates among family and friends about the possibility of being ‘spied on’ by the tech giants that own all those smart assistants and smart devices.
Looking for a reasonable explanation to either confirm or dispel this suspicion, I ran a quick search that turned up a 2020 study by Ruhr-Universität Bochum (RUB) and the Bochum Max Planck Institute (MPI) for Security and Privacy. The researchers investigated which words inadvertently activate voice assistants and compiled a list of English, German, and Chinese terms that various smart speakers repeatedly misinterpreted as prompts.
It turns out that whenever the systems wake up, they record a short sequence of what is being said and transmit the data to the manufacturer. The audio snippets are then transcribed and checked by employees of the respective corporation. Thus, fragments of very private conversations can end up in the companies’ systems, the researchers explain.
Researching smart devices’ false triggers
For the project, Lea Schönherr from the RUB research group Cognitive Signal Processing, headed by Professor Dorothea Kolossa at the RUB Horst Görtz Institute for IT Security (HGI), collaborated with Dr. Maximilian Golla, previously at HGI and now at the MPI for Security and Privacy, as well as Jan Wiele and Thorsten Eisenhofer from the HGI Chair for Systems Security, headed by Professor Thorsten Holz.
They tested the voice assistants made by Amazon, Apple, Google, Microsoft, and Deutsche Telekom, as well as Chinese ones by Xiaomi, Baidu, and Tencent. They played them hours of English, German, and Chinese audio material, including several seasons of the series “Game of Thrones,” “Modern Family,” and “House of Cards,” as well as news broadcasts. Additionally, they fed them the professional audio data sets that are used to train smart speakers.
Each voice assistant was equipped with a light sensor that registered when the smart speaker’s activity indicator lit up, visibly signalling that the device had switched into active mode, i.e. that a trigger had occurred. The setup also registered when a voice assistant sent data to the outside. Whenever one of the devices switched to active mode, the researchers recorded which audio sequence had caused it, and later manually evaluated which terms had triggered the assistant.
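The study doesn’t include the rig’s source code, but a bare-bones version of such a trigger logger is easy to imagine. Here is a minimal sketch of the idea, assuming a Raspberry Pi with a photoresistor taped over the speaker’s activity LED and wired to GPIO pin 4, and the gpiozero library; the wiring and the triggers.log file name are my own illustration, not the researchers’ setup.

```python
# Hypothetical trigger logger, not the researchers' actual code.
# Assumes a Raspberry Pi with a photoresistor (LDR) placed over the smart
# speaker's activity LED, wired to GPIO pin 4, and gpiozero installed.
from datetime import datetime
from signal import pause

from gpiozero import LightSensor

sensor = LightSensor(4)

def log_trigger():
    # Timestamp the moment the activity LED lights up; matching this
    # against the playback position of the audio material reveals
    # which snippet woke the device.
    with open("triggers.log", "a") as log:
        log.write(f"{datetime.now().isoformat()} - activity LED on\n")

sensor.when_light = log_trigger  # fires on each dark-to-light transition
pause()  # keep the script alive, waiting for events
```

Correlating those timestamps with captured network traffic, the second half of the setup, would then show whether a trigger also caused data to be sent out.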
If you’re interested in the details of the research, you can find examples yielded by the researchers’ analysis at unacceptable-privacy.github.io.
Thousands of words and sentences incorrectly triggering smart devices and AI assistants
“The devices are intentionally programmed in a somewhat forgiving manner, because they are supposed to be able to understand their humans. Therefore, they are more likely to start up once too often rather than not at all,” concludes Dorothea Kolossa, one of the researchers who ran the analysis.
That’s why, depending on the pronunciation, Alexa reacts to the words “unacceptable” and “election,” while Google reacts to “OK, cool.” Siri can be fooled by “a city,” Cortana by “Montana,” Computer by “Peter,” Amazon by “and the zone,” and Echo by “tobacco.”
Alexa (“election,” “a letter”)
Google Home (“OK, cool,” “OK, you know”)
Siri (“hey Jerry,” “hey, seriously”)
Microsoft’s Cortana (“Montana,” “frittata”)
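In engineering terms, this “forgiving” behavior usually comes down to a detection threshold. The toy sketch below is entirely hypothetical, with a made-up scoring function rather than any vendor’s actual acoustic model; it only illustrates the trade-off Kolossa describes: the lower the wake threshold, the more reliably the device hears real commands, and the more often “Montana” or “OK, cool” wakes it up too.

```python
# Toy illustration of the wake-word trade-off; the scoring function is a
# made-up stand-in, not a real acoustic model.
import random

WAKE_THRESHOLD = 0.5  # "forgiving": a low threshold means more false triggers

def wake_word_confidence(audio_frame: bytes) -> float:
    # Stand-in for a model's estimate that the frame contains the wake
    # word; real assistants compute this with neural networks.
    return random.random()

def process(audio_frame: bytes) -> None:
    if wake_word_confidence(audio_frame) >= WAKE_THRESHOLD:
        # The device "wakes up": it records a short snippet and, per the
        # study, may transmit it to the manufacturer for transcription.
        print("Triggered: snippet recorded and queued for upload")

process(b"...some audio...")
```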
Commenting on the security implications, Professor Thorsten Holz highlights that this is concerning, as very private conversations can sometimes end up with strangers. “From an engineering point of view, however, this approach is quite understandable, because the systems can only be improved using such data. The manufacturers have to strike a balance between data protection and technical optimization,” he argues.
Making sure just the right words are heard: Understanding your smart devices’ settings
The majority of smart devices have privacy settings that put limits on the use of your data: how much data the device’s manufacturer or AI assistant provider can collect, what they can use it for, how long they can keep it, and how you, as the source of that data, can interact with it.
These privacy settings usually live within your account profile (like this one for Apple’s Siri) on the company’s website or in an associated app you had to download to start using the given device. Some privacy features, such as location services, can only be turned on or off.
In the Alexa app that works with Amazon’s Echo, for example, you can restrict Amazon from using your voice recordings for certain purposes. They may still collect this information; they just can’t use it for purposes you’ve opted out of. The downside is that, for the sake of higher privacy, some products won’t work correctly or won’t deliver everything they promise if you opt out of certain data collection.
For us, using Portal as the main way to communicate with family and friends abroad (the kids love being able to talk and play with their grandparents using this device), it was all about configuring Portal securely, with access codes for both the device and our account, so we control who can access and use it. Mind that to log in to the Portal you need an active Facebook account, which means you need to be at least 14 years old. It doesn’t work with Messenger Kids, so the device can’t make or receive calls to or from accounts on that app. Still, this presents some tricky questions: unless you upload all your contacts to your smart device or assistant, you may not be able to call even one person on the device. Same with photos: if you have a smart photo frame, you may need to grant it access to all your photos for it to display even one.
Something else you can try is adjusting the sensitivity of your Google Home smart speaker or Nest devices so they pick up commands only when you’re actually talking to them. You can also link your voice to your Google account to help it recognize you more accurately across products and services. On this note, remember that if you’ve opted into the Voice & Audio Activity setting in your Google account to teach the Assistant to better recognize your voice, Google keeps recordings of those interactions forever unless you tell it otherwise. That includes snippets of conversations it might pick up by mistake.