The Ethics of Smart Devices That Analyze How We Speak
- by 7wData
As smart assistants and voice interfaces become more common, we’re giving away a new form of personal data — our speech. This goes far beyond just the words we say out loud.
Speech lies at the heart of our social interactions, and we unwittingly reveal much about ourselves when we talk. When someone hears a voice, they immediately start picking up on accent and intonation and make assumptions about the speaker’s age, education, personality, etc. Humans do this so we can make a good guess at how best to respond to the person speaking.
But what happens when machines start analyzing how we talk? The big tech firms are coy about exactly what they are planning to detect in our voices and why, but Amazon has a patent that lists a range of traits they might collect, including identity (“gender, age, ethnic origin, etc.”), health (“sore throat, sickness, etc.”), and feelings (“happy, sad, tired, sleepy, excited, etc.”).
This worries me — and it should worry you, too — because algorithms are imperfect. Voice is particularly difficult to analyze because the signals we give off are inconsistent and ambiguous. What’s more, the inferences that even humans make are distorted by stereotypes.

Take the example of trying to identify sexual orientation. There is a style of speaking, with raised pitch and swooping intonations, which some people assume signals a gay man. But confusion often arises because some heterosexual men speak this way and many homosexual men don’t. Experiments show that human aural “gaydar” is only right about 60% of the time. Studies of machines attempting to detect sexual orientation from facial images have reported a success rate of about 70%. Sound impressive? Not to me, because that means those machines are wrong 30% of the time. And I would expect success rates to be even lower for voices, because how we speak changes depending on who we’re talking to. Our vocal anatomy is very flexible, which allows us to be oral chameleons, subconsciously changing our voices to fit in better with the person we’re speaking with.
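To make the stakes of that 70% figure concrete, here is a minimal arithmetic sketch. The 70% accuracy number comes from the studies mentioned above; the population size and the 5% base rate are purely hypothetical assumptions chosen for illustration, not figures from the article.

```python
# Illustrative arithmetic: why "70% accurate" can still mean enormous error counts.
# Accuracy figure from the article; population size and base rate are hypothetical.

def misclassified(population: int, accuracy: float) -> int:
    """Number of people a classifier of the given accuracy gets wrong."""
    return round(population * (1 - accuracy))

def precision(base_rate: float, sensitivity: float, specificity: float) -> float:
    """Share of positive predictions that are actually correct (Bayes' rule)."""
    true_pos = base_rate * sensitivity          # correctly flagged members of the group
    false_pos = (1 - base_rate) * (1 - specificity)  # wrongly flagged non-members
    return true_pos / (true_pos + false_pos)

if __name__ == "__main__":
    # 70% accuracy applied to 1,000,000 voices means 300,000 wrong calls.
    print(misclassified(1_000_000, 0.70))  # 300000

    # If the trait occurs in (hypothetically) 5% of the population and the model
    # is 70% accurate on both groups, most positive predictions are wrong.
    print(round(precision(0.05, 0.70, 0.70), 2))  # 0.11
```

The second number is the base-rate effect: when a trait is rare, even a seemingly decent classifier produces mostly false positives, which is exactly why inferring sensitive traits at scale is so risky.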
We should also be concerned about companies collecting imperfect information on the other traits mentioned in Amazon’s patent, including gender and ethnic origin. Machine-learning applications trained on real speech examples will pick up societal biases. We have already seen this in similar technologies: type the gender-neutral Turkish sentences “O bir hemşire. O bir doktor.” into Google Translate and you’ll get back “She is a nurse. He is a doctor.”

In business, we’ve gotten used to being careful about what we write in emails, in case information goes astray. We need to develop a similarly wary attitude toward having sensitive conversations close to connected devices. The only truly safe device to talk in front of is one that is turned off.