Digital personal assistants, such as Amazon’s Alexa and Apple’s Siri, are sexist, according to one expert.
They struggle more with the quieter, ‘breathy’ voices of women than with the deeper voices of men.
The software is often developed with the help of male voice examples and so lacks a deeper understanding of female commands.
The comment was made by Delip Rao, the CEO and co-founder of R7 Speech Sciences, a company that uses artificial intelligence to understand speech.
Mr Rao explained that the fundamental frequency of a person’s voice is what is often perceived as the pitch.
‘This is also called mean F0. The range of tones produced by our vocal tract is a function of the distribution around that.
‘We know the mean F0 for men is around 120Hz and much higher for women (~200Hz),’ he said on his website.
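To make the idea of fundamental frequency concrete, here is a minimal sketch of one classic way to estimate it: find the time shift at which a signal best matches a delayed copy of itself (autocorrelation). The function names, the 60-400Hz search range and the synthetic test tones are assumptions made for illustration; this is not the method Mr Rao or any particular assistant is known to use.

```python
import math

def estimate_f0(samples, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency in Hz from the autocorrelation peak."""
    min_lag = int(sample_rate / f_max)  # shortest pitch period considered
    max_lag = int(sample_rate / f_min)  # longest pitch period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

def synth_voice(f0, sample_rate=8000, duration=0.1, harmonics=5):
    """Hypothetical stand-in for a voiced sound: a tone plus weaker overtones."""
    n = int(sample_rate * duration)
    return [sum(math.sin(2 * math.pi * f0 * k * i / sample_rate) / k
                for k in range(1, harmonics + 1))
            for i in range(n)]

# Illustrative tones at the mean F0 values quoted in the article.
male_f0 = estimate_f0(synth_voice(120.0), 8000)
female_f0 = estimate_f0(synth_voice(200.0), 8000)
print(f"male-like tone: {male_f0:.1f} Hz, female-like tone: {female_f0:.1f} Hz")
```

The estimate can only land on whole-sample shifts, so it will be close to, rather than exactly equal to, the true pitch. Real speech is far messier than these synthetic tones.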
As well as the pitch difference between the sexes, females tend to be quieter and have more ‘breathy’ voices, says Dr Rachael Tatman, a data scientist at Kaggle, the Google-owned data science firm.
She explained to The Register that the learning algorithms in AI voice recognition devices do not inherently prefer men; it is just that men provide a better signal.
She said: ‘There’s a slightly less robust acoustic signal for women, it’s more easily masked by noise, like a fan or traffic in the background, which makes it harder for speech recognition systems.’
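The masking effect Dr Tatman describes can be expressed as a signal-to-noise ratio (SNR). The sketch below compares two tones against the same background noise. The half-amplitude ‘quiet’ voice and the noise level are purely illustrative assumptions, not measurements from the article or from real speakers.

```python
import math
import random

def power(samples):
    """Mean power of a signal."""
    return sum(x * x for x in samples) / len(samples)

def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels."""
    return 10.0 * math.log10(power(signal) / power(noise))

sample_rate, n = 8000, 800
rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical background noise, standing in for a fan or traffic.
noise = [0.2 * rng.gauss(0.0, 1.0) for _ in range(n)]

# Two illustrative tones: one at full amplitude, one at half amplitude.
loud = [math.sin(2 * math.pi * 120 * i / sample_rate) for i in range(n)]
quiet = [0.5 * math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(n)]

loud_snr = snr_db(loud, noise)
quiet_snr = snr_db(quiet, noise)
print(f"louder tone: {loud_snr:.1f} dB, quieter tone: {quiet_snr:.1f} dB")
```

Halving the amplitude cuts the power to a quarter, so the quieter tone sits about 6dB closer to the noise floor, leaving a recogniser less margin before the voice is swamped.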
Audio signals are converted into MFCCs (Mel-frequency cepstral coefficients) by the machines.
These MFCCs allow computers to process and understand the voices of humans and are commonly used in automated speech recognition models.
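For readers curious about what that conversion involves, the sketch below follows one common textbook recipe for a single short frame of audio: window it, take the power spectrum, pool the energy into bands spaced on the mel scale, take logs, then apply a discrete cosine transform. The filter count, frame size and helper names are assumptions chosen for illustration; commercial systems differ in the details and the article does not say which variant any assistant uses.

```python
import cmath
import math

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum of a Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centres evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for b in range(left, centre):    # rising edge of the triangle
            row[b] = (b - left) / (centre - left)
        for b in range(centre, right):   # falling edge of the triangle
            row[b] = (right - b) / (right - centre)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    """MFCCs for one frame: log mel-band energies followed by a DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # A small floor avoids taking the log of zero in empty bands.
    log_energy = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                  for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_energy))
            for c in range(n_coeffs)]

# Illustrative 256-sample frame of a 200Hz tone sampled at 8kHz.
frame = [math.sin(2 * math.pi * 200 * i / 8000) for i in range(256)]
coeffs = mfcc(frame, 8000)
print(len(coeffs), "coefficients for this frame")
```

A recogniser would compute such a vector for every few milliseconds of audio and pass the sequence to its model. In practice this is done with an optimised library rather than the slow, hand-written transform shown here.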
Dr Tatman believes that ‘there’s nothing about MFCCs in particular that are less good about modelling women’s speech than men’s.’
Despite this, a woman’s voice is more likely to be drowned out, which can make it harder for the machine to understand commands.
As well as the audio make-up of the voices being different between genders, the way the machines are taught could be at fault.
To develop voice-recognition ability in machines, a data set of recorded speech from many people is used to teach the system what is being said.
Unbalanced data sets can result in bias toward a certain demographic.
‘Deep learning, in particular, is very good at recognising things that it’s seen a lot of.
‘And if you’ve trained your system on data from 90 per cent men and 10 per cent women (unlikely but possible, especially if you’re not accounting for gender in your training data), you’ll end up being very good at recognising male data and very bad at recognising female data.
‘More worryingly, this also applies to things like race and ethnicity, where there isn’t an acoustic reason for one group to be harder to understand,’ Dr Tatman said.
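One simple safeguard against the problem Dr Tatman describes is to count how each group is represented in the training data and to report error rates per group rather than a single overall figure. The sketch below shows the idea. The group labels, the 90/10 split (taken from her hypothetical) and the evaluation results are all made up for illustration.

```python
from collections import Counter, defaultdict

def group_shares(labels):
    """Fraction of the training examples that belongs to each group."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def error_rate_by_group(results):
    """results: iterable of (group, was_recognised_correctly) pairs."""
    totals, errors = defaultdict(int), defaultdict(int)
    for group, correct in results:
        totals[group] += 1
        if not correct:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

# Hypothetical training set mirroring the 90/10 example in the quote.
training_labels = ["male"] * 90 + ["female"] * 10
shares = group_shares(training_labels)

# Hypothetical evaluation results, invented purely to exercise the function.
results = ([("male", True)] * 19 + [("male", False)] * 1
           + [("female", True)] * 14 + [("female", False)] * 6)
rates = error_rate_by_group(results)

print("training shares:", shares)
print("error rates:", rates)
```

An overall accuracy figure for these invented results would hide the gap between the two groups, which is why a per-group breakdown is useful.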
A lack of diversity in the training phase of the AI has resulted in bias towards white men with a Western accent.
Recent studies have found that facial recognition systems are better at identifying males than females.
The same software is also more adept at recognising white faces over black faces.
A study called ‘Gender Shades’ has found that facial recognition may not work equally well for all users, especially those who aren’t white males.
A researcher from the MIT Media Lab discovered that popular facial recognition services from Microsoft, IBM and Face++ vary in accuracy based on gender and race.
The study found that when the person in a photo was a white man, the facial recognition software worked 99 per cent of the time.
But when the photo was of a darker-skinned woman, there was a nearly 35 per cent error rate.