Voice recognition is the process of converting a voice into digital data. Sometimes referred to as voiceprint, it is a contactless, software-based biometric modality that is regarded as one of the most convenient forms of biometric authentication.
Voice recognition is the identification and authentication of a person based on the sounds they make when they speak. Voice recognition software can measure the biological factors that make each voiceprint unique.
Voiceprints can be measured passively, as a person speaks naturally in conversation, or actively, when a person is required to speak a specific password, for example.
Whilst active authentication using a spoken password can speed up the authentication process, it is not required to identify and authenticate a person. Instead, voice recognition measures the minutiae of the voice and is not reliant on a spoken code or passphrase.
Voice recognition is different to speech recognition. Speech recognition is a user interface technology that allows people to interact with technology controlled by speech. Speech recognition allows for a hands-free, contactless experience and is sometimes referred to as voice command.
Home assistants, as well as smartphones, use speech recognition to carry out actions based on voice commands. Some speech recognition interfaces are also integrating voice recognition to identify different individuals. Siri, Apple’s speech recognition software, introduced this feature in 2015 to identify different people individually when commands were given.
How does biometric voice authentication work?
Voice recognition authentication works in a very similar way to other biometric authentication systems such as facial recognition or fingerprint recognition. The voice recognition system will enrol a person by creating an initial template based on a sample of the person’s voice. Sometimes, a system will record a number of samples that can be merged to deliver a more accurate template.
Like other biometric authentication methods, this template is not simply a recording of a person’s voice. It is not something that can be stolen. The template is proprietary to the voice authentication system and is a mathematical representation of the person’s voice.
The original voice recordings are discarded at this stage and the mathematical representation (“template”) is used for matching the person’s voice for authentication purposes. It is impossible to interpret or even read this template without the vendor’s secret, proprietary algorithm to decode it.
Once a template has been created, a person’s identity is authenticated by matching a new voice sample against the original template. A strong match between the new sample and the template indicates that the same person spoke both, thereby verifying the person’s identity.
This form of voice recognition authentication is called one-to-one matching.
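As a rough illustration of enrolment and one-to-one matching, the sketch below averages several enrolment samples into a single template and then compares a new sample against it. It is a minimal sketch under stated assumptions, not how any particular vendor works: the three-number feature vectors, the function names, the use of cosine similarity and the 0.8 threshold are all hypothetical. Real systems derive much larger feature vectors from audio using proprietary models.

```python
import math


def average_template(samples):
    """Merge several enrolment feature vectors into one template by averaging."""
    count = len(samples)
    return [sum(values) / count for values in zip(*samples)]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(template, probe, threshold=0.8):
    """One-to-one match: accept the claimed identity if the score is high enough."""
    return cosine_similarity(template, probe) >= threshold


# Hypothetical toy feature vectors standing in for processed voice samples.
enrolment_samples = [[0.9, 0.1, 0.4], [1.0, 0.2, 0.5], [0.8, 0.0, 0.3]]
template = average_template(enrolment_samples)

same_speaker = [0.95, 0.12, 0.42]
other_speaker = [0.10, 0.90, 0.20]

print(verify(template, same_speaker))   # sample close to the template
print(verify(template, other_speaker))  # sample far from the template
```

Note that only the averaged template is kept after enrolment in this sketch, which mirrors the idea that the original recordings are not needed for later matching.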
Another form of voice recognition authentication is one-to-many. Using this technique, a voice sample from an unknown identity is compared against multiple enrolment templates to try to find a match. This is typically referred to as speaker identification. However, there are limits to the accuracy of this method, and its use will depend on the application.
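One-to-many matching can be sketched in the same spirit. The example below compares an unknown sample against every enrolled template and returns the best match, or no match at all if even the best score falls below a threshold. The speaker names, feature vectors, cosine similarity scoring and the 0.9 threshold are assumptions made purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(probe, enrolled, threshold=0.9):
    """One-to-many match: return (speaker_id, score), or (None, score) if no
    enrolled template scores at or above the threshold."""
    best_id, best_score = None, -1.0
    for speaker_id, template in enrolled.items():
        score = cosine_similarity(probe, template)
        if score > best_score:
            best_id, best_score = speaker_id, score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score


# Hypothetical enrolled templates keyed by speaker.
enrolled = {
    "alice": [0.9, 0.1, 0.4],
    "bob": [0.1, 0.9, 0.2],
    "carol": [0.3, 0.3, 0.9],
}

print(identify([0.85, 0.15, 0.45], enrolled))  # close to one enrolled template
print(identify([0.5, 0.5, 0.5], enrolled))     # not close enough to any template
```

Because every additional enrolled template is another chance for a false match, a stricter threshold is often chosen for identification than for one-to-one verification, which is one reason accuracy is harder to maintain with this method.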
How secure is voice recognition?
As we have already touched upon, voice recognition is an extremely secure method of verifying that people are who they say they are. Like all forms of biometrics, your voice is something that you always have with you. It can’t be forgotten or stolen like a traditional password, and this makes it both more practical and more secure.
Liveness detection is an important additional layer when it comes to voice recognition security. Liveness detection is used in conjunction with a range of biometric authentication methods including facial recognition and fingerprint recognition to detect criminals that are attempting to “spoof” the system.
When it comes to voice recognition, the most common way to “spoof” a person is to use an audio recording of them or synthetic speech tools. Leading providers integrate liveness detection into their voice recognition software to ensure that the speaker is a real person rather than a recording or synthetic voice of an individual.
One security advantage of voice recognition that sets it apart from other biometric authentication methods is that it doesn’t have to be a one-and-done method of authenticating an individual. With most authentication systems, a person is required to “sign in” using a biometric authenticator and they can then carry on with whatever it is they need to do.
In the case of voice recognition, let’s say someone logs into a phone call (for online banking for example) and they are authenticated using their voice when they first join the call. At this point, another person could take over the call once authenticated, and this could lead to potential criminal activity (especially with online banking).
Modern-day voice biometric systems can use continuous authentication, where relevant, to continuously check in to make sure the person on the call is still the person that was originally authenticated. This is an important added layer of security for certain sectors, specifically banking and contact centres.
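A simplified way to picture continuous authentication is to score each short segment of a call against the enrolled template and flag the first segment that no longer matches. The sketch below assumes hypothetical per-segment feature vectors, cosine similarity scoring and a 0.8 threshold. A production system would also need to handle silence, noise and more than one voice on the line.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def monitor_call(template, segments, threshold=0.8):
    """Yield (segment_index, still_authenticated) for each segment of a call."""
    for index, features in enumerate(segments):
        yield index, cosine_similarity(template, features) >= threshold


# Hypothetical template for the customer authenticated at the start of the call.
template = [0.9, 0.1, 0.4]

# Hypothetical features for consecutive segments of the same call.
call_segments = [
    [0.92, 0.08, 0.41],  # original caller
    [0.88, 0.12, 0.38],  # original caller
    [0.10, 0.90, 0.20],  # a different voice takes over
]

results = list(monitor_call(template, call_segments))
first_failure = next((i for i, ok in results if not ok), None)

print(results)
print("First segment that failed the check:", first_failure)
```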
How accurate is voice biometrics?
Voice biometrics is also seeing significant improvements when it comes to accuracy. In March 2021, Biometric Update reported that leading voice recognition vendor ID R&D had announced gains in the accuracy of its voice biometrics, with a 0.01 per cent false acceptance rate (FAR) and a 5 per cent false rejection rate (FRR) for device unlocking through biometric authentication.
The report goes on to state, “The company says that up until now, the voice modality could not meet the security standard for mobile device or laptop unlocking, relegating voice to the position of a useful convenience for a limited range of applications. The increased accuracy level, however, now rivals a PIN, according to ID R&D, opening up new practical applications for voice.”
As the accuracy of voice recognition software improves, the opportunities for hands-free unlocking and operation increase.
ID R&D Chief Scientific Officer Konstantin Simonchik goes on to say, “As voice becomes the de facto standard for interacting with everything from our televisions to our cars, biometrics emerge as the most convenient way to quickly identify users for security and personalization.”
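To make the two error rates quoted above concrete: the false acceptance rate (FAR) is the share of impostor attempts that are wrongly accepted, and the false rejection rate (FRR) is the share of genuine attempts that are wrongly rejected. The sketch below computes both from made-up match scores at a few thresholds. The scores and thresholds are invented for illustration and are not ID R&D’s data.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for a given acceptance threshold.

    FAR: fraction of impostor attempts scoring at or above the threshold.
    FRR: fraction of genuine attempts scoring below the threshold.
    """
    false_accepts = sum(1 for score in impostor_scores if score >= threshold)
    false_rejects = sum(1 for score in genuine_scores if score < threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr


# Invented match scores: genuine attempts by the enrolled user, and
# impostor attempts by other people.
genuine = [0.95, 0.91, 0.88, 0.97, 0.72, 0.93, 0.90, 0.85, 0.96, 0.89]
impostor = [0.20, 0.35, 0.10, 0.82, 0.41, 0.15, 0.30, 0.25, 0.05, 0.50]

for threshold in (0.6, 0.8, 0.9):
    far, frr = error_rates(genuine, impostor, threshold)
    print(f"threshold={threshold:.1f}  FAR={far:.0%}  FRR={frr:.0%}")
```

Raising the threshold lowers the FAR at the cost of a higher FRR, which is why a vendor’s accuracy figures are usually quoted as a pair, as in the report above.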
Advantages and disadvantages of voice recognition
Voice recognition is still an emerging technology compared with facial, fingerprint and iris recognition. Even so, it is becoming an attractive option for a wide range of applications, especially as we move towards a more contactless society in which we look to reduce the number of contact points a person needs to make each day.
Artificial intelligence and machine learning are both helping to improve voice recognition systems, and accuracy rates continue to rise, making voice recognition a more viable solution across a number of sectors.
Here are some of the advantages and disadvantages of voice recognition technology.
Advantages:
- Enhance the customer experience with fast, frictionless authentication.
- Improve security and minimise breaches due to compromised or lost passwords.
- Instantly identify users and personalise the interaction.
- Automate interactions between a business and its customers.
- Increase productivity.
- Free up the IT staff time spent verifying users and updating passwords.
- Use as part of a two-factor authentication process to increase security without adding effort.
Disadvantages:
- There are still some issues around accuracy rates and confirming that people are who they say they are.
- People’s voices can change over time and can also be affected by illness, e.g. flu or the common cold.
- Can be difficult to authenticate in loud environments with lots of background noise.
- Requires liveness detection to ensure a speaker is a real person and not a recording.
- Not as accurate as other biometric modalities such as facial recognition.
Who uses voice recognition?
Banking and finance is the number one industry for voice recognition biometrics. Large banks around the world have already turned to voice recognition as a way to speed up the ID verification process for call centre customers.
Banks including HSBC, Barclays, Wells Fargo, Tangerine, and Santander are already using voice recognition software to authenticate customers. Barclays were one of the early adopters and found that 93% of customers endorsed the new voice-first security system, which led to a twenty-second saving in the time it took customers to verify themselves when calling the bank’s call centre.
Tatra Banka in Slovakia was another early adopter of voice recognition. First introduced in 2013, and now with more than 250,000 registered customer voice samples (one-third of the whole customer database of the bank), the system has reduced the average time taken to identify customers by 66 per cent, to an average of just 27 seconds per customer. Now 85 per cent of all calls to the bank’s contact centre that need authentication are verified by voice.
This time reduction has resulted in significant efficiency increases and fewer operators required to provide the same level of service, which enabled the bank to focus more on active sales. It also increased the bank’s Net Promoter Score by 62 per cent after three months of using voice biometrics.
Earlier this year, HSBC reported that their caller identification programme Voice ID had cut banking fraud by over 50% during the last year, demonstrating the value of biometrics in the fight against scammers.
And according to a report on Finextra, “The UK bank reckons its voice biometrics system has prevented almost £249 million of customers’ money from falling into the hands of telephone fraudsters in the last year.”
It is not just the banking and finance sector that is using voice authentication. Some of the other sectors using voice include:
- Retail and e-Commerce
- Law Enforcement
- HR and Marketing
Voice recognition is a growing biometric modality and one that is developing more practical uses across a wide range of industries. As we move to a more contactless society, biometric solutions like voice recognition will play an even more integral role in our day-to-day lives.
With improving accuracy rates and excellent security features, voice recognition is providing a highly convenient alternative to passwords for many businesses and for many purposes.