An Introduction to Voice Identification

Ori Akstein of NICE introduces us to the use of voice identification within the contact centre.

Voice Identification, also known as voice recognition or speaker recognition, is a biometric technology that is being adopted by call centres that must authenticate each caller before they can provide service.

Voice identification uses the innate biological characteristics of a person’s voice to create a voiceprint that is unique to that person. Its biometric properties make voice identification difficult to spoof.

It’s also easier for users, who no longer need to remember passwords or the answers to security questions. With voice identification, there is nothing to forget or lose. The authentication is part of the person.

Voice Identification Differs From Speech Recognition

Voice and speech recognition are not the same. Voice identification is about verifying the identity of the speaker. Speech recognition is about understanding the words being spoken.

At the outset, digital personal assistants like Siri and Alexa could understand the verbal questions they were asked, but could not verify who was asking the question.

Today, Apple’s Siri uses voice identification technology to recognize the person who is speaking. Similarly, Amazon users can create a personal voice profile and train Alexa to recognize it.

As we can see, the use cases are different yet complementary. Voice identification is used to authenticate the speaker, while speech recognition is used to understand what the speaker is saying.

How Voice Identification Works

Voice identification technology captures and measures the physical qualities of a person’s voice when speaking as well as the unique biological parameters that combine to produce that voice. These vocal qualities include:

Duration
Intensity
Pitch
Timbre
Dynamics

A voiceprint is created by capturing the dominant frequencies and tones in a person’s spoken words and reducing them to a digital format. Together, these measurements form a template that is unique to that person.
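
As a rough illustration of the idea, the sketch below treats a voiceprint as a short list of numbers and compares a new sample against the stored template. Everything in it is an assumption made for illustration: the five-value feature vectors, the function names, the use of cosine similarity and the 0.95 threshold are hypothetical, and commercial systems rely on far richer acoustic models.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def enrol(samples):
    """Average several feature vectors into one voiceprint template."""
    count = len(samples)
    return [sum(values) / count for values in zip(*samples)]

def matches(template, candidate, threshold=0.95):
    """Accept the candidate if it is close enough to the template."""
    return cosine_similarity(template, candidate) >= threshold

# Hypothetical, normalised measurements in the order:
# duration, intensity, pitch, timbre, dynamics
enrolment_samples = [
    [0.52, 0.70, 0.31, 0.84, 0.45],
    [0.50, 0.72, 0.29, 0.86, 0.47],
    [0.54, 0.68, 0.33, 0.82, 0.43],
]
template = enrol(enrolment_samples)

same_speaker = [0.51, 0.71, 0.30, 0.85, 0.46]
other_speaker = [0.90, 0.20, 0.75, 0.30, 0.10]

print(matches(template, same_speaker))   # close to the template
print(matches(template, other_speaker))  # far from the template
```

Averaging several samples at enrolment is one simple way to smooth out variation between recordings; the threshold decides how close a new sample must be before it is accepted.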

Two Main Methods for Voice Identification

Voice identification technology needs only a microphone. There are two main methods for capturing a voiceprint (i.e., enrolling a customer) and then using the voiceprint template for real-time authentication.

The easiest voice identification method is Text-Independent, where vocal samples are captured from regular speech that is random and unplanned.

Customers can say anything they want for their passphrase. In advanced voice identification systems, the customer’s voice is captured passively during normal conversation.

There is no need to formally speak and repeat a passphrase in order to enrol.

Alternatively, voice identification may be Text-Dependent, where the same words or passphrases are used to capture the voiceprint at enrolment and subsequently to authenticate the customer each time. Text-dependent voice identification and authentication may be static or dynamic.

Static authentication uses the same passphrase every time. Dynamic authentication uses a randomized passphrase that is different each time.

For example, customers could be asked to say a number of pre-defined phrases at enrolment, and any of these could be used for authentication. Likewise, unique phrases, such as random number sequences, could be generated for each customer.
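
The dynamic approach can be sketched in a few lines. This is a minimal illustration, not a description of any vendor’s product: the function names and the six-digit length are hypothetical, and the transcript is assumed to come from a separate speech recognition step. The check below only confirms that the right words were spoken; matching the voice against the stored voiceprint is a separate step.

```python
import secrets

DIGITS = "0123456789"

def generate_digit_challenge(length=6):
    """Return a fresh random digit sequence for the caller to read aloud."""
    return "".join(secrets.choice(DIGITS) for _ in range(length))

def digits_only(text):
    """Keep only the digits, so '4 8 1 5' and '4815' compare as equal."""
    return "".join(ch for ch in text if ch in DIGITS)

def phrase_matches(challenge, transcript):
    """Check that the transcribed speech contains the requested digits."""
    return digits_only(challenge) == digits_only(transcript)

challenge = generate_digit_challenge()
print("Please say:", " ".join(challenge))
```

Because the sequence changes on every call, a recording of an earlier session would contain the wrong digits.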

In any case, text-dependent voice identification requires active capture and enrolment of the customer.

Voice Identification – It’s not Just for Call Centres

Customer service call centres are the natural use case for voice identification. It’s a great way to authenticate customers quickly and reliably without the tedious security and authentication procedures that customers dislike.

Not surprisingly, most customers find voice identification more convenient and less risky than sharing personal IDs such as driver’s licence or credit-card numbers over the phone.

The rise of the mobile app is also driving voice identification use cases. Cyber threats have prompted many applications to require two-factor authentication. A single password is not enough.

Voice identification can be paired with facial recognition to authenticate users of mobile banking, finance, and other high-security apps. Voice identification is ideal for the mobile generation because it only requires a microphone, which is standard equipment in most mobile devices.

Voice identification is also a hands-free and presence-free authentication method. Other forms of biometric authentication require the customer’s physical presence to scan a finger, a hand, or a retina, which may not be convenient or possible in a car or on public transport.

The Pros and Cons of Voice Identification

On the plus side, voice identification offers many advantages over other forms of user authentication as well as other biometrics:

Wide Accessibility: The ubiquity of microphones in landline phones, smartphones, tablets and other mobile devices makes this method of identification and authentication highly accessible.
Convenience and Simplicity for Users: Voice identification can be used by almost anyone, including those with visual impairments or disabilities that make physical interaction difficult.
Non-invasive: The contactless nature of voice identification is more hygienic than touch-based methods, and encourages customer acceptance and use.
Cost-effective: Voice identification requires no special biometric hardware. A simple microphone is sufficient to listen and capture voice samples. Clearly it is one of the most cost-effective solutions for secure biometric authentication.

Voice identification also has some disadvantages, including:

Accuracy: Voice identification is not as accurate as other biometric methods because the quality of the voiceprint and voice-match can be degraded by background noise. Also, false negatives may arise when a voice is hoarse or the mechanics of the voice are affected for any reason. Customers should not be denied access because of a cold.
Assurance: It requires live speaker verification to assure the voice is from a real person and not a recording.
Sensitivity to Noise: Voice recognition is not ideal for noisy or public spaces.
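
The accuracy trade-off can be made concrete with a small calculation. The sketch below is illustrative only: the match scores are invented, the function name is hypothetical, and it assumes a system that accepts a caller when the match score reaches a threshold. It counts how often genuine callers would be rejected (false negatives) and how often impostors would be accepted at two different thresholds.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false rejection rate, false acceptance rate) at a threshold."""
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    return (false_rejects / len(genuine_scores),
            false_accepts / len(impostor_scores))

# Invented match scores between 0 and 1. The 0.79 stands in for a
# genuine customer calling with a hoarse voice or from a noisy street.
genuine = [0.97, 0.93, 0.88, 0.95, 0.79]
impostor = [0.42, 0.61, 0.83, 0.55, 0.30]

strict = error_rates(genuine, impostor, threshold=0.90)
lenient = error_rates(genuine, impostor, threshold=0.75)

print("strict :", strict)   # (0.4, 0.0): two genuine callers rejected
print("lenient:", lenient)  # (0.0, 0.2): one impostor accepted
```

Raising the threshold keeps impostors out but turns away more genuine customers, while lowering it does the opposite. In practice, the threshold would be tuned on real call data rather than on a handful of sample scores.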

Voice Identification Makes User Authentication More Secure

User authentication has become one of the great banes of our modern, connected existence.

Customers forget passwords, can’t remember their answers to security questions, and resent being treated with suspicion when they want to access their own bank account!

Voice identification technology simplifies the authentication process and makes it more secure.

Author: Guest Author

Published On: 4th Mar 2021 - Last modified: 22nd Dec 2023