Speech analytics: what the next generation can do for you


The latest pieces of speech analytics kit don’t simply allow you to look at individual words or topics to interpret the meaning of calls, but can now gauge things like silence, stress and tempo. Jeff Gallino takes time out to reveal what that could mean for your call centre.

Companies contemplating an investment in speech analytics would be wise to heed the old adage of ‘buyer beware’.

Speech analytics technology has advanced so rapidly in recent years that there’s a real risk of buying in to a solution that, by current standards, does too little or costs too much, or both.

But by spending some time brushing up on how speech analytics has evolved in recent years, you can make an informed decision that will deliver the greatest benefit at the best cost for as long as possible.

To make an informed decision, you need to know about three things: the solutions that are still out there that are no longer up to the task; the next generation technology that’s providing the highest possible level of effectiveness and cost-efficiency; and the requirements that are driving future developments in this evolving field.

This is knowledge that will lead you to a speech analytics solution that delivers on the promise of increased call centre efficiency, better agent performance, and more effective sales and marketing efforts.

With that in mind, let’s look at the evolution of speech analytics and what it can tell you to help you make the best buying decision for your organisation.

Stage 1: When words are no longer enough

Speech analytics was originally developed as a way to automate and speed up the otherwise tedious manual process of call monitoring.

The earliest solutions relied on word spotting: searching calls for specific words to try to identify relevant content. The problem with word spotting is that it tells you nothing of significance other than that a particular word appeared.

Say you’re operating an airline call centre: If you just search for the word ‘reservation’, you have no way of knowing whether a caller wanted to make a reservation, or complain about a lost reservation, or simply wanted to find out if a flight is going to arrive in time to keep a dinner reservation.
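
To make the limitation concrete, here is a minimal sketch of word spotting in Python. The transcripts and the `spot_word` helper are invented for illustration and are not taken from any vendor's product; the sketch also assumes the calls have already been transcribed to text.

```python
import re

def spot_word(transcripts, word):
    """Return the IDs of calls whose transcript contains `word` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [call_id for call_id, text in transcripts.items() if pattern.search(text)]

# Hypothetical transcripts for three very different airline calls.
transcripts = {
    "call-1": "I'd like to make a reservation for Friday.",
    "call-2": "You lost my reservation and I want a refund.",
    "call-3": "Will the flight land in time for my dinner reservation?",
}

hits = spot_word(transcripts, "reservation")
print(hits)
```

All three calls match, yet only the first is a booking request. The search result alone gives no way to tell them apart.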

Despite the obvious shortcomings, several companies still use word spotting to monitor calls. Others have evolved beyond it to an approach called ‘topic identification’, which attempts to categorise calls by searching for phrases that signify a particular topic.

This is something of an improvement, but it still doesn’t give you the context you need to accurately interpret the content of calls. And, like word spotting, it relies on searching for particular words and phrases, so you find only what you are looking for, and never uncover unexpected content that might be meaningful.
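
The same blind spot can be sketched for topic identification. The phrase lists and the `identify_topics` function below are hypothetical, chosen only to show how a phrase-driven approach behaves when a call raises an issue nobody thought to list.

```python
# Hypothetical phrase lists; a real deployment would maintain many more.
TOPIC_PHRASES = {
    "booking": ["make a reservation", "book a flight"],
    "complaint": ["lost my reservation", "want a refund"],
}

def identify_topics(text, topic_phrases=TOPIC_PHRASES):
    """Return the sorted topics whose phrases appear in the transcript text."""
    lowered = text.lower()
    return sorted(
        topic
        for topic, phrases in topic_phrases.items()
        if any(phrase in lowered for phrase in phrases)
    )

known = identify_topics("You lost my reservation and I want a refund.")
unexpected = identify_topics("Your website charged my card twice.")
print(known, unexpected)
```

The first call is tagged as a complaint, but the double-charging call returns no topic at all, because no one anticipated it when the phrase lists were written.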

Both techniques are also limited in that they involve monitoring only a small sampling of calls, since monitoring all calls would be cost-prohibitive.

Stage 2: The emergence of the next generation

Today, speech analytics has evolved to make it possible to accurately interpret the meaning of calls by going beyond words and topics to capture variables such as tempo, silence and stress. In this way, next generation solutions incorporate the acoustical attributes of speech and the emotional content of calls to provide richer, more comprehensive material for interpretation.
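
As a rough illustration of what measuring tempo and silence might involve, the sketch below assumes a speech recogniser has already produced word timings as `(word, start_sec, end_sec)` tuples. The `acoustic_summary` function, the data and the one-second pause threshold are all assumptions made for this example; measuring stress would need pitch and energy analysis of the audio itself, which is not shown.

```python
def acoustic_summary(words, min_gap=1.0):
    """Summarise tempo and silence from (word, start_sec, end_sec) tuples.

    `min_gap` is an assumed threshold: pauses shorter than this are treated
    as normal speech rhythm rather than silence.
    """
    if not words:
        return {"words_per_minute": 0.0, "silence_seconds": 0.0, "silence_ratio": 0.0}
    duration = words[-1][2] - words[0][1]
    # Pause between the end of each word and the start of the next one.
    gaps = [nxt[1] - prev[2] for prev, nxt in zip(words, words[1:])]
    silence = sum(gap for gap in gaps if gap >= min_gap)
    return {
        "words_per_minute": len(words) * 60 / duration if duration > 0 else 0.0,
        "silence_seconds": silence,
        "silence_ratio": silence / duration if duration > 0 else 0.0,
    }

# Hypothetical timings: a three-second pause before the caller says "help".
summary = acoustic_summary([("I", 0.0, 0.5), ("need", 0.5, 1.0), ("help", 4.0, 5.0)])
print(summary)
```

Figures like these carry no meaning on their own, but combined with the words spoken they give an analyst more to go on than a transcript alone.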

Next generation speech analytics also uses advanced call processing technology to bring in metadata, such as information from customer relationship management (CRM) and other systems, to further complete the picture.

Finally, next generation technology makes it possible to process more calls more quickly (without additional hardware investment), so that it’s no longer necessary to settle for a small sampling.

All of these advances allow us to determine – for the first time – exactly why customers are calling, how agents are responding and what is happening as a result: information that organisations can apply to business improvement.

For example, one communications company used next generation speech analytics to uncover a billing problem that was beginning to generate angry calls from customers – calls that agents were poorly equipped to handle since they weren’t aware of the issue.

With the information gleaned through speech analytics, the company was able to fix the underlying technical problem before it could affect more customers. The company was also able to train agents in a timely way on how to respond to calls about the problem, reducing the potential for impact on overall customer satisfaction.


Stage 3: The real story behind real time

Once you have a basic understanding of how speech analytics has progressed to this point, you can begin to make intelligent assessments of the solutions that are available to you and the capabilities they offer.

The next step is to look at what’s to come. In that light, perhaps the most talked-about development today is real-time speech analytics.

Real-time seems like the next logical frontier for speech analytics. After all, what could be better than being able to automatically detect customer dissatisfaction, for example, at the very moment that it occurs, and then have a supervisor respond on the spot?

However, I think that if you stop to consider all the implications of real-time speech analytics, you have to concede that it may not be such a good idea. For one thing, a supervisor stepping in on an alert can have the opposite effect to the one intended: the interruption removes any chance that the call would have ended well without intervention, and the disruption could be annoying enough to make a bad situation worse.

But an even greater problem with real-time approaches is that they do nothing to track what’s going on over time and identify trends that require a response. You may be able to step in and soothe an angry customer, but you won’t be able to see over time why customers are getting angry in the first place and make decisions about how to solve the problem.

In my view, using real-time analytics as a tool for customer satisfaction is rather like constantly swatting at flies instead of finding out where they’re coming from and eradicating them in one fell swoop.

Putting it all together

So what does the evolution of speech analytics have to teach you about how to use speech analytics solutions to your best advantage? I think the main lessons to take away are:

  • Avoid investing in early-stage technology that has been superseded by more-effective next-generation developments.
  • Look for next generation solutions that provide a complete picture of call content and what it means, and that don’t settle for mining just a sampling of calls.
  • Don’t blindly buy in to the latest trend – whether real-time analytics or something else – without doing due diligence on how it works and what it can or can’t do for you.

Follow these simple guidelines, and I believe you’ll make a decision about speech analytics for the call centre that you will be satisfied with both now and in the long term.

Jeff Gallino is CEO and co-founder of technology firm CallMiner
Tel: +1 239 689 6463
Website: www.callminer.com

Author: Jonty Pearce

Published On: 28th Jun 2007 - Last modified: 19th Dec 2018

1 Comment
  • Just a quick note to comment on what Jeff says about real-time speech analytics.

    I feel Jeff misses the point and value of real-time analytics. It’s not so much about flagging issues to a supervisor, so that they might ‘step in’. Far better that an agent is able to ‘self-police’ and correct any issues that occur whilst the call is still in progress.

    There is an abundance of research which clearly demonstrates that the more there is a ‘conversation’ between an agent and a customer, the better the call outcome will be. It’s important to have an interaction with the customer that doesn’t give the impression that an agent is simply reading from a script.

    So we can all agree that some script deviation is a good thing and that we should allow agents freedom to hold quality conversations. But the fact remains that some key things must (or should) be said in the call by the agent. Such ‘mandatory phrases’ are commonplace.

    “This call may be recorded for …….”
    “The interest rate can go down as well as up ……”
    “Your home may be at risk if you do not keep up payments ……”

    What real-time speech monitoring achieves is the ability to feed back to agents whether such things have been said (or not said) whilst the call is still in progress. Moreover, what the customer says can also be monitored in real time:

    “I’d like to speak to a supervisor….”
    “This is the sixth time I’ve called about this ….”
    “You’re not listening to me ….”
    and so on.

    The value and power of this should not be underestimated. Particularly in the world of outbound, call centres go to a lot of trouble and not inconsiderable expense to reach out to the customer. When they encounter a customer who can/will engage, then why allow for the chance of that agent not saying the right thing?

    Put simply, if there is anything going on in the dialogue that could be considered as the call ‘going wrong’ then by far the most effective way to deal with that is to put it right whilst the call is in progress. The best person to do that is the agent themselves. They just need the information feeding back to them, LIVE, about what they might have missed or said incorrectly.

    To correct these issues after the call is over is of course possible and many call centres do that through subsequent manual evaluation of the recordings. But that is clearly an ‘after the event’ task and the chances are that any attempt to put the mistakes right once the call has long finished will meet with minimal success.

    And why stop at monitoring just what the agents say? It is possible to monitor what is being said by the customer and map that to what the agent says, to be sure that the agent responds appropriately. Again, not after the event, but during the call.

    Where there are FSA and similar compliance issues at stake, the ability to make sure that no agent responds in the wrong way to a customer question is invaluable.

    For example, where a customer asks a question that can be considered to be asking for advice, then it can be mighty important that an agent does not give advice. They are required to give a specified response. With real-time monitoring, this can be flagged up (to the agent, the supervisor, whoever) whilst the call is in progress and the appropriate reply can be given, thus ensuring compliance.

    To simply scan the call after it is concluded can certainly flag that there was a problem, but how much effort will go into putting that right?


    Robert Denbeigh 25 Jun at 00:27