Nine things they don’t tell you about speech analytics


Most companies start in the wrong place by asking what they can do with speech analytics. What they should be asking is what they need to know about the conversations they’re having with their customers, what intelligence those conversations hold, and what the best way to get at it is.

Automated speech analytics may or may not be the answer.  We’ve been involved in speech analytics, automated and manual, for the last ten years and some of the things we’ve learned over that time are:

1. Don’t believe the hype

Speech analytics is not a panacea that will solve your business problems; it’s one of many tools that may or may not help you improve operationally.  Most of what the salesman tells you is a ‘version of the truth’ that bears little resemblance to what solutions can and cannot do. Our experience with clients in many different sectors leads us to believe that speech analytics is frequently oversold to the enterprise and, as a result, often fails to produce the intelligence that sponsors and potential consumers are desperate for. It’s often seen as a magic potion that will not just offer an 85%+ accurate transcript of all conversations, but will also provide understanding of the content of those conversations in ways not previously considered.  It won’t.

The ironic thing is that it doesn’t need to be “oversold” – if deployed and run intelligently and systematically, automated speech analytics can help drive performance improvement.

2. Emotions? Don’t be so STUPID

Emotion detection as it currently exists in these solutions is very simplistic and relies on a combination of words, absolute volume and volume change, amongst other things.  I will be bold and say that no current solution can detect emotion solely from the dynamics of the conversation.  Emotions can be detected by understanding the words and phrases that your customers use when they are cross or upset, and you might (or might not) be surprised by how much these differ across customer demographics.

3. It’s expensive (or can be)

The cost of deploying the hardware to support some solutions can be prohibitive, especially if you want to start talking real time.  Some vendors suggest it only becomes cost effective above 150 seats because of the initial set-up costs, but this can vary hugely depending on the technology deployed. Whole-conversation transcription solutions require far more processor power than phonemic solutions, for example.  Current prices for a 400-seat call centre would be in the region of £150,000, with about half of that being software and the balance being hardware and professional services. That’s just for the analytics capability and doesn’t include the underlying voice recording technology.

4. Inspiration vs. perspiration

What any technology does well is process large amounts of information in a relatively short period of time, and speech analytics is no different. That is exactly what it is good at: trawling through tens of thousands of conversations looking for examples that match the categories that have been built.  The skill in deploying analytics effectively lies in the categories or search phrases that you build. You need to understand the words and phrases that people (customers or agents) use in different conversations, and at the same time make the categories specific enough to return only the sort of calls you are looking for.  It won’t tell you anything that you haven’t asked it to look for.

5. People

Which is why you need people. People have to have the inspiration and the hunches to ask the questions, program the categories, interpret the results and work out what’s really happening.  Automation can help, but cannot ultimately replace, human intelligence, experience and understanding.

6. Metadata

Interestingly, there is a lot of intelligence that can be gleaned from the metadata that accompanies the calls.  Some very good speech analytics presentations that I’ve seen were entirely to do with the non-speech data that was available.  Looking at call duration, silences and ‘on-hold’ time, and matching this back to call/agent ID, can be very revealing.  For example, listening to calls that last over 150% of the average call time and mining them for content can highlight agent, customer, product and process problems that you may not be aware of.
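
The 150% filter mentioned above can be sketched in a few lines of Python. This is only an illustration, not any vendor’s method: the `CallRecord` fields, the `long_calls` helper and the sample values are all hypothetical, and a real recording platform will expose its metadata under different names.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class CallRecord:
    # Hypothetical metadata fields; real platforms will name these differently.
    call_id: str
    agent_id: str
    duration_s: float


def long_calls(calls, threshold=1.5):
    """Return the calls lasting more than `threshold` times the average duration."""
    if not calls:
        return []
    avg = mean(c.duration_s for c in calls)
    return [c for c in calls if c.duration_s > threshold * avg]


# Made-up sample data, for illustration only.
calls = [
    CallRecord("c1", "a1", 200),
    CallRecord("c2", "a1", 220),
    CallRecord("c3", "a2", 180),
    CallRecord("c4", "a2", 600),
]

flagged = long_calls(calls)
print([(c.call_id, c.agent_id) for c in flagged])  # prints [('c4', 'a2')]
```

The average here is 300 seconds, so only calls over 450 seconds are flagged. Those are the calls worth listening to first.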

7. Sampling theory is your friend

The power of sampling theory means you can get statistically robust intelligence from a much smaller sample than you might think. A properly random sample of 400 calls gives a margin of error of roughly ±5 percentage points at a 95% confidence level, almost irrespective of the size of the population. That means you don’t actually have to listen to that many calls, which is one of the weaknesses in the automated analytics sales spiel.
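
The arithmetic behind a figure like 400 can be checked with the standard formula for the margin of error of a proportion. The sketch below is an illustration under stated assumptions, not a full treatment: it assumes a simple random sample, a yes/no measure (for example, whether a call contains a complaint), the worst-case proportion of 0.5, and z = 1.96 for 95% confidence. The function names are hypothetical.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the normal-approximation confidence interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)


def required_sample_size(margin, p=0.5, z=1.96, population=None):
    """Sample size needed for a target margin of error.

    If `population` is given, the finite population correction is applied.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


print(round(margin_of_error(400), 3))                 # prints 0.049
print(required_sample_size(0.05))                     # prints 385
print(required_sample_size(0.05, population=10_000))  # prints 370
```

Under these assumptions the required sample barely changes with population size, but it grows quickly if you need a tighter margin: halving the margin roughly quadruples the sample.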

8. Accuracy rates

Quoted accuracy rates again aren’t always what they seem – 90% accuracy doesn’t mean that a solution is 90% accurate in spotting a word or phrase or that 9 times out of 10 it will spot that word.  What it means is that if you build up your categories carefully and accurately then 90% of the calls returned in that sample will be representative of what you were looking for. Ironically though, the more accurate the initial transcription/detection process that the solution uses, the easier it is to build those categories, so accuracy is still important and to be fair is getting better.
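
In information-retrieval terms, the figure described above is usually called precision: the share of returned calls that really belong in the category. Its counterpart, recall, is the share of all calls belonging in the category that the search actually found. A minimal Python sketch, using made-up call IDs and a hypothetical `precision_recall` helper, shows how a category can score 90% on one and far lower on the other.

```python
def precision_recall(returned_ids, relevant_ids):
    """Compute precision and recall for one category search.

    returned_ids: calls the category search returned.
    relevant_ids: calls that truly belong in the category (e.g. from manual listening).
    """
    returned = set(returned_ids)
    relevant = set(relevant_ids)
    true_positives = returned & relevant
    precision = len(true_positives) / len(returned) if returned else 0.0
    recall = len(true_positives) / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up example: the search returns calls 1..10, of which 1..9 are relevant.
# Another 51 relevant calls (IDs 100..150) exist but were never returned.
returned = range(1, 11)
relevant = list(range(1, 10)) + list(range(100, 151))

precision, recall = precision_recall(returned, relevant)
print(precision, recall)  # prints 0.9 0.15
```

Measuring recall requires knowing which calls are truly relevant, which in practice means manually reviewing a sample.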

9. Do you need it all the time?

One thing that speech analytics packages can do with some reliability is look at trends over time, but again it is important to realise exactly what is being ‘trended’, and this comes back to the accuracy and success of the category building at the outset.  If your initial search reports back rubbish then all subsequent searches will also report rubbish and your trends become useless. Think about getting a hosted or periodic analysis done. What are you going to gain from having 25% of all calls analysed all the time?

We firmly believe that the conversation between a customer and an agent is the single biggest driver of customer behaviour.  Understanding what it is about a conversation that makes customers behave more positively can have a huge impact on business performance if the analysis is robust and representative.

We’ve worked at different levels with most speech analytics vendors including Nexidia, Autonomy (via Baceone) and Verint, and other vendors to consider include NICE and CallMiner.  We believe that automated speech analytics definitely has a part to play, and that this part will become more significant as technology improves, but just be aware that we might not quite be there yet.

Duncan White

Duncan White is Managing Director of horizon2  – an independent analytics-based customer management consultancy.  Duncan has extensive commercial experience at board level in a number of industries and prior to founding horizon2 was Head of Analytics at Verint Consulting/CM Insight Ltd.

Author: Jo Robinson

Published On: 1st Jul 2009 - Last modified: 31st Mar 2023

  • At last, some honest common sense regarding the reality of what speech analytics can do. If the market continues to oversell, it will go the way of CRM!

    Alan Weaser 3 Jul at 11:28
  • Duncan,

    This is a good article except that #7 is wrong and #8 overlooks detection. It is important to realize that speech analytics is about accuracy and detection. Also, when determining sample size you need to account for the standard deviation.

    You point out in #8 that most speech analytics vendors don’t have the accuracy they claim. If you build your categories carefully you can have “90% accuracy”…. with 15% detection. So while 90% of the results will accurately detect the word or phrase it will not detect 85% of the instances that word/phrase actually happened in your calls.

    Further, the accuracy they claim is the best possible accuracy and does not tell us the variability (standard deviation) in that accuracy. For example, many systems do very poorly when dealing with accents or foreign languages.

    Given the variability in detection you need a much larger sample size if you want to have a 90% confidence interval. If you’re not going to do your sampling properly why do it at all?

    As you point out, it is important to have people that understand how to use Speech Analytics (and statistics). Do you want to be making organizational decisions based on bad data?

    Stats 101 22 Mar at 23:50
  • I’m in complete agreement with Stats101.

    Point #7 is a very dangerous comment and you should really brush up on your stats knowledge before quoting specific numbers.

    You completely miss the point that in order to achieve a 95% confidence limit you first need to understand the error rate, and this in turn defines the sample size required.

    You can’t just quote 400 calls and state this is suitable… this is completely misleading.

    Lies, damned lies and statistics!! 21 Jul at 12:53
  • Regarding point 9, as a company that services several clients we have purchased an analytics solution… the biggest benefit that we’ve seen has been when using the tool as a coaching aid – which requires a full time solution. By targeting agent behaviour and tailoring the coaching sessions to match the speech analytics results we have seen an improvement in customer satisfaction.

    Dylan 27 Sep at 16:05
  • I work in speech analytics with an application from one of the leading SA vendors.

    My experience with this package has been that it is a very complex software delivery in an emerging technology, and the design has not been adequately tested for user acceptance.

    The package we work with has no consistency in design--the operator controls are different in different parts of the application (one designer’s work doesn’t follow the same convention as another designer’s for another aspect of the package).

    Also, the complexity of the application apparently escapes system designers–for our situation, the install was estimated to require 11 servers. A year and more than three times that number of servers later, we are still hoping to achieve our requirements for acceptable storage and capture.

    All that said, there is great benefit to be gained from this technology. Just don’t expect the vendor to always be there to help you through the rough spots--committing to speech analytics will require more resources than you will expect, both in infrastructure and personnel.

    IntelliDude 6 Oct at 18:35
  • Hi Duncan, I have used speech analytics before and must say it is one of the best investments a company can make. It is key that you first choose the best, as there are some companies that need to do some more research before they can compete against the best! Secondly, trends are good, but changing behaviour is going to be the most important business driver speech analytics brings to the table! It will also help to share best practice and motivate staff to hit target. I have also seen the massive increase in revenue when speech analytics was implemented in one area of the business and, once it was removed, how quickly agents fell back into old habits!

    Mr all for speech analytics 22 Sep at 22:37