Most companies start in the wrong place and ask what they can do with speech analytics. What they should be asking, of course, is what they need to know about the conversations they’re having with their customers, what intelligence those conversations contain and what the best way to get at it is.
Automated speech analytics may or may not be the answer. We’ve been involved in speech analytics, automated and manual, for the last ten years and some of the things we’ve learned over that time are:
1. Don’t believe the hype
Speech analytics is not a panacea that will solve your business problems; it’s one of many tools that may or may not help you improve operationally. Most of what the salesman will tell you is a ‘version of the truth’ and bears little resemblance to what solutions can or cannot do. Our experience with clients in many different sectors leads us to believe that speech analytics is frequently oversold to the enterprise and, as a result, fails to produce the intelligence that sponsors and potential consumers are desperate for. It’s often seen as a magic potion that will not just offer an 85%+ accurate transcript of all conversations, but will also provide understanding of the content of those conversations in ways not previously considered. It won’t.
The ironic thing is that it doesn’t need to be “oversold” – if deployed and run intelligently and systematically, automated speech analytics can help drive performance improvement.
2. Emotions? Don’t be so STUPID
Emotion detection as it currently exists in these solutions is very simplistic and relies on a combination of words, absolute volume and volume change, amongst other things. I will be bold and say that no current solution has the ability to detect emotion solely from the dynamics of the conversation. Emotions can be detected through understanding the words and phrases that your customers use when they are cross or upset, and you might (or might not) be surprised by how much this can differ across customer demographics.
3. It’s expensive (or can be)
The cost of deploying the hardware to support some solutions can be prohibitive, especially if you want to start talking about real-time analysis. Some vendors suggest it only becomes cost effective above 150 seats because of the initial set-up costs, but this can vary hugely depending on the technology deployed. Whole-conversation transcription solutions require far more processor power than phonemic solutions, for example. Current prices for a 400-seat call centre would be in the region of £150,000, with about half of that being software and the balance hardware and professional services. That’s just for the analytics capability and doesn’t include the underlying voice recording technology.
4. Inspiration vs. perspiration
What any technology does well is process large amounts of information in a relatively short period of time, and speech analytics is no different. What it is good at is exactly that: trawling through tens of thousands of conversations looking for examples that match the categories that have been built. The skill in deploying analytics effectively lies in the categories or search phrases that you build. That means understanding the words and phrases that people (customers or agents) use in different conversations, whilst at the same time making the categories specific enough to return only the sort of calls you are looking for. It won’t tell you anything that you haven’t asked it to look for.
Which is why you need people. People have to have the inspiration, have the hunches, to ask the questions, program the categories, interpret the results and work out what’s really happening. Automation can help, but not ultimately replace, human intelligence, experience and understanding.
Interestingly, there is a lot of intelligence that can be gleaned from the metadata that accompanies the calls. There are some very good speech analytics presentations that I’ve seen which are entirely to do with the non-speech data that was available. Looking at call duration and silences and ‘on-hold’ time and matching this back to call/agent ID can be very revealing. Listening to calls, for example, that last over 150% of average call time and mining these for content can highlight agent, customer, product and process problems that you may not be aware of.
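The 150% rule of thumb above can be sketched in a few lines of Python. This is only an illustration under assumed names: the record fields (`call_id`, `agent_id`, `duration_s`), the `long_calls` helper and the sample figures are all hypothetical and not taken from any vendor’s export format.

```python
from collections import Counter
from statistics import mean


def long_calls(calls, threshold=1.5):
    """Return the calls that last longer than `threshold` x the average duration.

    `calls` is a list of dicts with hypothetical keys
    'call_id', 'agent_id' and 'duration_s' (duration in seconds).
    """
    if not calls:
        return []
    average = mean(call["duration_s"] for call in calls)
    return [call for call in calls if call["duration_s"] > threshold * average]


# Made-up metadata for four calls (illustrative values only).
calls = [
    {"call_id": "c1", "agent_id": "a1", "duration_s": 200},
    {"call_id": "c2", "agent_id": "a2", "duration_s": 240},
    {"call_id": "c3", "agent_id": "a1", "duration_s": 700},
    {"call_id": "c4", "agent_id": "a3", "duration_s": 260},
]

flagged = long_calls(calls)

# Count flagged calls per agent to see where the long calls cluster.
flagged_by_agent = Counter(call["agent_id"] for call in flagged)
```

The flagged calls are the shortlist a person would then listen to; the code only narrows down where to look, it does not explain why the calls ran long.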
7. Sampling theory is your friend
The power of sampling theory means you can get statistically robust intelligence from a much smaller sample than you might think. A properly random sample of around 400 calls gives a margin of error of roughly ±5 percentage points at a 95% confidence level, almost irrespective of the size of the population, meaning you actually don’t have to listen to that many calls: one of the weaknesses in the automated analytics sales spiel.
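The figure of 400 calls can be checked with the standard margin-of-error formula for a proportion estimated from a simple random sample. The sketch below assumes the worst case of a 50/50 split (p = 0.5) and a z-value of 1.96 for 95% confidence; the function name and its parameters are illustrative choices, not part of any analytics product.

```python
import math


def margin_of_error(n, p=0.5, z=1.96, population=None):
    """Margin of error for a proportion from a simple random sample of size n.

    p          -- assumed true proportion (0.5 is the worst case)
    z          -- z-value for the confidence level (1.96 for 95%)
    population -- optional population size; if given (and larger than n),
                  the finite population correction is applied
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    if population is not None:
        standard_error *= math.sqrt((population - n) / (population - 1))
    return z * standard_error


# 400 calls, population size ignored: about 0.049, i.e. 4.9 percentage points.
print(round(margin_of_error(400), 4))

# The same sample drawn from 10,000 calls and from 1,000,000 calls.
print(round(margin_of_error(400, population=10_000), 4))
print(round(margin_of_error(400, population=1_000_000), 4))
```

The population size barely moves the result, which is the point being made: what matters is that the sample is genuinely random, not that it is a large share of all calls.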
8. Accuracy rates
Quoted accuracy rates again aren’t always what they seem – 90% accuracy doesn’t mean that a solution is 90% accurate in spotting a word or phrase or that 9 times out of 10 it will spot that word. What it means is that if you build up your categories carefully and accurately then 90% of the calls returned in that sample will be representative of what you were looking for. Ironically though, the more accurate the initial transcription/detection process that the solution uses, the easier it is to build those categories, so accuracy is still important and to be fair is getting better.
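The distinction drawn above is what is usually called precision (how many of the returned calls are right) as opposed to recall (how many of the right calls were returned). A minimal sketch with made-up call IDs shows how a category can be “90% accurate” in the first sense while still missing most of the calls you care about; the function and the data are hypothetical.

```python
def precision_and_recall(returned, relevant):
    """Compare the calls a category returned with the calls that truly belong to it.

    precision -- share of returned calls that are relevant
    recall    -- share of relevant calls that were returned
    """
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative data: the category returns 10 calls, 9 of which are genuine,
# but 30 calls in the sample were genuinely of that type.
relevant_calls = [f"call-{i}" for i in range(30)]        # call-0 .. call-29
returned_calls = relevant_calls[:9] + ["call-other"]     # 9 genuine + 1 wrong

precision, recall = precision_and_recall(returned_calls, relevant_calls)
```

Here precision is 9 out of 10 while recall is only 9 out of 30, so it is worth asking a vendor which of the two a quoted accuracy figure refers to.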
9. Do you need it all the time?
One thing that speech analytics packages can do with some reliability is look at trends over time, but again it is important to realise what is actually being ‘trended’, and again this comes back to the accuracy and success of the category building at the outset. If your initial search reports back rubbish then all subsequent searches will also report rubbish and your trends become useless. Think about getting a hosted or periodic analysis done – what are you going to gain from having 25% of all calls analysed all the time?
We firmly believe that the conversation between a customer and an agent is the single biggest driver of customer behaviour. Understanding what it is about a conversation that makes customers behave more positively can have a huge impact on business performance if the analysis is robust and representative.
We’ve worked at different levels with most speech analytics vendors, including Nexidia, Autonomy (via Baceone) and Verint; other vendors to consider include NICE and CallMiner. We believe that automated speech analytics definitely has a part to play, and that that part will get more significant as technology improves, but just be aware that we might not quite be there yet.
Duncan White is Managing Director of horizon2 – an independent analytics-based customer management consultancy. Duncan has extensive commercial experience at board level in a number of industries and prior to founding horizon2 was Head of Analytics at Verint Consulting/CM Insight Ltd.