Paul Weald shows us how the customer experience, application design and technology implementation all need to be aligned for speech self-service to be accepted.
We all understand that speech recognition can have a dramatic impact on call centre resourcing needs. Consider an operation with an average call duration of five minutes, where the first part of the call is spent verifying the caller’s identity. This is done by collecting structured information – for example name, address, postcode and date of birth.
This combination of verification data is well suited to automation by a speech application, and is common to many different types of call. Once the call purpose has been identified, the call can be routed as appropriate. If this application saved just 30 seconds of the agent’s call-handling time, there would be a 10% reduction in resource needs.
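The arithmetic behind that 10% figure can be sketched in a few lines, using the call duration and saving quoted above:

```python
# Resourcing arithmetic from the example above: 30 seconds automated
# out of a 5-minute average call.
avg_call_secs = 5 * 60   # average call duration
automated_secs = 30      # identity verification shifted to the speech app

saving = automated_secs / avg_call_secs
print(f"Reduction in agent handling time: {saving:.0%}")  # prints: Reduction in agent handling time: 10%
```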
So why haven’t speech applications taken off?
The key question to ask is: will the customer choose to use the self-service application as a genuine alternative to talking to an agent? The key failings in the self-service customer experience arise from:
- not being able to complete the task at hand – book the ticket, amend the reservation, check the status of the order, etc.
- the call taking considerably longer through self-service than it would when talking to an agent
- perceiving that the organisation does not value your business – always pushing the customer towards the ‘cheapest’ channel for the organisation to operate.
A good reference point for understanding the customer experience of telephone self-service is to benchmark the telephone activity against an equivalent process online. Identify equivalent KPIs – such as conversion rate and session time – and then compare performance both online and through telephone self-service.
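As a sketch, the channel comparison might look like this – the channel names and all figures here are invented purely for illustration:

```python
# Toy benchmark of the two channels on the KPIs mentioned above:
# conversion rate and average session time.
channels = {
    "online":     {"sessions": 1000, "completed": 620, "total_secs": 180_000},
    "ivr_speech": {"sessions": 1000, "completed": 450, "total_secs": 240_000},
}

for name, kpi in channels.items():
    conversion = kpi["completed"] / kpi["sessions"]
    avg_session = kpi["total_secs"] / kpi["sessions"]
    print(f"{name}: conversion {conversion:.0%}, avg session {avg_session:.0f}s")
```

A gap between the two rows on either KPI points at where the telephone self-service experience is falling short of its online equivalent.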
Studies have shown that customers can only easily navigate a menu with three options; above this number, confusion begins to set in. IVR designers therefore have to limit the number of services offered to customers, otherwise the customer experience degrades. Offering more than two layers of menu will also confuse the customer – they may end up several layers down and realise they have taken the wrong route, a bit like being lost in a maze. Speech recognition applications instead model the different ways customers request the services you offer, so rather than multiple layers you need only one, leaving customers free to ask directly for what they want: “I’d like to check my balance”, “Could you send me a new PIN”, etc.
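A single-layer design of this kind can be sketched as a toy intent router – the intent names and keyword lists below are hypothetical, and a real system would use a tuned statistical grammar rather than keyword matching:

```python
# Minimal sketch: one open prompt, many intents, no menu layers.
INTENTS = {
    "check_balance": ["balance"],
    "new_pin":       ["pin"],
    "order_status":  ["order", "status"],
}

def route(utterance: str) -> str:
    """Map a caller's open request straight to a service, or to a person."""
    words = utterance.lower().split()
    for intent, keywords in INTENTS.items():
        if any(k in words for k in keywords):
            return intent
    return "transfer_to_agent"   # fall back to a human when nothing matches

print(route("I'd like to check my balance"))   # check_balance
print(route("Could you send me a new PIN"))    # new_pin
```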
With telephone self-service, the usual test of recognition accuracy is that better than 95% of users complete their transactions, or convey their information accurately. For example, consider an application that needs to identify a user’s postal address. The usual approach in a call centre is for an agent to collect the postcode and house number from the caller, and then use postcode look-up software to find the full address. Speech recognition applications can increase the accuracy of their postcode recognition by asking the caller a supplementary question, such as their road name. This means the automated application can combine its interpretation of the alphanumeric postcode (for instance, saying “RG40” might variously be interpreted by the technology as “RG40”, “RT40” or “RG14”) with a road name (say, “Castle Road”) to produce a unique match with one of the three potential postcodes. Hence, the system would respond with something like, “I think your address is 17 Castle Road, Wokingham. Is that correct?”
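The disambiguation step can be sketched as follows – the look-up table is a toy stand-in for real postcode look-up software, and the addresses are invented:

```python
# Combine the recogniser's candidate postcodes with the spoken road name;
# only one pairing survives, so the ambiguity is resolved.
ADDRESS_DB = {   # hypothetical postcode look-up data
    ("RG40", "CASTLE ROAD"): "17 Castle Road, Wokingham",
    ("RT40", "MILL LANE"):   "2 Mill Lane, Omagh",
    ("RG14", "OAK AVENUE"):  "5 Oak Avenue, Newbury",
}

def resolve(candidate_postcodes, road_name):
    """Return the unique address match, or None to re-prompt the caller."""
    key_road = road_name.upper()
    matches = [ADDRESS_DB[(pc, key_road)]
               for pc in candidate_postcodes if (pc, key_road) in ADDRESS_DB]
    return matches[0] if len(matches) == 1 else None

address = resolve(["RG40", "RT40", "RG14"], "Castle Road")
print(f"I think your address is {address}. Is that correct?")
```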
One other subtle factor is for the application to train the user to provide their information consistently. Consider a financial application which has to capture fields with a nil amount. Think for a moment about all the different ways a caller could describe this – “nil”, “none”, “zero”, “naught”, “nothing”, “not applicable”, etc. The best way to drive up the accuracy of the application is not to invest time and money in developing it to accept such a range of inputs, but rather to train the user in how to state the ‘nil’ input.
One client that we worked with solved this problem by playing the caller the following message as part of the introduction to the application: “When you tell us an amount of money, we need you to say numbers naturally – for example, two hundred and fifty pounds. If the amount to give is 0, say zero. After you have given us an amount of money, we will repeat it back to you. If we have got it wrong, just say NO.” A very high level of accuracy resulted, to the mutual benefit of both callers and application designers!
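The repeat-back-and-confirm pattern in that prompt can be sketched as a small loop – the recogniser is stubbed out here as a scripted list of caller turns:

```python
# Minimal sketch of the confirm loop: read the amount back to the caller
# and retry whenever the caller says NO.
def capture_amount(caller_turns):
    turns = iter(caller_turns)
    while True:
        heard = next(turns)         # what the recogniser made of the amount
        print(f"You said {heard} pounds. If we have got it wrong, just say NO.")
        confirmation = next(turns)  # caller's reply to the read-back
        if confirmation.strip().upper() != "NO":
            return heard

# Caller asks for 250, the system mishears 215, the caller says NO,
# and the second attempt is confirmed.
amount = capture_amount(["215", "NO", "250", "yes"])
print(amount)
```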
The technology doesn’t work!
Let’s break this issue down into its component parts:
- does the application understand the caller right first time?
- does the application give the caller what they require and is human help available immediately if needed?
With accents playing such an important role in understanding the speaker, it is important to select the right recognition engine for the task in hand and, if necessary, to ensure the speech application developer tunes the system to take account of the regional spread of your customers. This is achieved through sampling literally thousands of conversations. It is a task for experts, and you need to be sure that you have selected a provider with this depth of experience.

Also, if the user is struggling to complete the task at hand, it is important that they can transfer quickly to an agent. The technology solution should inform the agent (through either a CTI screen pop or an ‘agent whisper’ feature) of the steps the user has taken within the application. This is particularly important if the identity of the user has already been verified by the application. There is nothing more frustrating to the customer than having to explain to the agent both who they are and what they have ‘failed’ to do within the self-service application.
The appropriate role for self-service
The good news is that there are some well-proven areas where self-service really does overcome these issues:
- Peak volume transactions – For example, think of a customer calling to place a bet on the Grand National. Inevitably, there is a huge increase in call volumes on the day of the race itself, particularly from many once-a-year gamblers. As the start time draws nearer, each bookmaker faces a dilemma over how to handle this spike in demand. Do they provide hundreds of agents to take these mounting calls, or is there a better way, using natural language speech recognition, to respond and take bets? The answer is an increasing use of speech applications that can identify the caller (their address and contact details), take and confirm their bet and process their payment – all of which reduces wait times for callers, as they no longer have to queue for an agent to become free. For the customer the transaction is time critical – they want to place their bet as quickly as possible, and self-service offers the shortest waiting time;
- End-of-call customer surveys – once a call with an agent is completed, a speech application can be used to collect feedback from the caller about the organisation or process they have just been through. Where the purpose of the questionnaire is to collect information about the agent who handled the call, a speech application can be a good way to gather feedback the caller would be uncomfortable giving to the agent themselves. Equally, having agents spend an additional 30 seconds on the end of each call is potentially prohibitive from a resource perspective.
Like the alignment that produces a lunar eclipse, getting customer experience, application design and technology implementation all lined up is not yet an everyday occurrence – and perhaps this explains why so few organisations are enjoying the cost-saving benefits that automated speech technology can offer.
Paul Weald is director of the consultancy RXPerience Limited