Fascinating story about an outbound IVR telemarketing system that definitely seems too good to be true – the suggestion that it uses humans to understand the speech input and select a pre-recorded ‘canned response’ seems the most likely explanation to me. The speech recognition seems too good at handling unexpected inputs, and the speech output is just too natural for real-time, on-the-fly generation. The fact that there seemed to be no ‘I am not a robot’ response in the recorded conversation in the article is also a giveaway.
This is a really clever use of the technology though – the ability to use agents who understand a language pretty well (but maybe speak it with an accent that detracts from the overall experience) to choose spoken responses is interesting. Now if they used industry-leading text-to-speech to generate the responses (maybe something like Nuance Vocalizer!) it could be even better and have the ability to handle the unexpected.
These hybrid systems seem to be becoming more prevalent at the moment: either robots handle the initial customer interactions and hand off to humans when they need to, or humans always work in real time in the background to ensure the quality of responses is maintained (as in this example). See this article as well: