Rory Cellan-Jones

The spinning of Spinvox

  • Rory Cellan-Jones
  • 23 Jul 09, 08:52 GMT

It's a great British technology success story, using brilliant voice-recognition software to decode your voicemail messages and turn them into text. It's growing rapidly, and will have over 100 million users by the end of this year, to the delight of its backers, who include the investment bank Goldman Sachs.

It's created hundreds of high-quality jobs in the UK and elsewhere and employed some world-leading computer scientists at Cambridge University. That, at least, was how I saw Spinvox until recently.

But in the last few days, I've been given a somewhat different picture - by one current employee, and several others who've worked for the company in recent years. Most significantly, they've told me that the central claim of the company - that it's getting machines to translate audio into text - doesn't really stand up, because most of the work is actually done in call centres dotted around the world.

I have some confidence in those claims because I've seen compelling evidence that my own voicemails have been transcribed by humans, not by machines. I have been using the service for quite some time - indeed I've found it very useful - and been moderately impressed by the way that it translates the often garbled messages left on my phone.

But I have now seen the transcripts of a number of my voicemails, including one from a major technology company promising a story, and another from the BBC's Occupational Health Department. What the logs of those messages appear to show is that the messages were sent to call centres, where workers then spent a minute or so transcribing them.

Still wishing to be convinced that it was people not machines listening to my messages, I tried another tactic. It was suggested to me that if I recorded a message and then sent it five times in a row to my mobile, then a computer would provide the same result every time.

Well, my message (which you can hear below) was deliberately stumbling and full of quite difficult words - including my rather tricky name. But every version that came back to me in text form was radically different - and pretty inaccurate.

So unless Spinvox is employing a whole lot of rather confused computers to listen and transcribe messages, it sounds like the job was being done by a variety of agents.
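The logic of that repeat test can be written down as a minimal sketch. It assumes a fully automated, deterministic system, which should return the same text for the same recording every time. The function names and the sample transcripts are hypothetical, made up purely for illustration; they are not the texts Spinvox sent back.

```python
from difflib import SequenceMatcher
from itertools import combinations


def looks_deterministic(transcripts):
    """True when every transcript of the same recording is identical."""
    return len(set(transcripts)) == 1


def mean_pairwise_similarity(transcripts):
    """Average difflib ratio over all pairs (1.0 means identical texts)."""
    pairs = list(combinations(transcripts, 2))
    if not pairs:
        return 1.0
    total = sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs)
    return total / len(pairs)


# Made-up placeholder transcripts, NOT the texts Spinvox actually returned.
received = [
    "hi its rory calling about the story",
    "hi its laurie calling about this tory",
    "hi its rory phoning about the story",
]

print(looks_deterministic(received))
print(mean_pairwise_similarity(received))
```

Differing transcripts rule out a purely deterministic machine, but they do not show how much of the work was done by people: a system that hands difficult audio to human agents would also produce varying results.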


Why does this matter? After all, Spinvox has always been clear that there is a human element in the work - though when it says it can call on "human experts for assistance", you might imagine Cambridge boffins rather than overseas call centre staff. But the fact that so much of its work still appears to rely on people simply listening and typing could have implications for its finances and its data security.

The information commissioner has asked the firm to explain why its entry on the Data Protection Register says no data will be transferred outside the European Economic Area.

Customers may ask whether they really want people listening in to private or commercially sensitive messages. The company insists all messages are encrypted and anonymised to ensure agents cannot identify the sender or recipient.

But it's the sheer cost of getting hundreds of people to do this work which is the biggest issue for the company. One of its PR men admitted to me that the basis of the business was that more and more of the work should be done by machines, rather than humans.

So part of the spinning of Spinvox has been true, in that it has enjoyed extraordinary growth by providing a service that many people have found very useful.

But is it a major technology success story, cracking the age-old problem of getting a machine to understand the human voice in all its glorious, cacophonous varieties? Not yet, it isn't.


  • Comment number 1.

    Is it any surprise that this is being done by foreign call centres?

    And I don't think it's disingenuous of them at all. Your example is easily explained as you sent it difficult words - each time the computer would have stumbled and asked for help, involving a human every time.

    I call 'entrapment'.

  • Comment number 2.

    Rory Cellan-Jones writes "..brilliant voice-recognition software to decode your voicemail messages and turn them into text."

    as you say, it's not yet that. personally, I hope the day never comes because then GCHQ and their likes will have the means to analyse all our conversations by default.

  • Comment number 3.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 4.

    Transcribing services have been available for years. These guys have just found a way to deskill it and export the job to make it inexpensive, or do I mean cheap?
    For a truly reliable service use a stenographer.

  • Comment number 5.

    This isn't about twitter? I'm confused.

  • Comment number 6.

    Interview: Daniel Doulton, Co-Founder, Spinvox: Carrier Plans May Need Further Funding (Jul 2)

    SpinVox Paying Staff In Stock To Save On Costs (Jul 13)

    Interview: Christina Domecq, CEO, Spinvox (Pt 1): Managing Through The Crunch (Jul 20)

    Interview: Christina Domecq, CEO, Spinvox (Pt 2): Cashflow-Positive In 90 Days (Jul 22)

  • Comment number 7.

    Spinvox service is brilliant - once you've tried it you will not want to live without it. Who cares if it's a machine or a human being - what kind of stupid story is this - "investigative journalism"???? phleeze

  • Comment number 8.

    Spinvox's rival Nuance is using exactly the same technology. I've heard that they pass the info to call centres in the US and India. Both Spinvox and Nuance are loss-making businesses, and I am somewhat surprised by the market and analyst reaction to this technology, including that of the BBC's own internet dragon Julie Meyer, who is flattering these loss-making companies, based on a failed technology. The future is speech recognition, no doubt.
    But speech recognition needs to go back to the labs and become a viable technology, something that Nuance, IBM and Spinvox have failed to deliver for the past couple of years.

  • Comment number 9.

    I suppose it's just one of those "use it for personal calls but not sensitive ones with your work phone" issues.

    I don't mind if some worker miles away earning minimum wage knows that someone, somewhere in the UK is meeting someone called "hi it's me" at a pub at 8pm tonight...

  • Comment number 10.

    what's the big surprise?
    there is no complete set of syntactic rules for any language.
    therefore, no computer can be programmed to "decipher" any spoken / written discourse.
    ...and that's before we start talking about accent, dialect, slang.
    rory "katherine" jones' test of multiple identical texts is a simple and 100% effective one.
    spinvox must definitively prove that there is absolutely no way that personal or commercially-sensitive information can in any way be identified.
    they have not done so.
    and their word -clearly- is not to be trusted.

    my friend sends messages about his hiv-diagnosis (including, occasionally, his name) to his contacts. he will definitely not be using spinvox to translate for him.

    thanks for listening...and -possibly- transcribing :-)

  • Comment number 11.

    I've always struggled a little with Spinvox, as the text based messages don't pick up on the emotion in a voicemail, which helps you prioritise and prepare.

    Ian Hendry
    CEO, WeCanDo.BIZ

  • Comment number 12.

    "A mass-scale, user-independent, device-independent, voice messaging system that converts unstructured voice messages into text for display on a screen is disclosed. The system comprises (i) computer implemented sub-systems and also (ii) a network connection to human operators providing transcription and quality control; the system being adapted to optimise the effectiveness of the human operators by further comprising 3 core sub-systems, namely (i) a pre-processing front end that determines an appropriate conversion strategy; (ii) one or more conversion resources; and (iii) a quality control sub-system."

    Which bit of that makes you think call centres aren't involved? [Unsuitable/Broken URL removed by Moderator] lists more. This was an open secret when they started transcribing Livejournal posts more than two years ago, and hasn't been hushed up at any time since.

  • Comment number 13.

    Wow! Some posters here are pretty complacent!
    Personally I wouldn't touch Spinvox with a bargepole now until/unless they come up with some better and less-vague answers, especially when x-ref with this related news article:

  • Comment number 14.

    Spinvox's website is clear that "... a combination of artificial intelligence, voice recognition and natural linguistics. But it also knows what it doesnt know and is able to call on human experts for assistance."

    Some firms whose employees had been using Spinvox (and similar services) have now warned staff that commercially confidential voicemails may be picked up and heard by humans outside the company, and that the service may not be appropriate, particularly if the people leaving messages aren't aware that their content may go outside the company and the mobile provider's automated voicemail system.

  • Comment number 15.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 16.

    great piece of reporting, why has evan davis not reported this in his license fee funded chats with the company's founder? when davis said on radio 4 today this morning that it did not matter whether it was a machine or a human what was he saying, that this isn't a story? let's see whether license fee payers are interested, well done rory.

  • Comment number 17.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 18.

    If a company tells the ICO (and their customers) that their data was not going abroad and in fact it is, then they are lying, and they are breaking the law. If their privacy policy happened to include similarly false claims, that would also be criminal, because they would be trading under false pretences. In fact there would be a whole host of offences - all very very serious ones. Any company guilty of those sorts of offences should be ashamed of themselves and ought to be pilloried and prosecuted. Of course none of these things will happen. That's the UK for you. The safest place in the world to commit Data Protection fraud.
    Secondly - how do you anonymise voicemail messages? You can do what you like with the "headers" but the body content of the messages themselves contain PII - "Hi there - this message is for Jim Bloggs if he's there, Fred Smith here, could he meet me at my house at 23 Acacia Gardens, Hammersmith, or ring me at 01234567890 please?"

    When are we going to wake up and DO something about protecting the data and privacy of ordinary citizens, and take this sort of thing seriously? The report on the Today programme (R4) this morning was a joke. They almost totally ignored the appalling breach of trust (and law) that would be involved if these allegations are true, and just twittered on to each other about how exciting the technology was. They missed the point entirely.

  • Comment number 19.

    The issue here isn't that it's insecure because humans are listening to your voicemail and texting it to you. Don't fall into that trap!

    Don't get me wrong, I'm not saying it's secure, just that the transcribers don't make it any less secure. SMS messages are encrypted before they leave your phone and get transmitted across the air. However, once they leave the air (i.e. once they hit the base station) they are transmitted in clear text across any number of networks, before finally being encrypted again and sent over the air to the destination phone.

    If you're using SMS to transmit business-critical information then some guy in an Indian sweatshop is the least of your worries. Your first port of call should be to crucify your CIO.

  • Comment number 20.

    This doesn't sound half as good as GrandCentral, now known as Google Voice. This uses machines to do ALL the transcription. In fact, I read somewhere that users can rate the quality of the transcriptions they get back, thereby helping to improve the service. To quote from the transcriptions page: "This is the only fully automated voicemail transcription on the market"

  • Comment number 21.

    Hello from the USA,

    I am the CEO of VoiceCloud. We are a US-based V2T company. I have been trying to tell the public the truth about how this service works for over a year and I am pleased that someone has taken the time to reveal the real facts behind this "technology". There is no magic box or machine to replace the human ear and there won't be for some time. We have always embraced the fact that the HPU (Human Processing Unit) is the most effective way to transcribe messages. We actually own the centers where the transcription is performed and have always been honest about it. The danger Spinvox faces is that they are not in direct control of their transcribers or their clients' sensitive information. This story reminds me of the scene in the Wizard of Oz when poor Dorothy and her friends find out they have been played. Unfortunately I don't think Spinvox's story is going to end as well.

    Please see my CNET interview in the link below, from over a year ago, where I discuss the truth behind the V2T industry.


    Gerald Marolda
    VoiceCloud, CEO

  • Comment number 22.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 23.

    RCJ: "Well, my message (which you can hear below) was deliberately stumbling and full of quite difficult words - including my rather tricky name."

    Spinvox: "... But it also knows what it doesnt know and is able to call on human experts for assistance."

    Rory, playing Devil's advocate, was this a fair test? If you purposely obscured the voicemail, does it not almost guarantee that a human operator will be required?

  • Comment number 24.

    We need a service that translates Twitter messages into voicemail. Why? So we can have a blog post about it.

  • Comment number 25.

    So Spinvox's staff are pretty efficient at mangling voicemail messages...

    I wonder how well off-the-shelf speech recognition software (e.g. Dragon) would cope? Perhaps someone could organise a head-to-head between the EAL (English as an Additional Language) humans employed by Spinvox and the computerised speech recognition software, to see which is the most efficient at transcribing voicemails (or, perhaps more to the point, which method mangles the input the least...)

  • Comment number 26.

    I am not surprised. I have friends who worked for the call centre in South Africa and after a while realised that it was all a joke. Talk about misleading the public. Call centre agents were recruited by an external company to help with this farce.

  • Comment number 27.

    Looks like another "innovative" tech company based on deception is about to bite the dust...

  • Comment number 28.

    The people who should really be concerned are the investors who poured millions into Spinvox. Instead of investing in technology, they have been paying the operational costs of all the call centres hosting the services. So basically if the whole thing folds they will be left with nothing. I hope that they will take the Spinvox management to task for this deception.

  • Comment number 29.

    Whoa! Wait a second! Goldman Sachs is backing a company that fibs?!?!?? That's unbelievable! I mean, GS is like, the most reputable institution on Wall Street!

  • Comment number 30.

    What's all the fuss about? Spinvox is an asset in my life and human intervention is just part of the quest to provide a top quality service. I am not so naive as to think that my banking details, online credit card operations and email are handled by an automated system. How then could you explain the arrest of so many online delinquents if a human were not involved?

