Rory Cellan-Jones

Spinvox sends a message

  • Rory Cellan-Jones
  • 24 Jul 09, 11:33 GMT

Our story yesterday about Spinvox - and the fact that much of its work in transcribing voicemails is done by humans rather than machines - made some waves. A number of users who had imagined that the service was all about clever software seemed mildly shocked and surprised to find out that they were often being read by call centre workers in the Philippines or India.

Others in the technology world said it had been known for years so what was the big deal. But Spinvox took the whole issue very seriously. Last night it put up a blog post describing my story as "incorrect and inaccurate" - but the company also invited me on a tour of its headquarters with the chance to see the technology behind its Voice Message Conversion System.

Christina Domecq, the company's co-founder, came on the radio, talking to PM's Eddie Mair. He repeatedly pressed her on our central claim, that the majority of calls were read by humans rather than machines, but she maintained the company's line that this was a complex matter with no simple answer - the system learned as it went along, so that all of her messages, for instance, would be read by machine.

Later, she told the Guardian "The ratio of humans to messages and humans to number of users is very, very low." and "The majority of calls are fully automated."

Well I can only repeat what I've been told - and seen evidence to substantiate - that the majority of calls are in fact heard and transcribed by staff in call centres.

Christina Domecq also explained to the Guardian that as Spinvox ramped up from 30 to 100 million customers worldwide, it would be simply impossible to get human beings to do the job. That doesn't quite mesh with this quote from an interview with the paidcontent site, which has written extensively about Spinvox: "When we're going through massive growth, like we are now, we need more agents," she told the site in a lengthy interview. "A lot of Latin American dialects are new for us".

So what are they saying? That massive growth means the machines have to do most of the work - or that Spinvox has to recruit more call centre staff?

Late yesterday, someone pointed me to an advert on a site called, where Spinvox appears to be seeking tenders for new call centre operations. It says that Spinvox "is currently in need of some significant support with our voice-to-text transcription services."

It outlines the nature of the work and then concludes:

"We would initially require you to provide us with c.50 agent workstations 24/7 for a 3 month trial, which if successful would lead to a 2/3 year long term commercial deal with significant ramp-up of agent resource numbers."

No very obvious sign there of a rapid move towards full automation.

And people I've spoken to in the speech recognition industry over recent days are largely of the view that Spinvox has set itself an impossible task. One firm also trying to provide a voicemail conversion service told me that if the company could really achieve the full automation to which Christina Domecq aspires, "it would be making money hand over fist." Its latest reported figures show it's a long way from that.

And Ian Turner, European MD of Nuance, the firm behind the Dragon brand of speech recognition products, told me a bit about his business. He explained a system where doctors dictate patient notes and prescriptions at high speed, and the computer-generated text is checked by a medical secretary.

That delivered very high levels of accuracy, he explained, but converting into text lots of different voices shouting down phone lines in different accents and languages was a much greater challenge: "This is serious deep engineering to build this stuff which takes years." And he said he'd seen no evidence that Spinvox were ahead of the game in this area.

Mr Turner also said that companies engaged in this work needed to be careful about data storage and transparent about their methods: "You have to be honest about this... in the current climate about data privacy, being transparent is absolutely critical."

By the way, I've written back to Spinvox accepting their kind invitation to come and see their systems close-up. But I've said the BBC would also like to see their overseas call centres. I'll let you know how they respond.


  • Comment number 1.

    The advert is from September 2007 (scroll down and look at the tenders). I'm not sure it indicates *anything* significant about recent trends towards or away from call centres, does it? A lot changes in technology in two years.

  • Comment number 2.

    This is interesting ... so Vonage is/was using spinvox?

    iphone screenshot from a vonage user .... voice to text transcriber in pakistan.

  • Comment number 3.

    As a Spinvox user for about a year now, the more I think about it, the more I believe you might be able to tell which calls are handled electronically and which are transcribed by hand: some messages come through within about a minute and others can take about five, which might be the difference between machine and human.

    If Christina Domecq's claim of increasing users from 30M to 100M in fairly short order is true, and there is no reason to question it, is this not a case of the tech not coping with demand and Spinvox being proactive to ensure no loss of service?

    That said, does it really matter if human transcription is required due to volume of calls or as part of the learning process, as long as Data Protection rules etc are being adhered to?

  • Comment number 4.

    This whole story sounds very like the showman who exhibited a fabulous 'talking robot', only for it to be revealed as being linked to a person in another room.

    Having used leading voice recognition software, which is excellent only after careful and lengthy training, this whole service had to be too good to be true. I'm afraid that any service which offers remote processing of data - backups, etc. included - makes one extremely vulnerable to data insecurity.

    If the big credit card companies can't keep data secure 100% of the time, how can people seriously trust a start-up such as Spinvox? Its name seems pretty apt to me!

  • Comment number 5.

    Rory, if you're going on a tour of their headquarters, I recommend taking a couple of techies with you. Someone familiar with call-centre technology and another who's familiar with internet technology and general programming etc. would be extremely helpful. If their "brain" is so secret, you're extremely unlikely to see anything of any use unless you're able to press very hard, and from a technical angle.

  • Comment number 6.

    Rory, good luck in getting an invite to Spinvox's overseas call centres. Don't forget to pack the suncream and swimming trunks, as it'll presumably be on BBC expenses :-)

  • Comment number 7.

    I'm unsure if everyone isn't missing the point. I'm not a Spinvox user but I have looked at it. Can they take a voicemail, convert it and text it to me? If yes then I don't really care how they do it as long as the price stays the same.

    Granted, it would be nice to know there is a shiny piece of kit doing all the work and they should have been more open about offshoring but again, I don't really care as long as it does what it says on the box.

  • Comment number 8.

    I'm surprised if people just don't care if it's cheap third world labour transcribing their voicemails - isn't there an ethical issue or two here?
    Most subscribers reckon they're paying for the use of a clever software engine, automatically converting their messages to text.
    I don't imagine Spinvox raised its $200 million of venture capital by suggesting it would be third world call centres that was the secret of their success - they were claiming some kind of breakthrough in speech recognition software, although with the proviso that some human input was required for difficult messages.
    If even half the posts on other websites (see PaidContent) are true, especially from those claiming to be former employees, Spinvox looks more and more like a software version of the Mechanical Turk.
    It's a fascinating story though - and good luck to Rory in getting to the truth. I agree he should go to the company's HQ armed with absolutely the right technical questions to ask - and get a good look at how their voice to text engine is supposed to work.
    This is a link to a patent application for the "system":
    This is a link to the comments following an interview with the Spinvox CEO:

  • Comment number 9.

    Where is the story here? UK customers of Spinvox only have their voice mail interpreted by UK based staff. No breach here.
    Users being supplied the Spinvox service via a non-UK mobile network are covered under their national data protection laws, unless we are trying to apply UK law extraterritorially. So no breach.
    It's pretty obvious that to provide a foreign language service for a foreign network may require a foreign call centre. So no story.
    As to the technology, voice recognition has always been a bit poor at best, so it may be valid to criticise Spinvox's performance but at least they try to correct mistakes.
    There is a DotCom look about the financial performance but at least they made some cash, though it may take a while for their big deals to generate revenue.

  • Comment number 10.

    Rory, You may want to take a good look at their finances while you are visiting their HQ in Marlow. I know of one supplier that is struggling to get them to pay for what they have been provided.

  • Comment number 11.

    timbelfall (#9): Where did you get the "UK call centres for UK callers" from? In the previous blog -- link at the top of this page -- we read "Claims to the BBC suggest that the majority of messages have been heard and transcribed by call centre staff in South Africa and the Philippines", with no mention of country differentiation. UK call centres aren't mentioned in the SpinVox blog either, even though doing so would nail the Data Protection arguments.

  • Comment number 12.

    Rory, this is an opportunity not to be missed! While you're there can you do your readership a big favour?

    Christina has said the "majority of calls are fully automated" - so we know it's more than 50%. Can you ask her please if this includes hangups, i.e. when someone calls and hangs up without saying anything?

    I happen to know the answer is yes. They're not going to give you figures on what proportion this is, but it might be worth asking if the "majority" statement still stands for calls that aren't just hangups. Because you'd agree that's the interesting question: how many calls *that require transcription* are fully automated.

  • Comment number 13.

    To train a speech recognition system, are callers identified and stored, along with their voice information, on a system? Is this privacy debate not getting creepy now, especially as callers are not opting into this being done with their voice?

  • Comment number 14.

  • Comment number 15.

    I worked for a law firm quite a while back.
    We had a client interested in a very similar service that pre-dated Spinvox.

    We found that virtually all of the calls were handled by human intervention. It basically used offshore locations to keep costs down, but was promoted as a magical automated service, with a human element purely as a back-up for very difficult calls.

    My only comment is that if it is mainly automated, they would have been bought very quickly by a much larger company for a very large sum of money.

  • Comment number 16.

    If it was automated, then it wouldn't cost 35 cents per 30 seconds of audio to be transcribed. See their API for more info.

  • Comment number 17.

    See for links to all articles and discussions around this breaking story.

