BBC BLOGS - dot.Rory

Spinvox investors got £600 from Nuance deal

Rory Cellan-Jones | 11:12 UK time, Tuesday, 12 January 2010

A couple of weeks ago, the American speech-recognition firm Nuance paid $102.5m - around £64m - for the British voice-to-text firm Spinvox, and we speculated that most of the investors would get very little from the deal. Now I've found out just how little.

The people who had poured over £100m into a company which claimed that it had world-beating technology ended up with just £600. No, not £600 each - that was spread among the owners of 5.3 million ordinary shares and 1.9 million A shares.

Last week, employees who owned stock options received a letter from Spinvox giving them the sad news about the value of their stake in the business following the sale. Here's an extract:

"Under the Transaction, a total of around £600 was paid for all the shares, with holders of A shares being paid in priority to ordinary shareholders. Ordinary shareholders received no payment for the transfer of their shares and therefore any of the options that you held under the above Plans were "underwater" immediately prior to the Transaction, that is, the exercise price of the options was higher than the value of the shares that you would acquire on exercise of the option. This means you would not have realised any financial value from your options at the time of the Transaction."

The likes of Carphone Warehouse and Goldman Sachs have probably not even bothered to collect their few pounds. As for Spinvox's co-founders Christina Domecq and Daniel Doulton, they also appear to have got nothing. So who got that £64m - or at least the £40m or so that was in cash rather than shares?

The answer is that it went to pay off the emergency loans that Spinvox received as it struggled to stay afloat last summer. It's quite the most spectacular destruction of shareholder value I've seen since, ooh, the dot.com collapse of 2000.

I've also learned something interesting about Nuance - just like Spinvox, it relies on human beings to convert voice messages into text. For its healthcare products, which transcribe voice notes made by doctors, Nuance uses FocusMT, a company based in Bangalore, India.

It bought the Indian company in 2007, and says it has since diversified its business into all its speech-related activities, including voicemail transcription.

When I got in touch with Nuance, the company stressed that it had always been open about the involvement of humans: "We have call centres based in India and we've talked about that for years," said John Pollard from Nuance.

But he insisted that there was a high and increasing level of automation, with a large number of messages that did not need quality checking by human agents.

Nuance is currently looking at the call centres used by Spinvox around the world, and working out which of them it will need to keep on.

There are data protection issues here for Spinvox customers. Not only do all their existing messages now reside with Nuance - though I'm told that data will not be taken off to the United States - but their new messages could be going over to India for transcription.

The Information Commissioner's office tells me this is OK, as long as Nuance tells customers about it.

Spinvox insiders tell me the firm had made rapid progress in the last few months towards automation. But it still appeared to be losing money hand over fist - otherwise it would not have had to seek a sale to Nuance on such poor terms.

The question now is whether a combination of the British and American technology can produce a voicemail-to-text business that actually makes money.

Comments

  • Comment number 1.

    These companies with their voicemail->text services provide the perfect example of a solution looking for a problem. Now if they can expand into interactive media, and provide subtitles on-the-fly for online videos, music, and other voice content, that MIGHT be worth more than £600.

  • Comment number 2.

    I don't think there is anything magical about the collapse of Spinvox. It's the same story as many a year ago, where investors piled in the cash with the hope of a return which just didn't materialise. I feel sorry for the staff, who must have been very undecided about whether to jump ship or take some of their salary in shares. I doubt this will be the last 'burst' in 2010; it's more a question of who's next.

  • Comment number 3.

    Nuance have been working on speech technologies (both speech recognition and realistic speech synthesis) for more years than I care to remember, and they have bought up numerous promising technology companies over the years. If after all that, they still need humans to transcribe speech, then we are indeed looking at a very difficult problem to solve with computer science. Perhaps it cannot even be done?

    There is a lot of interest in the idea, but of course it has to be done at the right price, and I can't believe that even off-shore call centres can fit the bill. Both GoogleVoice and (more recently) Ribbit Mobile have been trialling their own speech-to-text software. It's too early to comment on Ribbit, but I know from experience that GV is a bit ropey. Better than nothing at all though, and to be honest I'd rather have a ropey translation done by a 'bot than have all my voicemails listened to by anonymous humans.

  • Comment number 4.

    Speech recognition is possible with publicly available software and a reasonable computer like a 3-year-old notebook.

    http://htk.eng.cam.ac.uk/ is a package for producing the models of speech sound sequences that can be used to recognize words.

    The difficulties are :

    - doing it very accurately, i.e. 95%+, which means that you get 1 word in 20 wrong ... is this acceptable? No no no... one word in about 80 is probably fine (a mist steak every 3 messages is probably not too annoying) so you have to aim for 98%, those last 3% are going to be hard hard hard.

    - doing it in a range of environments, for example a range of microphones on mobile phones, a range of contexts of use - walking fast and breathing heavily, in an office... Suddenly 95% looks hard hard hard.

    - doing it cheaply, if you have 2 seconds of processing time per message and you are getting 500k messages a day you need 1 million seconds of processing time, which means that you need 12 servers running 24/7. Now realistically this is still far too low, as you can't have any outages so you need some nice backup, and the traffic is really bursty, so to maintain quality of service you are probably going to have to have 50 or 60 processors going. Doesn't sound much... but then again the value of transcribing a message is .. very, very low... Perhaps you are better off with 50 or 60 people on the phone? Perhaps you would build a mix; messages classified with a high confidence stay on the machine, the rest fail over to the people, with a stricter mix when you have a burst... This is the MBA solution, the real solution is better technology, but they teach you that technology is the wrong answer on an MBA because you can't control its development (due to the hard sums and so on).

    - dealing with regional accents. It's ok to recognise Home Counties English, but if that's all you can do you are going to annoy many many people who don't talk like that. Add in other languages like Spanish and the regional accents there and you can see that this is a big challenge.
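    The arithmetic in the accuracy and cost points above can be checked with a rough sketch. The function names here are made up for illustration, and the calculation assumes servers running flat out all day with no bursts or outages, which the comment itself says is unrealistic.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def servers_needed(messages_per_day: int, seconds_per_message: float) -> int:
    """Bare minimum server count, assuming perfectly even load around the clock."""
    total_processing_seconds = messages_per_day * seconds_per_message
    return math.ceil(total_processing_seconds / SECONDS_PER_DAY)


def words_per_error(accuracy: float) -> float:
    """Average number of words between mistakes at a given word accuracy."""
    return 1.0 / (1.0 - accuracy)


def accuracy_for(words_between_errors: float) -> float:
    """Word accuracy needed to make only one mistake every N words."""
    return 1.0 - 1.0 / words_between_errors


if __name__ == "__main__":
    # 500k messages a day at 2 seconds each is 1 million seconds of processing.
    print(servers_needed(500_000, 2.0))
    # 95% accuracy is roughly one wrong word in twenty.
    print(round(words_per_error(0.95)))
    # One wrong word in 80 corresponds to 98.75% accuracy.
    print(accuracy_for(80))
```

    The jump from this floor to the 50 or 60 processors mentioned above is the headroom for bursts and backup, which this sketch deliberately leaves out.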

    Would you invest in the research that is necessary to sort this out... no.. because Nuance is already there, and as 2 seconds becomes 1 second (in 18 months) and 0.5 seconds in 3 years and 0.25 seconds... well you get the point. If you thought that you had a gamebreaking advantage that moved you 3 years ahead of the curve then you would build a company around that. If it then turned out that that didn't play out in the real world you would lose your shirt. So I don't think that any commercial lab in the world is focusing on this at the moment. People will say "Microsoft and Google" - but they are not really commercial are they? (i.e. they are sitting in a lake of cash and doing what they like) Which means that if a gamebreaking bit of research is actually done (it might be) then it will probably be done at somewhere like Cambridge, and it will be published due to all those pesky academic principles that people like that have, and then it will be read... by the people who live at Nuance!
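    The "2 seconds becomes 1 second" progression is a simple halving curve. As a minimal sketch, assuming the 18-month halving period and the 2-second starting point quoted above (the function name is hypothetical):

```python
def seconds_per_message(months_from_now: float,
                        start_seconds: float = 2.0,
                        halving_months: float = 18.0) -> float:
    """Processing time per message if it halves every `halving_months` months."""
    return start_seconds * 0.5 ** (months_from_now / halving_months)


if __name__ == "__main__":
    for months in (0, 18, 36, 54):
        print(months, seconds_per_message(months))
```

    On these assumptions a three-year technical lead is worth a factor of four in processing cost, and that factor is gone again once the hardware catches up.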

    One other thing that would make me very nervous in this game is the growing power of the processors in people's pockets. Quite soon I think that this particular application (voicemail transcription) will migrate to the edge device. The only way that will not happen is if the data required to successfully do it is not available to edge device makers - or app builders, but since there are public domain libraries to do it: http://www.univie.ac.at/voice/page/corpus_description (there is also http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-you.html) I don't think that this is the case.

    So - it's perfectly possible, it's just not a great business to be in. That means we will get it, along with all the nice benefits of not having to dial a number and being able to translate what the French fella said in his message. But while we could have had it 5 years ago if it wasn't for the pesky business of capitalism, we will have it from now in varying degrees, and totally at some point in the future when the costs get vanishingly small.


  • Comment number 5.

    With a name like spin-vox it's not surprising.

  • Comment number 6.

    As a shareholder in several companies (but not Spinvox), I'd rather see the company bought out and have no payment than watch the company go down the drain and lose all my shares. At least they'll still have paper value and may be worth selling later on.

  • Comment number 7.

    I don't think you can _lose_ money hand over fist, you know....

  • Comment number 8.

    Thank you Rory for writing about this. I am certain none of Nuance's existing voicemail customers, currently AT&T and Vonage users, know a thing about the manual transcription side of the business and the fact that it is done in India. The trick here is that Nuance has informed carriers like AT&T and Vonage, but they in turn have not informed their customers about it, so everything is under wraps. This is a vital piece of information which would help customers make an informed choice about whether they want their messages sent to call centers in India, which opens up a Pandora's box of potential data theft opportunities, or to take the alternative route of using Google Voice, which is 100% automated. I think a company as big as Nuance should have the guts to call a press conference and let its customers know that "We do not have the technology to convert voicemails to text so we are using Indian call centers instead." If they are not doing that they are cheating their customers, and it's not long before they start walking down the same road as Spinvox. I think it's high time Nuance's shareholders made a note of this and took a call on whether or not they want to hold on to a potentially dangerous stake. Going by the way Nuance has been making wrong investments all over the world, and I doubt any of their investments are paying off, it's not long before Nuance goes down the drain. The acquisition of Philips Speech Recognition Systems sparked an antitrust investigation by the US Department of Justice, so this is not the first time that Nuance is doing something unethical. I feel Nuance should just hire some good researchers from Cambridge or Massachusetts and stick to developing their own product rather than take the cheap way of finding shortcuts by buying companies here and there.

    March 31, 2006 — Dictaphone Corporation, of Stratford, Connecticut — $357 million.
    April 24, 2007 — BeVocal, Inc. of Mountain View, California — $140 million.
    August 24, 2007 — Tegic Communications, Inc. of Seattle, Washington — $265 million.
    September 28, 2007 — Commissure, Inc. of New York City, New York — 217,975 shares of common stock.
    May 20, 2008 — eScription, Inc. of Needham, MA — $340 million plus 1,294,844 shares of common stock.
    September 26, 2008 — Philips Speech Recognition Systems GMBH (PSRS), a business unit of Royal Philips Electronics, of Vienna, Austria — about 66 million euros, or $96.1 million.
    October 1, 2008 — SNAPin Software, Inc. of Bellevue, WA — $180 million in shares of common stock.
    April 10, 2009 — Zi Corporation of Calgary, Canada — approximately $35 million in cash and common stock.
    July 14, 2009 — Jott Networks Inc. of Seattle, WA.
    October 5, 2009 — Ecopy of Nashua, NH. Under the terms of the agreement, net consideration was approximately $54 million in Nuance common stock.
    December 30, 2009 — Spinvox of Marlow, UK for $102.5m comprising $66m in cash and $36.5m in stock.

  • Comment number 9.

    If I subscribe to a voice to text service, then as long as it converts voice to text, I'm happy. I don't care if the translation is performed by computers, people, or fairies. Whether the translation is done here or in India isn't relevant either, as long as the data is protected properly and that has nothing to do with location. Let's face it, there have been plenty of instances of leaked data in the UK too. Our government does it systematically.

    Was shareholder value destroyed here? Yes, but this was venture capital investing. Building a business from the ground up is exceptionally hard work, and highly risky. You don't invest in this sector unless you're prepared to accept that. More often than not, it just doesn't work out the way you hoped. But they tried to build something big, and I applaud that entrepreneurial spirit. Let's just hope they take what they've learned and try again.

    Note also that Nuance paid $102m for the company, so they saw something valuable. And some investors got some money back, customers kept a service they liked, and many SpinVox employees kept their jobs. That's not an easy thing to do, so kudos to Christina and team for piloting the ship to safety. Any port in a storm, they say. Much better than being smashed on the rocks.

    As for Nuance taking "the cheap way of finding shortcuts by buying companies here and there." C'mon! Is the proposal that they do it the long, expensive way instead? That *would* be destroying shareholder value.

  • Comment number 10.

    "If I subscribe to a voice to text service, then as long as it converts voice to text, I'm happy."

    But if you invest in a company that has an amazing new technology you'd be pretty pissed off if they turned out not to actually have it, yes?

    "Note also that Nuance paid $102m for the company, so they saw something valuable."

    Spinvox had contracts with mobile networks; now Nuance has them - that's the value.

  • Comment number 11.

    "But if you invest in a company that has an amazing new technology you'd be pretty pissed off if they turned out not to actually have it, yes"

    Indeed. But I'd also be stupid and negligent. Investors always do what's known as "due diligence": verifying that what the company says has a reasonable basis in fact. Their mobile network customers will also certainly have run trials and poked around in the technology to check what they were buying.

    Entrepreneurs often oversell what they have, or just don't see the obstacles in the way (this is what allows them to attempt the impossible). But they rarely just lie. Even if they do, investors find out, and then the culprits get fired. But Christina Domecq was still running the ship last time I heard. A more plausible story is that SpinVox just found the problem harder to crack than anybody originally thought and it was better to sell the company than to invest the money required to push onward.

  • Comment number 12.

    This is what Nuance says about its cutting edge technology
    "The solution is based on Nuance’s state-of-the-art Dragon Naturally Speaking™ (DNS), the world’s leading speech recognition engine backed by 400 patents worldwide and proven by millions of users and in-house human transcriptionists, ensuring the highest accuracy and privacy."

    Ensuring privacy?? How, by sending the voicemails to be transcribed by Indian call center agents?? Is that the state-of-the-art technology Nuance was talking about all these years?

    "Nuance offers the quickest alternative to full automation of a voice transcription service."

    And we all thought that Nuance was the world leader in Speech Recognition!!

  • Comment number 13.

    A quick comparison of voicemail services, taken from http://www.nytimes.com/2010/01/14/technology/personaltech/14smart.html

    Original: “I love you, Rosalita. Tell your dad we’re getting married.”

    Transcribed by Google Voice (within 3 minutes, fully automated): “I love your broza lead. I saw your dad. We’re getting married.”

    Transcribed by AT&T (more than 5 minutes, done by Nuance's award winning Indian voicemail transcriptionists) “I love you Roselida(?). Just tell you that was [ ...].”

    Transcribed by PhoneTag (within 5 minutes, done by US in-house transcriptionists): “(I love you?), (Rosalita?). Dad were getting married.”

    What I was trying to stress here was that if you want to use humans, it's fine: use some quality human transcriptionists from the US, UK, Australia or NZ who are native English speakers, rather than some cheap Indian undergraduate call center workers.

 
