Phone call translator app to be offered by NTT Docomo


Richard Taylor reports on the phone app claiming to translate calls in real time

An app offering real-time translations is to allow people in Japan to speak to foreigners over the phone with both parties using their native tongue.

NTT Docomo - the country's biggest mobile network - will initially convert Japanese to English, Mandarin and Korean, with other languages to follow.

It is the latest in a series of telephone conversation translators to launch in recent months.

Lexifone and Vocre have developed other products.

Alcatel-Lucent and Microsoft are among those working on other solutions.

The products have the potential to let companies avoid having to use specially trained multilingual staff, helping them cut costs. They could also aid tourism.

However, the software involved cannot offer perfect translations, limiting its use in some situations.

Cloud technology

NTT Docomo unveiled its Hanashite Hon'yaku app for Android devices at the Combined Exhibition of Advanced Technologies (Ceatec) show in Japan earlier this month, and plans to launch it on 1 November.

It provides users with voice translations of the other speaker's conversation after a slight pause, as well as providing a text readout.

"French, German, Indonesian, Italian, Portuguese, Spanish and Thai will be added for this application in late November, raising the number of non-Japanese languages to 10," the firm said in a statement.

"Fast and accurate translations are possible with any smartphone, regardless of device specifications, because Hanashite Hon'yaku utilises Docomo's cloud [remote computer servers] for processing."

The caller must subscribe to one of Docomo's packages to be able to use it.

Image caption: NTT Docomo's app offers both voice and text translations of phone conversations

Landline translations

NTT Docomo will soon face competition from France's Alcatel-Lucent, which is developing a rival product, WeTalk. It can handle Japanese and about a dozen other languages including English, French and Arabic.

The service is designed to work over any landline telephone, meaning the company has had to find a way to do speech recognition using audio data sampled at a rate of 8kHz or 16kHz.

Other products - which rely on data connections - have used higher 44kHz samples which are easier to process.
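To illustrate why the lower landline rates are harder to work with, here is a minimal Python sketch. It relies on the standard Nyquist rule that audio sampled at a given rate can only represent frequencies up to half that rate. The function name is an illustrative assumption, and the rates are simply the figures quoted above; none of this is taken from the products themselves.

```python
def highest_frequency_hz(sample_rate_hz: int) -> float:
    """Nyquist limit: the highest frequency a given sample rate can capture."""
    return sample_rate_hz / 2


# Sample rates mentioned in the article (landline vs data-connection products).
for rate in (8_000, 16_000, 44_000):
    limit = highest_frequency_hz(rate)
    print(f"{rate} Hz sampling -> up to {limit:.0f} Hz of audio")
```

At 8kHz, everything above 4kHz is discarded, which removes part of the high-frequency detail that helps tell apart sounds such as "s" and "f".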

Alcatel-Lucent uses a patented technology to capture the user's voice and enhance it before applying speech recognition software. The data is then run through translation software before being run through a speech synthesiser.
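The chain of steps described above can be sketched as a simple pipeline. This is only an illustration under stated assumptions: every function below is a hypothetical placeholder, plain strings stand in for audio, and nothing here reflects Alcatel-Lucent's patented technology or its actual software.

```python
from typing import Callable, List


# Hypothetical placeholder stages: strings stand in for audio and text.
def enhance_voice(audio: str) -> str:
    return audio.strip()  # pretend to clean up the captured signal


def recognise_speech(audio: str) -> str:
    return audio.lower()  # pretend the cleaned audio is now text


def translate(text: str) -> str:
    phrasebook = {"hello": "bonjour"}  # toy English-to-French lookup
    return phrasebook.get(text, text)


def synthesise(text: str) -> str:
    return f"<spoken:{text}>"  # pretend to produce speech output


# Stages run in the order the article describes.
PIPELINE: List[Callable[[str], str]] = [
    enhance_voice,
    recognise_speech,
    translate,
    synthesise,
]


def run_pipeline(captured_audio: str) -> str:
    data = captured_audio
    for stage in PIPELINE:
        data = stage(data)
    return data


print(run_pipeline("  Hello "))
```

Because each stage only consumes the previous stage's output, an error made early on, such as a misheard word, is carried through to the translated speech.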

The firm said all this could be done in less than a second. However, it has opted to wait until the speaker has stopped talking before starting the translation, after experiments carried out with workers at insurance company Axa suggested users preferred the experience.

"We are still working on improving the system," Gilles Gerlinger, the product's co-founder, told the BBC.

"You can do conversations with one person, but we want to allow conferences with 10 people and four different languages, and the system would provide translations in every language needed.

"We also have a project called MyVoice which can have a synthetic voice that sounds like your real one."

Mr Gerlinger suggested that his firm would make money from the product by renting servers with the necessary software to big businesses, and charging smaller ones a fee for the amount of time they used the service.

Converted video chats

Microsoft's Research Labs has also been working on a technology it calls the Translating Telephone. The firm has acknowledged that one of the biggest problems was making the software adapt itself to cope with the different ways people pronounce words.

Image caption: Start-up Lexifone charges users for its service depending on the length of their telephone call

"The technologies are still not perfect," said researcher Kit Thambiratnam in 2010.

"But we feel they are good enough for two people to communicate in their native languages, as long as they are willing to speak carefully and maybe occasionally repeat themselves."

Google already has a Translate app that can translate 17 spoken languages, allowing face-to-face conversations with a foreigner, but it is not yet designed to work with telephone calls.

Start-up Israeli company Lexifone is hoping to get a head-start with its own phone conversation product.

It launched earlier this year offering translations between English, Spanish, Portuguese, Italian, French and Mandarin.

Its chief executive, an ex-IBM computer engineer, has ambitions to disrupt the human translation industry which he said was worth $14bn (£8.7bn) a year.

"Our original plan was for annual growth of 200%," Ike Sagie told Reuters last month.

"The way we see market acceptance and the way we see the market welcoming the technology I think we have the potential for growing faster than that."

Image caption: The Vocre app won the Audience Choice Award at the Techcrunch Disrupt festival

The firm is working with BT and Telefonica to offer its service to the phone networks' customers.

Meanwhile, California-based MyLanguage is pursuing another strategy by providing voice and text translations during video chats via its Vocre app for iPhones.

The facility - which is currently being beta tested - requires customers to have an internet connection.

Lost in translation

Despite the ambitions of those involved in the nascent sector, one analyst questioned their chances of success.

"These kind of real-time technologies have been 'two to three years away' for the past decade," said Benedict Evans, technology expert at Enders Analysis.

"Both speech recognition and machine translation are sort of there if you're not too fussy.

"But they are generally not as good as speaking the language itself, and my suspicion is that they would not be reliable enough to use them for business purposes when you need to be really sure about what the other person said."



This entry is now closed for comments

    Comment number 62.

    To 56 - the rules in a language are just the first small steps. If it was so easy, that would already have been done. This is the kind of thinking that led to the current debacle in court interpreting in the UK!

    Comment number 61.

    Passes through the Google translation from English to Japanese back to English, this comment is. This new mobile phone will work for a very simple sentence, nothing more than that. Unless you have spent a long time to train it, you might be the voice recognition software is translation software while it is very inaccurate, but getting better, but it is not very good anyway.

    Comment number 60.

    The tagline for this story 'Babel Phones' is rather odd, I'm sure it's meant to be a Douglas Adams reference, but then should be 'Babel Fish Phone', named after the biblical 'Tower of Babel', and without the 'Fish' qualification it implies someone is selling a REALLY big phone.

    Comment number 59.

    Interesting, but - 'The caller must subscribe to one of Docomo's packages to be able to use it'. Most UK mobile phones don't work in Japan, even in roaming mode. Handsets must be compatible with a Japanese network. Compatible handsets may be used via international roaming or a rental or prepaid SIM card from a Japanese carrier (if unlocked), while phones with WIFI can use voip like Skype.

    Comment number 58.

    Computers never understand what I'm saying, eventually they have to give up and give me a real person. Even the best speech recognition software focusing on a limited vocabulary can only get 80% recognition. So one in five words come out ????? or just wrong. Then when you get into figures of speech, localized phrases, etc, etc.... Good luck!

    Comment number 57.

    Hahahhaa..use the word "corruption" and the moderator throws out the comment. Hahahaha

    Comment number 56.

    This is, beyond doubt, the future. Maybe it won't be fully working in the next two years, maybe not even the next five, but human language follows rules and once those rules have been fully taught to the technology, all things with it will be possible.

    Comment number 55.

    The EU has spent millions over 30 years to achieve such a system. They gave up trying where the spoken word is concerned, for all the reasons others have mentioned. They have produced a major translation,not interpretation,aid, which is full of words and phrases which recur constantly in EU docs and legal acts. The versions produced still have to be revised by translators. This won't fly.

    Comment number 54.

    I am surprised by how many people are reacting negatively to this. The first roll out of any new software or application has its faults and foibles. We started with film, then videotape, and now digital - but had betamax and laserdiscs along the way.
    Combine a future version of this with television and you will not need subtitles ever again.

    Comment number 53.

    I can only think of my own experiences with the YouTube "Transcribe Audio" subtitles (they are complete nonsense but also very funny to read!).

    Accents and dialects are going to be a real problem. Japanese to English isn't good enough, you need to be able to translate Osakan and Scouse too. In any case, we're still a long way from the universal translators seen in Star Trek!

    Comment number 52.

    Hope it works better than Google Translate.

    Comment number 51.

    I'm not sure about this. I suspect that it will create a psychological disconnect between interlocutors. It makes me think of those oh-so-very-annoying pre-recorded "press 1 for ... " machines. Even though there is a real person out there somewhere you will be being spoken to by a machine and that will be very obvious and quite possibly even more annoying.

    Comment number 50.

    Nice in theory, very difficult in practice.

    Quite a lot of spoken language consists of 'Idioms' with no direct equivalent in other languages, e.g. 'Pulling my leg'.

    Attempts to literally translate idioms just generate gibberish, so whilst this might be OK for simple business transactions, normal social conversations could end up like the 'Monty Python - Dirty Hungarian Phrasebook' sketch ;-)

    Comment number 49.

    Valuable for emergencies since we cannot know dozens of languages.

    However, could there be a time when we need to know nothing? We won't need to know maths, how to use a map, speak another language?

    Comment number 48.

    It will be interesting to see how idiolect and idioms translate!

    Will sound a bit crazy when the word 'like' is repeated between every other word or 'It's raining cats and dogs'!

    Comment number 47.

    Think Google Translate with Windows Narrator function. God help us.

    Comment number 46.

    Star Trek's Universal Translator

    Comment number 45.

    It begs the question....

    ...if a Brit talking to someone who speaks whatever other language and the translation doesn't come out well enough for them to understand it will the Brit keep repeating themselves, only louder & with gesticulations, thinking that'll somehow help......

    Comment number 44.

    This is amazing.... Brings to mind Star Wars for me though:

    Lars: "Do you speak Bocce?"
    C-3PO: "Of course I can, sir. It's like a second language to me..."

    Comment number 43.

    So now we can say "hello, hello are you still there" in Japanese and hear "go and stand in the corridor if you want to use your cellphone on the train" in English... isn't technology great. How about changing the voice to sound like Britney Spears....


Copyright © 2015 BBC. The BBC is not responsible for the content of external sites.