This Article is From Mar 09, 2010

Google translation system shows power, skill

Mountain View, California: In a meeting at Google in 2004, the discussion turned to an e-mail message the company had received from a fan in South Korea. Sergey Brin, a Google founder, ran the message through an automatic translation service that the company had licensed.

The result read: "The sliced raw fish shoes it wishes. Google green onion thing!"

Brin said Google ought to be able to do better. Six years later, its free Google Translate service handles 52 languages, more than any similar system, and people use it hundreds of millions of times a week to translate Web pages and other text.

"What you see on Google Translate is state of the art" in computer translations that are not limited to a particular subject area, said Alon Lavie, an associate research professor in the Language Technologies Institute at Carnegie Mellon University.

Google's efforts to expand beyond searching the Web have met with mixed success. Its digital books project has been hung up in court, and the introduction of its social network, Buzz, raised privacy fears. The pattern suggests that it can sometimes misstep when it tries to challenge business traditions and cultural conventions.

But Google's quick rise to the top echelons of the translation business is a reminder of what can happen when Google unleashes its brute-force computing power on complex problems.

The network of data centers that it built for Web searches may now be, when lashed together, the world's largest computer. Google is using that machine to push the limits on translation technology. Last month, for example, it said it was working to combine its translation tool with image analysis, allowing a person to, say, take a cell phone photo of a menu in German and get an instant English translation.

"Machine translation is one of the best examples that shows Google's strategic vision," said Tim O'Reilly, founder and chief executive of the technology publisher O'Reilly Media. "It is not something that anyone else is taking very seriously. But Google understands something about data that nobody else understands, and it is willing to make the investments necessary to tackle these kinds of complex problems ahead of the market."

Creating a translation machine has long been seen as one of the toughest challenges in artificial intelligence. For decades, computer scientists tried using a rules-based approach - teaching the computer the linguistic rules of two languages and giving it the necessary dictionaries.

But in the mid-1990s, researchers began favoring a so-called statistical approach. They found that if they fed the computer thousands or millions of passages and their human-generated translations, it could learn to make accurate guesses about how to translate new texts.
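To make the statistical idea concrete, here is a minimal sketch of one classic textbook technique, word-level expectation-maximization in the style of IBM Model 1. It is not a description of Google's system, which the article does not detail. The three-sentence German-English corpus and the function names `train_word_model` and `best_translation` are invented for illustration.

```python
from collections import defaultdict

# Toy parallel corpus (an assumption for illustration): source text
# paired with its human translation.
PAIRS = [
    ("das haus", "the house"),
    ("das buch", "the book"),
    ("ein buch", "a book"),
]

def train_word_model(pairs, iterations=20):
    """Estimate t[(target, source)], roughly P(target word | source word),
    by expectation-maximization over sentence pairs (IBM Model 1 style)."""
    src_vocab = {s for src, _ in pairs for s in src.split()}
    tgt_vocab = {w for _, tgt in pairs for w in tgt.split()}
    # Start with no preference: every target word is equally likely.
    t = {(tw, sw): 1.0 / len(tgt_vocab) for tw in tgt_vocab for sw in src_vocab}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for src, tgt in pairs:
            src_words = src.split()
            for tw in tgt.split():
                # Share each target word among the source words in the same
                # sentence, in proportion to the current estimates.
                norm = sum(t[(tw, sw)] for sw in src_words)
                for sw in src_words:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] += frac
        # Re-estimate the probabilities from the fractional counts.
        for tw, sw in t:
            t[(tw, sw)] = count[(tw, sw)] / total[sw]
    return t

def best_translation(t, source_word):
    """Return the target word with the highest estimated probability."""
    candidates = {tw: p for (tw, sw), p in t.items() if sw == source_word}
    return max(candidates, key=candidates.get)

model = train_word_model(PAIRS)
for word in ["das", "haus", "buch", "ein"]:
    print(word, "->", best_translation(model, word))
```

No dictionary or grammar rule appears anywhere in the sketch. The pairing of "buch" with "book" emerges only because the two words keep turning up in the same sentence pairs, which is why the quantity of training text matters so much.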

It turns out that this technique, which requires huge amounts of data and lots of computing horsepower, is right up Google's alley.

"Our infrastructure is very well-suited to this," Vic Gundotra, a vice president for engineering at Google, said. "We can take approaches that others can't even dream of."

Automated translation systems are far from perfect, and even Google's will not put human translators out of a job anytime soon. Experts say it is exceedingly difficult for a computer to break a sentence into parts, then translate and reassemble them. But Google's service is good enough to convey the essence of a newspaper article, and it has become a quick source for translations for millions of people.

"If you need a rough-and-ready translation, it's the place to go," said Philip Resnik, a machine translation expert and associate professor of linguistics at the University of Maryland, College Park.

Like its rivals in the field, most notably Microsoft and IBM, Google has fed its translation engine with transcripts of U.N. proceedings, which are translated by humans into six languages, and those of the European Parliament, which are translated into 23. This raw material is used to train translation systems for the most common languages.

But Google has scoured the text of the Web, as well as data from its book scanning project and other sources, to move beyond those languages. For more obscure languages, it has released a "translator tool kit" that helps users with translations and then adds those texts to its database.

Google's offering could put a dent in sales of corporate translation software from companies like IBM. But automated translation is never likely to be a big moneymaker, at least not by the standards of Google's advertising business. Still, Google's efforts could pay off in several ways.

Because Google's ads are ubiquitous online, anything that makes it easier for people to use the Web benefits the company. And the system could lead to interesting new applications. Last week, the company said it would use speech recognition to generate captions for English-language YouTube videos, which could then be translated into 50 other languages.

"This technology can make the language barrier go away," said Franz Och, a principal scientist at Google who leads the company's machine translation team. "It would allow anyone to communicate with anyone else."
