The Future of Machine Translation

By

Willy Chaplin

November 22, 2007

There is a children's game, called "telephone." The kids sit in a circle and one of them starts the game by whispering a sentence into the ear of the child sitting to his left. Then that youngster repeats what he or she heard to the kid sitting on his left. When the process has been repeated all the way around the circle to the child sitting on the right of the kid who started the game speaks what he "heard" out loud. The result is usually quite humorous.

A similar effect can be obtained with machine translation by first translating the paragraph above into another language, then translating that translation back into English. We did this using Babelfish, the free translation site on the Web, going first through Korean. Here is the final output.

" telephone. " As this the children' which it calls; The s there is a game,; The child inside their circle and the thing sits in his left and the return route elder brother of the child it whispers and start game it sits. That or she sits in his left and that youth of that time repeats the fact that it listens to in the child. In the right sitting child the child where the process starts a game inside inside the middle it will be repeated in circular surroundings, he talks that it listens to in the sound where he is big at the outside. The result does hun and positively it is humorous.

Attempting the same thing with Russian yields the final product below, a little more understandable, but not much.

There will be the caused game children's, "telepyuone" Little-ones sit in the circle and one their starts game by way to whisper to proposal into the ear of child sitting to his left side. That youngster it repeats after this, or it heard to the little-one sitting on his left side. When process is repeated completely around the circle to the child sitting on the right of little-one it began game, he speaks it it heard out of loud. Result usually is sufficiently humorous.

The programmers at Babelfish would probably protest that the test isn’t really fair. The syntax and grammar of these two languages differ quite a bit from those of English. The real question isn’t whether this two-stage process results in the same output as the original input, but rather whether a Korean speaker would understand the first translation. Besides, the original paragraph is quite complex.

OK, let’s try something simpler. The sentences below might have been written by an American child to a Korean pen pal named Sung.

Hi, Sung. How are you today? We learned some Korean history in school today. It was very interesting.

Here is the result, quite a bit clearer and perhaps quite understandable to Sung.

Goodbye, song it does. U tremble today? We learned a some Korean history today inside the school. Quite fun it was.

Repeating the experiment in Russian, a language whose grammar and syntax are much closer to those of English, we get the very plausible outcome below, nearly the same as the input. The English sentences, adapted to Russian, read…

Hi, Vanya. How are you today? We learned some Russian history in school today. It was very interesting.

The recycled output below is very close…

Hi, Vanya. How you today? We learned a certain Russian history in the school today. It was very interesting.

These examples were used because this article concerns the future of automatic computer translation. We would like to know what that future might be and why it may be very important. In the opinion of this author, the answer to both questions is the same. That is, the end result of good machine translation will be the melding of all cultures and peoples on the planet, using the internet.

The first examples above show how far that technology has to go before it reaches a point where complicated and challenging thoughts can be translated from one language to another. The second set shows that, for some applications, machine translation is already viable and reasonably accurate.
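The difference between the two pen-pal round trips can be given a rough number. The sketch below is only an illustration and not something Babelfish offers: it compares each original with its recycled output word by word, using Python's standard difflib module. The function names are our own invention, and, as noted earlier, a round-trip score is a crude stand-in for asking whether a native speaker would understand the first translation.

```python
import difflib
import re


def tokens(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def round_trip_similarity(original, round_tripped):
    """Return a 0..1 word-overlap score between the input and the recycled output."""
    matcher = difflib.SequenceMatcher(
        None, tokens(original), tokens(round_tripped), autojunk=False
    )
    return matcher.ratio()


# The pen-pal sentences and their recycled outputs, copied from this article.
KOREAN_IN = ("Hi, Sung. How are you today? We learned some Korean history "
             "in school today. It was very interesting.")
KOREAN_OUT = ("Goodbye, song it does. U tremble today? We learned a some Korean "
              "history today inside the school. Quite fun it was.")
RUSSIAN_IN = ("Hi, Vanya. How are you today? We learned some Russian history "
              "in school today. It was very interesting.")
RUSSIAN_OUT = ("Hi, Vanya. How you today? We learned a certain Russian history "
               "in the school today. It was very interesting.")

if __name__ == "__main__":
    print("Korean round trip: ", round(round_trip_similarity(KOREAN_IN, KOREAN_OUT), 2))
    print("Russian round trip:", round(round_trip_similarity(RUSSIAN_IN, RUSSIAN_OUT), 2))
```

On these two samples the Russian round trip scores higher than the Korean one. A score like this only measures shared wording, so it says nothing about whether the meaning survived.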

Recently, using the Web page translation facility offered by Babelfish, we put a button on the home page of our collection of Web sites at http://www.dreamagic.com which leads to a page with similar buttons to translate the original page into any one of 27 different languages, all those offered by Babelfish and Intertran. The really neat thing about this facility is that, after you have translated the home page into another language, the pages reached through simple links on that page are also translated into the same language. We say “simple links” because links to page generators do not work this way. However, we are certain Babelfish could alter their software to accomplish this end with a couple of simple alterations.

We suspect, though we don’t know enough about the various tongues to be sure, that the translations, like the first examples above, leave much to be desired. However, it occurred to us that, in line with the second set of examples, a facility offering real-time translation of email or chat room chatter would be a lot more useful and a lot more accurate. Unfortunately, it turned out that Perl, the language of choice for quick-and-dirty prototyping of Web-based generators, does not handle Unicode very well, at least in the version hosted by our ISP.

But, alas, there is a second, more serious objection to using machine translation for these types of application. As any frequent text-messaging devotee could tell you, the accuracy of spelling in these media is very primitive. As long as the respondent can understand what is being said, “HU CRS F TH SPLNG S GUD?” Furthermore, all sorts of new acronyms are being spawned and spread around the world to further ease the burden of typing, especially when using just your thumbs. Current machine translation facilities are not at all tolerant of these neologisms in their input.

However, suppose that one of the giant “social computing” sites, like MySpace or Facebook, were to go ahead and erect sites to exchange real-time translations of email and chat? Suppose further that they purchased one of the existing translation corporations and set their programmers to work adapting to the realities of modern communication? Would this not set up a positive feedback loop where the translations would get incrementally more accurate day after day, week after week, until a true Tower of Babel were constructed on the Net?

AI experts could be put to work making the applications more spelling tolerant. Others could work on cataloging and translating the acronyms. To date, most of the effort in machine translation has been to make the translations more accurate for scholars in the respective languages. Isn’t it time to spend more effort on what people actually want to do and how they actually speak and write? After all, while the Net is certainly very important to scholars, most of the people using it are just regular folks. We think that this will become ever more significant as time goes on and the technology spreads far and wide.
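To make the two tasks above concrete, here is a minimal sketch of a clean-up pass that could run before a message reaches a translator. It is an assumption about how such a pass might start, not a description of any existing product. The acronym table, the tiny vocabulary and the function name normalize are all hypothetical; a real service would need large catalogs for every language and something smarter than the simple string matching from Python's standard difflib module used here.

```python
import difflib
import re

# Hypothetical catalog of chat shorthand; a real one would be far larger.
ACRONYMS = {
    "brb": "be right back",
    "lol": "laughing out loud",
    "r": "are",
    "u": "you",
}

# Hypothetical vocabulary of correctly spelled words.
VOCABULARY = ["how", "are", "you", "today", "the", "spelling", "is", "good"]


def normalize(message, acronyms=ACRONYMS, vocabulary=VOCABULARY, cutoff=0.7):
    """Expand known acronyms and nudge misspelled words toward the vocabulary."""
    cleaned = []
    for token in re.findall(r"[a-z0-9']+", message.lower()):
        if token in acronyms:
            cleaned.append(acronyms[token])
        elif token in vocabulary:
            cleaned.append(token)
        else:
            # Take the closest vocabulary word if it is similar enough;
            # otherwise pass the token through unchanged.
            close = difflib.get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
            cleaned.append(close[0] if close else token)
    return " ".join(cleaned)


if __name__ == "__main__":
    print(normalize("How r u todya? brb"))
```

With these small tables, "How r u todya? brb" comes out as "how are you today be right back". Anything the tables do not cover is left alone, which is the safer failure mode, since a wrong guess would be translated as confidently as a right one.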

The Net puts each of us theoretically in touch with every other person on the planet. Isn’t it time for that to be transformed from a theory into practice, from a dream into reality?


Willy Chaplin is an AI expert with almost a half century of experience in the field. He was on the Internet in December 1969, one month after it started up as ARPANET, has designed and written almost 2,000,000 lines of computer code and has post-graduate level training in Mathematics, Computer Science, Psychology and Linguistics.