Monday, November 17, 2014

A variant of English optimized for machine translation to Chinese

A comment by Jonathon Duerig reminded me of an old idea that still hasn't come to pass.

It should be possible to define a style of English expression, including vocabulary and grammar, that lends itself to machine translation, particularly translation to Chinese. It would likely resemble the language we are learning to use with Siri. We can call it TransEnglish.

With the technology we now use to correct grammar and spelling, and predict text entry, we could do dynamic rewriting -- identifying or rewriting ambiguous phrases, substituting more specific words, revising complex tense structures.
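As a rough illustration of the "identifying" half of that idea, here is a minimal sketch in Python. The phrase table `SUGGESTIONS` and the function `flag_ambiguous` are hypothetical names invented for this example, and the three entries are hand-picked guesses, not a tested list of what actually trips up a translator.

```python
import re

# Hypothetical, hand-picked entries for illustration only.
# A real tool would need a far larger list, validated against real translations.
SUGGESTIONS = {
    "get": "obtain, receive, or become",
    "a couple of": "two",
    "touch base": "talk",
}


def flag_ambiguous(text):
    """Return (position, matched phrase, suggestion) for each listed phrase found."""
    findings = []
    for phrase, suggestion in SUGGESTIONS.items():
        # \b keeps "get" from matching inside words such as "together".
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            findings.append((match.start(), match.group(0), suggestion))
    return sorted(findings)


if __name__ == "__main__":
    for pos, phrase, suggestion in flag_ambiguous("Let's touch base after I get the report."):
        print(f"{pos}: '{phrase}' -> consider: {suggestion}")
```

An editor plugin could underline each flagged phrase the way a spell checker does, leaving the choice of replacement to the writer.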

A test harness would machine translate to Chinese then back again, then identify and close divergences. A nice exercise for neural network tools.
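The harness could be sketched roughly as follows. This is an assumption-laden outline, not a working translator: `translate` is a placeholder for whatever machine translation service is plugged in, `toy_translate` and its canned sentences are made up so the example runs on its own, and word-level overlap from the standard library's `difflib` stands in for a real measure of meaning drift.

```python
from difflib import SequenceMatcher

# Made-up canned "translations" so the sketch is runnable without a real service.
_TOY_TABLE = {
    ("The cat sat on the mat.", "en", "zh"): "猫坐在垫子上。",
    ("猫坐在垫子上。", "zh", "en"): "The cat sat on the mat.",
    ("It's raining cats and dogs.", "en", "zh"): "天上下着猫和狗。",
    ("天上下着猫和狗。", "zh", "en"): "Cats and dogs are falling from the sky.",
}


def toy_translate(text, source_lang, target_lang):
    """Stand-in translator (hypothetical); swap in a real service here."""
    return _TOY_TABLE[(text, source_lang, target_lang)]


def round_trip(text, translate):
    """Translate English to Chinese and back to English."""
    chinese = translate(text, "en", "zh")
    return translate(chinese, "zh", "en")


def divergence(original, returned):
    """0.0 when the word sequences match exactly, approaching 1.0 as they differ."""
    a = original.lower().split()
    b = returned.lower().split()
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def flag_sentences(sentences, translate, threshold=0.3):
    """Return (sentence, round-trip result, score) for sentences that drift too far."""
    flagged = []
    for sentence in sentences:
        back = round_trip(sentence, translate)
        score = divergence(sentence, back)
        if score > threshold:
            flagged.append((sentence, back, score))
    return flagged


if __name__ == "__main__":
    samples = ["The cat sat on the mat.", "It's raining cats and dogs."]
    for sentence, back, score in flag_sentences(samples, toy_translate):
        print(f"{score:.2f}  {sentence!r} came back as {back!r}")
```

The threshold of 0.3 is arbitrary. Word overlap also penalizes harmless paraphrases, so a real harness would probably want a semantic comparison, which is where the neural network tools would come in.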

I expect that after a few weeks of use an English writer would learn to write translatable text naturally.

Similarly, I'd love to follow some Chinese-authored blogs written in a similar machine-translation-friendly flavor of Chinese - TransChinese. (Indeed, the resulting English output might well be of the same form as TransEnglish.)
