Ethan has provided me with a mid-point summary of results, which I’ve included below. I was surprised to find that Microsoft and Babelfish are beating Google on some language pairs, as well as on shorter text strings. Although Google is emerging as the overall winner, and receiving some much-deserved attention from the media, it’s nice to see some healthy competition.
That said, quality is only one piece of the puzzle. The other piece, perhaps much more important, is usability. Now that Google has embedded its MT engine into Gmail and Reader, and now its Chrome client, I find I’m using Google exclusively as my MT engine.
Here are Ethan’s findings so far (emphasis mine):
At the highest level, it appears that survey participants prefer Google Translate’s results across the board.
In a few languages (Arabic, Polish, Dutch) the preference is overwhelming, with Google receiving twice as many votes as its nearest competitor.
However, once you remove voters who have self-defined their fluency in the source or target language as “limited,” the contest becomes closer in some of the most heavily trafficked languages. For example:
- Microsoft Bing Translator leads in German
- Yahoo! Babelfish leads in Chinese
- Google maintains its lead in Spanish, Japanese, and French
Looking only at the self-defined “limited fluency” voters reveals a strong brand bias. If your fluency in the target language is limited, it stands to reason that your ability to assess the quality of the translation is also very limited. And yet…
- Limited-fluency voters chose Google over Bing by 2 to 1
- They also chose Google over Yahoo! Babelfish by 5 to 1
As I had guessed, Yahoo!’s and Microsoft’s hybrid rules-based MT models performed better on shorter text passages.
For phrases below 50 characters, Google’s lead in Spanish, Japanese, and French disappears, and Microsoft’s lead in German widens.
Beyond 50 characters, Google’s relative performance seems to improve across the board.
For passages that are only one sentence long, the same effect is seen, though to a lesser extent than for phrases under 50 characters.
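For readers curious how this kind of slicing might be done, here is a minimal sketch in Python. It is not the survey's actual analysis code; the record fields, the `tally` and `leader` helpers, and the sample votes are all invented placeholders for illustration, and only the 50-character threshold and the “limited” fluency label come from the findings above.

```python
from collections import Counter, defaultdict

SHORT_LIMIT = 50  # character threshold mentioned in the findings

# Invented placeholder records, NOT real survey data.
votes = [
    {"lang": "German", "text": "Guten Morgen", "fluency": "fluent", "choice": "Bing"},
    {"lang": "German", "text": "Guten Morgen", "fluency": "limited", "choice": "Google"},
    {"lang": "German", "text": "Wie spät ist es?", "fluency": "limited", "choice": "Google"},
    {"lang": "German",
     "text": "Ich möchte gern wissen, wann der nächste Zug nach Berlin abfährt.",
     "fluency": "fluent", "choice": "Google"},
    {"lang": "Spanish", "text": "Buenos días", "fluency": "fluent", "choice": "Google"},
]

def tally(votes, include_limited=True, length=None):
    """Count votes per engine for each language.

    include_limited: if False, drop voters who rated their fluency "limited".
    length: None for all texts, "short" for texts under SHORT_LIMIT
            characters, "long" for everything else.
    """
    counts = defaultdict(Counter)
    for v in votes:
        if not include_limited and v["fluency"] == "limited":
            continue
        is_short = len(v["text"]) < SHORT_LIMIT
        if length == "short" and not is_short:
            continue
        if length == "long" and is_short:
            continue
        counts[v["lang"]][v["choice"]] += 1
    return counts

def leader(counter):
    """Return the engine with the most votes in one language's tally."""
    return counter.most_common(1)[0][0]

if __name__ == "__main__":
    print("All voters:", dict(tally(votes)["German"]))
    print("Fluent voters, short texts:",
          dict(tally(votes, include_limited=False, length="short")["German"]))
```

Comparing the tally with and without the limited-fluency voters, and then again for short versus long texts, is the same kind of comparison described in the findings, just on toy data.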
On March 4th, we made a few changes to our survey: hiding the brands and randomizing the positions of the text results before voting. We have not yet collected enough data since then to draw conclusions, but Babelfish seems to be receiving the biggest boost, which may suggest that its brand had been hurt by the recent neglect of that tool.