Google’s zero-shot neural machine translation system shows intriguing evidence of an interlingua

In recent research (full paper also available), researchers from the Google Brain and Google Translate teams have shown intriguing evidence of a so-called interlingua, that is, a language-agnostic common representation of sentences with the same meaning from different languages.

What I also found interesting about this work (and related to the above finding) is that the system can translate between language pairs it was never trained on, which is the zero-shot translation of the title.

A further pleasant surprise was seeing how they used the t-SNE visualization technique to project the high-dimensional sentence representations into 2D in order to study the interlingua phenomenon.
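
To make the t-SNE step a bit more concrete, here is a minimal sketch of that kind of projection. It assumes scikit-learn's `TSNE` and uses made-up random vectors as stand-ins for sentence representations; the sizes, the noise level, and all variable names are my own illustrative choices, not anything taken from the paper or from Google's code.

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic stand-in data (an assumption, NOT real model output):
# 5 "meanings", each expressed in 3 "languages", as 64-dimensional vectors.
rng = np.random.default_rng(0)
n_meanings, n_languages, dim = 5, 3, 64

# One random centre per meaning; each language's version of a sentence
# is that centre plus a little noise.
centres = rng.normal(size=(n_meanings, dim))
vectors = np.concatenate([
    centres + 0.05 * rng.normal(size=(n_meanings, dim))
    for _ in range(n_languages)
])  # shape: (15, 64)

# Label each row with the meaning it belongs to (0..4, repeated per language).
meaning_ids = np.tile(np.arange(n_meanings), n_languages)

# Project to 2D. Perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=5.0, init="pca", random_state=0)
points_2d = tsne.fit_transform(vectors)  # shape: (15, 2)

# List the 2D coordinates grouped by meaning, ready for a scatter plot.
for m in range(n_meanings):
    print(f"meaning {m}:")
    print(points_2d[meaning_ids == m])
```

In a plot of `points_2d` colored by `meaning_ids`, the question to ask is whether points sharing a meaning land near each other regardless of language. Keep in mind that t-SNE mainly preserves local neighborhoods, so distances between far-apart clusters in the 2D picture should not be over-interpreted.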