Today, voice recognition works best for English and other major European languages, because virtually all of the related research focuses on them. Now, however, Google is making a push to improve its voice search for underrepresented languages by adding Zulu, Afrikaans and South African-accented English.

Google researchers Pedro J. Moreno and Johan Schalkwyk note “that the speech research community needs to start working on many of these underrepresented languages to advance progress and build speech recognition, translation and other Natural Language Processing (NLP) technologies.” To bring this research up to speed, Google collaborated with local researchers to collect audio samples and develop lexicons and grammars for these languages (for English, Google used GOOG-411 to collect samples). You can find more detailed information about Google’s research efforts here.

Google’s official mission is to “organize the world’s information and make it universally accessible and useful.” While this effort obviously fits that mission, Google is not financing the research purely out of the goodness of its heart. The insights the company gathers by tackling hard problems like speech recognition for Zulu, which is not widely represented on the Web and is spoken by only about 10 million people, will also help its researchers improve other aspects of its services and refine its algorithms for more widely used languages.