Scholars find that large language models suffer from a digital divide: The ChatGPTs and Geminis of the world work well for the 1.52 billion people who speak English, but they underperform for the world’s 97 million Vietnamese speakers, and perform even worse for the 1.5 million people who speak the Uto-Aztecan language Nahuatl.
The main culprit is data: These non-English languages lack the needed quantity and quality of data to build and train effective models. That means most major LLMs are…
Article Source
https://hai.stanford.edu/news/closing-the-digital-divide-in-ai