Welsh is the oldest living language in Great Britain — a branch of ancient Celtic that has survived its English neighbour’s best efforts to kill it off. Despite their proximity, the two languages are pretty much mutually unintelligible. And yet for much of the past year OpenAI’s chatbot has insisted on talking to me in Welsh.
If you are a ChatGPT user who hasn’t encountered this problem it may be because you haven’t made the switch from typing prompts to saying them out loud. Once you do it’s hard to go back. AI transcription is generally excellent. It’s just that every so often mine would start spooling out Welsh. “I am not speaking Welsh,” I would say. “Dwi ddim yn siarad Cymraeg,” it would write.
OpenAI’s explanation for why this happened was that Whisper, its speech-to-text model, sometimes got confused. But who confuses English, the world’s most widely used language, for Welsh? Adding to the mystery was the fact that it wasn’t mishearing homophonic words but translating them. A blog for developers suggested a conspiracy (“yet another example of the socialist Welsh government pushing its ideology”). But that was scuppered by users who reported the same thing happening in Malay and Icelandic.