Soeren Eberhardt on 30 years in localization, scaling Microsoft’s global infrastructure, and using synthetic data to preserve Mayan languages.
What does it look like when someone spends three decades inside the localization industry, from proofreading translations on a typewriter in Germany, to shipping Windows in dozens of languages at Microsoft, to now working to save languages that have almost no digital footprint at all?
That is the arc of Soeren Eberhardt’s career, and it is one of the most layered conversations we have had on the Localization Fireside Chat.
Soeren holds a master’s degree in comparative literature, linguistics, and philosophy. He fell into localization by accident in the late 1990s when a friend recommended him for a proofreading role, and he never left. He went on to spend over 20 years at Microsoft, working across roles from localizer and project manager to localization engineer, global site manager for Microsoft 365, and partner enablement PM. Along the way he contributed to the global launches of Windows, Skype, and Microsoft Teams, including being part of the very first localization push for Teams when it was still under NDA and linguists had to sign confidentiality agreements just to learn what the product was.
Localization as an afterthought, and what to do about it
One of the sharpest threads in our conversation is the tension between where localization should sit in the product development process and where it actually ends up. Soeren is direct about it: the maturity of an organization’s international strategy shows up clearly in how early or late they bring localization into the room.
At Microsoft, he experienced both ends of that spectrum. Some engineering teams saw localization as a bottleneck. Marketing teams, on the other hand, were eager to involve him early because they understood that the story they were telling had to travel across cultures, not just across languages. His advice is consistent: get localization upstream, involve the right people before the structure is built, and save yourself the costly rework that comes from treating translation as the last step.
Maintaining brand voice across 39 languages
Microsoft localizes many of its products into roughly 40 languages. Keeping that consistent is not a question of willpower. It is infrastructure. Soeren walked us through what that actually looks like: centralized terminology databases, language-specific style guides, language quality strategies built across teams, and a review process that samples across the full language set rather than assuming everything is fine. The goal is not just grammatically correct output. It is making sure the brand voice and the intent of the message survive the journey into every target language.
The Mayan Languages Preservation Project
This is the part of the conversation that originally drew me to reach out to Soeren. He is currently contributing to the Mayan Languages Preservation Project, with Q’eqchi’ as the pilot language. Q’eqchi’ is spoken by roughly a million people in Guatemala, but it has almost no digital data, which makes training a machine translation engine on it essentially impossible through conventional means.
The approach Soeren and his collaborators are taking is to generate synthetic data using a rule-based sentence generator. The system uses trilingual word lists in Q’eqchi’, Spanish, and English, sentence templates, and a morphologizer that encodes grammatical rules for each of the three languages. The output is thousands of sentence triplets that can be fed into MT training alongside whatever natural data exists. It is not a silver bullet, and Soeren is clear that the engine is not ready for public use yet, but the proof of concept is there and a research paper is in preparation.
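To make the pipeline concrete, here is a minimal sketch of what a rule-based triplet generator of this kind could look like. This is my own illustration, not the project's actual code: the lexicon entries, templates, and the `morphologize` stub are all hypothetical, and the Q'eqchi' forms are placeholders rather than real vocabulary.

```python
from itertools import product

# Hypothetical trilingual word list: each entry maps one concept to its
# surface form in Q'eqchi', Spanish, and English. The Q'eqchi' strings
# below are PLACEHOLDERS, not real words.
LEXICON = {
    "noun": [
        {"qeq": "<QEQ-dog>", "es": "perro", "en": "dog"},
        {"qeq": "<QEQ-child>", "es": "niño", "en": "child"},
    ],
    "verb": [
        {"qeq": "<QEQ-sleep>", "es": "duerme", "en": "sleeps"},
        {"qeq": "<QEQ-run>", "es": "corre", "en": "runs"},
    ],
}

# One sentence template per language, encoding its word order
# (Q'eqchi' is typically verb-initial; Spanish and English are SVO).
TEMPLATES = {
    "qeq": "{verb} {noun}",
    "es": "el {noun} {verb}",
    "en": "the {noun} {verb}",
}

def morphologize(form: str, lang: str) -> str:
    """Stand-in for the morphologizer: a real one would apply each
    language's grammatical rules (agreement, person marking, and so on)
    to the base form. This stub returns the form unchanged."""
    return form

def generate_triplets():
    """Cross the word lists through the templates to yield aligned
    (Q'eqchi', Spanish, English) sentence triplets for MT training."""
    for noun, verb in product(LEXICON["noun"], LEXICON["verb"]):
        yield tuple(
            TEMPLATES[lang].format(
                noun=morphologize(noun[lang], lang),
                verb=morphologize(verb[lang], lang),
            )
            for lang in ("qeq", "es", "en")
        )

for qeq, es, en in generate_triplets():
    print(f"{qeq}\t{es}\t{en}")
```

Even this toy version shows why the approach scales: a few hundred lexicon entries crossed with a few dozen templates already yields thousands of aligned triplets, with the morphologizer doing the real linguistic work.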
What makes this especially compelling is how it connects to his earlier work. Years before the Mayan project, Soeren worked on bringing Windows to Inuktitut, the language of the Inuit in Nunavut, Canada. That experience taught him how much community input matters in terminology decisions, how carefully speakers vet new words for concepts like mouse click or internet, and how fragile the whole process is when you are working with a language that has no existing digital infrastructure to lean on.
On AI, expertise, and asking better questions
Soeren’s take on AI in the localization industry is measured and worth sitting with. He is not dismissive of what the technology can do, but he pushes back on the assumption that better tools reduce the need for expertise. His argument is the opposite: LLMs require you to ask the right questions, and asking the right questions requires you to understand the domain deeply. A PM who has never learned another language can technically work in localization, but the ones who understand how languages differ, how markets behave, and how cultural nuance resists direct transfer are the ones who get genuinely good results from these tools.
His line near the end of our conversation stuck with me: philosophy is never about finding the right answer. It is always about asking the best questions. That is a good frame for where the industry is right now.
CLOSING LINKS:
Watch the full episode on YouTube: https://youtu.be/haGEwKT94pk
Listen on Simplecast: https://localization-fireside-chat.simplecast.com/episodes/he-localized-windows-skype-teams-at-microsoft-for-20-years-now-hes-saving-dying-languages
Connect with Soeren Eberhardt on LinkedIn: https://www.linkedin.com/in/soereneberhardt/
Follow the Localization Fireside Chat on LinkedIn: https://www.linkedin.com/company/localization-fireside-chat/
Book a 30-minute call with Robin: https://calendly.com/robin-ayoub/30min
Learn more about N49Networks: https://n49networks.com