University of Tartu language award is given for contribution to preserving and developing Finno-Ugric languages

Researchers from the Institute of Computer Science and the Institute of Estonian and General Linguistics
Author:
Andres Tennus

The 2024 University of Tartu language award is given for a contribution to preserving and developing Finno-Ugric languages through the creation of a digital translation engine. Thanks to cooperation between researchers and developers at the Institute of Computer Science and the Institute of Estonian and General Linguistics, the translation engine Neurotõlge can be used to translate into 30 languages, 23 of which are Finno-Ugric.

Besides Estonian, Finnish and Hungarian, which are already available in many translation systems, the Neurotõlge translation engine supports many smaller languages: Livonian, Võro, Proper Karelian, Livvi Karelian, Ludian, Veps, Northern Sami, Southern Sami, Inari Sami, Skolt Sami, Lule Sami, Komi, Komi-Permyak, Udmurt, Hill Mari, Meadow Mari, Erzya, Moksha, Mansi and Khanty. Most of these languages are available in a public translation engine for the first time.

The evaluation committee highlighted the valuable work of the Neurotõlge team in supporting international cooperation in the research on small languages. “Several languages included in the translation engine are spoken by a very small community. The digitisation of a language gives hope that it will not disappear but can be studied and learned in the future. Neurotõlge is the first language engine in the world to allow translation into so many small and endangered languages,” said the head of the evaluation committee, the university’s Academic Secretary Tõnis Karki.

To develop Neurotõlge, computer scientists collaborate with researchers from the Institute of Estonian and General Linguistics, who over decades have collected and digitised the rich vocabulary of Finno-Ugric languages and created corpora of them. Providing machine translation for endangered Finno-Ugric languages will help preserve and study the languages and support their speakers.

More languages to be included in the translation engine

According to the head of the working group, Professor of Natural Language Processing Mark Fišel, six more endangered languages will be added to the list of supported languages in the near future: Izhorian and Vote among languages closely related to Estonian, and also Pite Sami, Kildin Sami, Ume Sami and Kven. “Working with dialects has been particularly time-consuming: in the case of small languages, there are dialects that are more different from each other than, for example, Estonian and Finnish. Creating a high-quality translation based on this material is a very interesting challenge for researchers,” explained Fišel.

The developers of the translation engine encourage the speakers and researchers of Finno-Ugric languages to add and correct translations on www.neurotolge.ee. In the past three months, more than 700 translation improvements have been collected through this co-creation, most of them concerning translations between Northern Sami and Norwegian.

Awardees from the Institute of Computer Science are Professor of Natural Language Processing Mark Fišel, Research Fellow in Natural Language Processing Lisa Yankovskaya, Junior Research Fellows in Natural Language Processing Dmytro Pashchenko, Hele-Andra Kuulmets and Taido Purason, Associate Professor in Natural Language Processing Heiki-Jaan Kaalep, Scientific Programmer Tarmo Vaino, Language Data and Content Creation Specialist Britt-Kathleen Mere, Programmer Aleksei Ivanov and Head of Applied Natural Language Processing Liisa Rätsep. Awardees from the Institute of Estonian and General Linguistics are Lecturer in Digital Linguistics Joshua Wilbur, Associate Professor of Finnic Languages Elena Markus, Associate Professor of Finno-Ugric Languages Fedor Rozhanskiy, Research Fellow in Phonetics of Finnic Languages Tuuli Tuisk and Research Fellow in Livonian Marili Tomingas.

Six candidates were nominated for the University of Tartu language award. The award recognises a university member or working group who has stood out in the previous year by valuing the Estonian language in fulfilling the university’s goals, whether in providing higher education, doing research or serving society.
