Linguistic Data Optimization Tool

LIDO

Duration:

01.04.2022 - 31.12.2024

Technologies:

Language Processing

Translation processes are essential for overcoming the linguistic barriers that significantly hinder socioeconomic activities, particularly in multilingual communities such as the European Union or the Basque Autonomous Community. In response to the exponential growth of digital content, translation activities increasingly rely on specialized technologies such as Computer-Assisted Translation (CAT) tools, built on Translation Memories (TM), and Machine Translation (MT).

In the field of MT, recent advances in Artificial Intelligence (AI), especially neural networks and Deep Learning, have significantly accelerated progress. Neural Machine Translation (NMT) has emerged as the new scientific and commercial paradigm and is becoming increasingly integrated into multilingual content production, particularly in professional translation workflows involving post-editing.

To deliver high-quality automatic translations, NMT systems require large parallel linguistic datasets (sets of aligned sentences in two languages) to model translation knowledge across language pairs. These resources must be of high quality, because noise in the data, such as misalignments, corrupted characters, or incorrect encoding, directly degrades system performance. Similarly, errors in translation memories reduce the productivity of human translators. In practice, substantial noise is widespread in linguistic corpora and translation memories, and it significantly undermines the efficiency and quality of translation processes.
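
To make the kinds of noise mentioned above concrete, the following is a minimal Python sketch of rule-based filtering for parallel sentence pairs. It is an illustration only and not the LIDO system, which is described as relying on dedicated AI models. The function names, the list of mojibake markers, and the length-ratio threshold of 3.0 are all assumptions chosen for the example.

```python
import unicodedata

# Hypothetical markers: byte sequences that typically appear when UTF-8 text
# has been decoded with a single-byte encoding such as Latin-1 or CP1252.
MOJIBAKE_MARKERS = ("Ã©", "Ã±", "â€")


def looks_corrupted(text: str) -> bool:
    """Return True if the text shows signs of corrupted characters or bad encoding."""
    if "\ufffd" in text:  # Unicode replacement character left by a failed decode
        return True
    if any(marker in text for marker in MOJIBAKE_MARKERS):
        return True
    # Control characters other than ordinary whitespace are rarely legitimate.
    return any(
        unicodedata.category(ch) == "Cc" and ch not in "\t\n\r" for ch in text
    )


def length_ratio_ok(src: str, tgt: str, max_ratio: float = 3.0) -> bool:
    """Crude misalignment check: reject pairs with very different word counts."""
    n_src, n_tgt = len(src.split()), len(tgt.split())
    if n_src == 0 or n_tgt == 0:
        return False
    return max(n_src, n_tgt) / min(n_src, n_tgt) <= max_ratio


def is_clean_pair(src: str, tgt: str) -> bool:
    """Keep a pair only if both sides look intact and the lengths are plausible."""
    return (
        not looks_corrupted(src)
        and not looks_corrupted(tgt)
        and length_ratio_ok(src, tgt)
    )


pairs = [
    ("The house is small.", "La casa es pequeña."),
    ("Good morning.", "Buenos días y muchas otras palabras que no pertenecen a esta frase."),
    ("The cafÃ© is open.", "El café está abierto."),
    ("", "Hola."),
]
clean = [pair for pair in pairs if is_clean_pair(*pair)]
print(clean)
```

In this sketch only the first pair survives: the second fails the length-ratio check, the third contains a mojibake sequence, and the fourth has an empty source side. Surface heuristics of this kind cannot detect pairs that are well formed but not translations of each other, which is where semantic models become relevant.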

The LIDO project aims to research and develop a multilingual linguistic data optimization system based on Artificial Intelligence technologies. The optimization is organized along three main axes and relies on dedicated AI models, including neural language models, multilingual vector-based semantic representation models, and highly portable statistical models.

Vicomtech

Parque Científico y Tecnológico de Gipuzkoa,
Paseo Mikeletegi 57,
20009 Donostia / San Sebastián (Spain)

+(34) 943 309 230

Zorrotzaurreko Erribera 2, Deusto,
48014 Bilbao (Spain)
