These technologies facilitate natural interaction between people and computers, and the underlying methodologies provide the concepts, techniques and tools for speech processing through digital signal processing. Thanks to recent advances in Artificial Intelligence, applying technologies such as dialogue systems, speech recognition and speech synthesis is increasingly feasible across multiple sectors, improving Human-Computer Interaction and the processing and use of digital content in multiple languages, including Basque.

Dialogue Systems, Chatbots and Digital Assistants

Intelligent digital assistants are among the most disruptive and enabling technologies in the new generation of solutions based on Artificial Intelligence. Thanks to Deep Learning and algorithms based on stochastic processes, they can, among other things, understand the user's needs, infer a user profile and generate recommendations that take context into account. Because conversational voice assistants are cross-cutting by nature, they can be adapted to multiple domains (medical, administrative, commercial, business, industrial, etc.), creating smart interfaces that allow a more natural, direct and intuitive interaction with technology.

Automatic Transcription and Subtitling

Our team combines strong scientific specialisation with a track record of real-world technology transfer and top-level international experience in enriched transcription and automatic subtitling of video and audio, in multiple languages and operating modes (offline and online). This technology is built on our constantly evolving proprietary Transkit library. These Deep Learning-based assets have been applied in technically demanding scenarios such as telephone conversations, television content, public transparency portals, parliamentary sessions, meeting transcriptions and security environments, among others.

Voice Synthesis, Vocal Biometrics, Emotions, etc.

Dialogue and Speech also covers other technological assets that enable relevant applications for the identified sectors. As with our speech recognition systems, the End-to-End architectures of our speech synthesisers allow us to generate natural and expressive synthetic voices in multiple languages or to recognise emotions through speech. In addition, our BioVoice library provides the functionality to train voice biometric systems that recognise or verify a speaker's identity.

Success Story

Resivoz. Spoken recording of information using conversational assistants

CASER Residencial

Publications and Noteworthy Projects

2024-11-11

The Vicomtech Speech Transcription Systems for the Albayzín 2024 Bilingual Basque-Spanish Speech to Text (BBS-S2T) Challenge

2024-10-01

Real-Time Speech-Driven Avatar Animation by Predicting Facial landmarks and Deformation Blendshapes

Aritz Lasarguren, Jone López, Egoitz Rodríguez

2024-09-18

Incremental Learning for Knowledge-Grounded Dialogue Systems in Industrial Scenarios

Izaskun Fernández, Cristina Aceta, Cristina Fernández, María Inés Torres, Aitor Etxalar, Joseba Agirre, Egoitz Artetxe, Iker Altuna

2024-09-09

Anonymizing Dysarthric Speech: Investigating the Effects of Voice Conversion on Pathological Information Preservation

Abner Hernández, Paula Andrea Pérez, Tomás Arias, Seung Hee Yang, Juan Rafael Orozco, Andreas Maier

2024-09-09

Stream-based Active Learning for Speech Emotion Recognition via Hybrid Data Selection and Continuous Learning

2024-09-01

Exploring Self-supervised Embeddings and Synthetic Data Augmentation for Robust Audio Deepfake Detection

Eros Rosello, Ángel M. Gómez, Antonio M. Peinado

IRAZ

AI-based easy-to-read system

COGILE

COGILE places special emphasis on the human factor as a differentiator in the factory of the future and on supporting Industry 4.0 workers throughout the phases of their working life.

SHAPES

Throughout Europe, many people are affected by reduced capabilities, whether permanent or temporary.

CAPTAIN

Living in a familiar home environment is crucial for the well-being of older individuals, particularly when they experience memory loss. While current technologies have the potential to greatly assist elderly persons living alone, there is still a lack of high-tech solutions specifically tailored to meet their unique needs. The EU-funded CAPTAIN project aims to address this gap by developing the first universal assistance system for older adults. This system seeks to compensate for their physical and memory deficiencies in daily living by introducing a groundbreaking Human-Computer Interface appliance that utilises micro-projectors and projected augmented reality. By employing smart home appliances, the project aims to transform a home into an interactive interface, making it easy for older people to access support.

MULTIBIO

Multimodal biometric system for continuous authentication of online users

AdapTA

Fully customised machine translation through multimodal data exploitation

