Long audio alignment for automatic subtitling using different phone-relatedness measures

Authors: Aitor Alvarez, Haritz Arzelus, Pablo Ruiz

Date: 09.05.2014



Abstract

In this work, long audio alignment systems for Spanish and English are presented in an automatic subtitling scenario. Pre-recorded content is automatically recognized at the phoneme level by language-dependent phone decoders. A dynamic-programming alignment algorithm finds matches between the automatically decoded phones and those in the phonetic transcription of the content’s script. The accuracy of the alignment algorithm is evaluated when applying three non-binary scoring matrices, based respectively on phone confusion pairs from each phone decoder, on phonological similarity, and on human perception errors. Alignment results with the three continuous-score matrices are compared to results with a baseline binary matrix, at word and subtitle levels. The non-binary matrices achieved clearly better results. Matrix samples are given on the project’s website.
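
The abstract does not say which dynamic-programming variant is used, so the following is only a minimal sketch assuming a Needleman-Wunsch-style global alignment with a linear gap penalty. The function names (`align`, `binary_score`, `graded_score`), the gap value, and the similarity scores in `SIMILAR` are all hypothetical illustrations and are not taken from the paper or its matrices. It shows how a binary scoring function and a graded (non-binary) one can be swapped into the same alignment routine.

```python
from typing import Callable, List, Optional, Tuple

Pair = Tuple[Optional[str], Optional[str]]

# Hypothetical graded similarities between phone pairs (illustrative values only).
SIMILAR = {frozenset(("p", "b")): 0.5, frozenset(("t", "d")): 0.5}


def binary_score(a: str, b: str) -> float:
    """Baseline binary scoring: a phone either matches or it does not."""
    return 1.0 if a == b else -1.0


def graded_score(a: str, b: str) -> float:
    """Non-binary scoring: related phones receive partial credit."""
    if a == b:
        return 1.0
    return SIMILAR.get(frozenset((a, b)), -1.0)


def align(decoded: List[str], reference: List[str],
          score: Callable[[str, str], float],
          gap: float = -1.0) -> Tuple[float, List[Pair]]:
    """Globally align decoded phones with reference phones.

    Returns the total score and the aligned pairs; None marks a gap.
    """
    n, m = len(decoded), len(reference)
    # dp[i][j] holds the best score for decoded[:i] against reference[:j].
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(decoded[i - 1], reference[j - 1]),
                dp[i - 1][j] + gap,   # decoded phone with no counterpart
                dp[i][j - 1] + gap,   # reference phone with no counterpart
            )
    # Trace back from the bottom-right cell to recover the aligned pairs.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                dp[i][j] == dp[i - 1][j - 1] + score(decoded[i - 1], reference[j - 1])):
            pairs.append((decoded[i - 1], reference[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or dp[i][j] == dp[i - 1][j] + gap):
            pairs.append((decoded[i - 1], None))
            i -= 1
        else:
            pairs.append((None, reference[j - 1]))
            j -= 1
    pairs.reverse()
    return dp[n][m], pairs


if __name__ == "__main__":
    decoded = ["b", "a", "t", "a"]      # hypothetical decoder output
    reference = ["p", "a", "t", "a"]    # hypothetical script transcription
    print(align(decoded, reference, binary_score))
    print(align(decoded, reference, graded_score))
```

With the graded function, a plausible decoder confusion such as "b" for "p" is penalized less than an unrelated substitution, which is the intuition behind comparing non-binary matrices against a binary baseline.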

BIB_text

@Article {
author = {Aitor Alvarez and Haritz Arzelus and Pablo Ruiz},
title = {Long audio alignment for automatic subtitling using different phone-relatedness measures},
pages = {6280-6284},
keywds = {Long audio alignment, phonological similarity matrices, perceptual confusion matrices, automatic subtitling},
abstract = {In this work, long audio alignment systems for Spanish and English are presented in an automatic subtitling scenario. Pre-recorded content is automatically recognized at the phoneme level by language-dependent phone decoders. A dynamic-programming alignment algorithm finds matches between the automatically decoded phones and those in the phonetic transcription of the content’s script. The accuracy of the alignment algorithm is evaluated when applying three non-binary scoring matrices, based respectively on phone confusion pairs from each phone decoder, on phonological similarity, and on human perception errors. Alignment results with the three continuous-score matrices are compared to results with a baseline binary matrix, at word and subtitle levels. The non-binary matrices achieved clearly better results. Matrix samples are given on the project’s website.},
isi = {1},
date = {2014-05-09},
year = {2014},
}
