Long audio alignment for automatic subtitling using different phone-relatedness measures

Authors: Aitor Alvarez, Haritz Arzelus, Pablo Ruiz

Date: 09.05.2014



Abstract

In this work, long audio alignment systems for Spanish and English are presented in an automatic subtitling scenario. Pre-recorded content is automatically recognized at the phoneme level by language-dependent phone decoders. A dynamic-programming alignment algorithm finds matches between the automatically decoded phones and those in the phonetic transcription of the content’s script. The accuracy of the alignment algorithm is evaluated when applying three non-binary scoring matrices, based respectively on phone confusion pairs from each phone decoder, on phonological similarity, and on human perception errors. Alignment results with the three continuous-score matrices are compared to results with a baseline binary matrix, at word and subtitle levels. The non-binary matrices achieved clearly better results. Matrix samples are given on the project’s website.
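The abstract does not say which dynamic-programming recurrence or which scoring values were used, so the following is only a minimal sketch of the general idea. It assumes a Needleman-Wunsch-style global alignment with a linear gap penalty, and it contrasts a binary scoring function with a graded one. The function names, the gap penalty, and the similarity values in `SIMILAR` are illustrative assumptions and are not taken from the paper or its matrices.

```python
from typing import Callable, List, Optional, Tuple

Score = Callable[[str, str], float]
Pair = Tuple[Optional[str], Optional[str]]

GAP = -1.0  # assumed gap penalty, not a value from the paper


def binary_score(a: str, b: str) -> float:
    """Baseline: a phone pair either matches exactly or it does not."""
    return 1.0 if a == b else -1.0


# Illustrative similarity values only; real matrices would be derived from
# decoder confusions, phonological features, or perceptual data.
SIMILAR = {
    frozenset(("p", "b")): 0.5,
    frozenset(("t", "d")): 0.5,
    frozenset(("s", "z")): 0.5,
}


def graded_score(a: str, b: str) -> float:
    """Non-binary: related phones receive partial credit."""
    if a == b:
        return 1.0
    return SIMILAR.get(frozenset((a, b)), -1.0)


def align(ref: List[str], hyp: List[str], score: Score,
          gap: float = GAP) -> Tuple[float, List[Pair]]:
    """Globally align reference phones with decoded phones.

    Returns the best total score and the aligned pairs, where None
    marks a gap on that side.
    """
    n, m = len(ref), len(hyp)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(ref[i - 1], hyp[j - 1]),
                dp[i - 1][j] + gap,   # reference phone left unmatched
                dp[i][j - 1] + gap,   # decoded phone left unmatched
            )

    # Trace back from the bottom-right cell to recover one best alignment.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                dp[i][j] == dp[i - 1][j - 1] + score(ref[i - 1], hyp[j - 1])):
            pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            pairs.append((ref[i - 1], None))
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))
            j -= 1
    pairs.reverse()
    return dp[n][m], pairs


if __name__ == "__main__":
    reference = ["p", "a", "t", "o"]   # phones from the script
    decoded = ["b", "a", "d", "o"]     # hypothetical decoder output
    print(align(reference, decoded, binary_score)[0])   # 0.0
    print(align(reference, decoded, graded_score)[0])   # 3.0
```

In this toy case the graded function still rewards the near-miss pairs p/b and t/d, so the intended alignment scores well above the alternatives, whereas the binary function treats those pairs like any other mismatch.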

BIB_text

@Article{alvarez2014long,
author = {Aitor Alvarez and Haritz Arzelus and Pablo Ruiz},
title = {Long audio alignment for automatic subtitling using different phone-relatedness measures},
pages = {6280--6284},
keywords = {Long audio alignment, phonological similarity matrices, perceptual confusion matrices, automatic subtitling},
abstract = {In this work, long audio alignment systems for Spanish and English are presented in an automatic subtitling scenario. Pre-recorded content is automatically recognized at the phoneme level by language-dependent phone decoders. A dynamic-programming alignment algorithm finds matches between the automatically decoded phones and those in the phonetic transcription of the content’s script. The accuracy of the alignment algorithm is evaluated when applying three non-binary scoring matrices, based respectively on phone confusion pairs from each phone decoder, on phonological similarity, and on human perception errors. Alignment results with the three continuous-score matrices are compared to results with a baseline binary matrix, at word and subtitle levels. The non-binary matrices achieved clearly better results. Matrix samples are given on the project’s website.},
isi = {1},
date = {2014-05-09},
year = {2014},
}
Vicomtech
