From Subtitles to Parallel Corpora

Authors: Mark Fishel, Panayota Georgakopoulou, Sergio Penkale, Volha V. Petukhova, Matej Rojc, Martin Volk, and Andy Way

Date: 28.05.2012



Abstract

We describe the preparation of parallel corpora based on professional quality subtitles in seven European language pairs. The main focus is the effect of the processing steps on the size and quality of the final corpora.

BibTeX

@Article {
author = {Mark Fishel and Panayota Georgakopoulou and Sergio Penkale and Volha V. Petukhova and Matej Rojc and Martin Volk and Andy Way},
title = {From Subtitles to Parallel Corpora},
pages = {3--6},
abstract = {We describe the preparation of parallel corpora based on professional quality subtitles in seven European language pairs. The main focus is the effect of the processing steps on the size and quality of the final corpora.},
date = {2012-05-28},
year = {2012},
}
