Abstract

We present a system for normalizing Spanish tweets that uses preprocessing rules, a domain-appropriate edit-distance model, and language models to select correction candidates based on context. The system’s results in the SEPLN 2013 Tweet-Norm task were above average.

BIB_text

@Article {
author = {Pablo Ruiz and Montse Cuadros and Thierry Etchegoyhen},
title = {Lexical normalization of Spanish tweets with preprocessing rules, domain-specific edit-distances, and language models},
number = {9},
keywds = {microtexto, español, castellano, normalización léxica, Twitter, distancia de edición, modelo de lengua, Spanish microtext, lexical normalization, Twitter, edit distance, language model},
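abstract = {We present a system for normalizing Spanish tweets that uses preprocessing rules, a domain-appropriate edit-distance model, and language models to select correction candidates based on context. The system's results in the SEPLN 2013 Tweet-Norm task were above average.},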
isbn = {978-84-695-8349-4},
date = {2013-09-12},
year = {2013},
}