Multilingual Opinion Mining


Author: Aitor García Pablos

Supervisors: Montserrat Cuadros Oller (Vicomtech), Germán Rigau (University)

University: UPV/EHU

Date: 11.07.2017

Place: Donostia-San Sebastián

A large volume of text is generated every day across online media, and much of it contains opinions about a multitude of entities, products and services. Given the growing need for automated means to analyse, process and exploit this information, sentiment analysis techniques have received a great deal of attention from industry and the scientific community over the past decade and a half. However, many of these techniques require supervised training on manually annotated examples, or other language resources tied to a specific language or application domain. Because such resources and training examples are not easy to obtain, the applicability of these techniques is limited. This thesis explores a series of methods for automatic text analysis in the context of sentiment analysis, including the automatic extraction of domain terms, of words expressing opinions, and of the polarity (positive or negative) of those words. Finally, a method that combines continuous word embeddings with topic modelling, inspired by Latent Dirichlet Allocation (LDA), is proposed and evaluated. The result is an aspect-based sentiment analysis (ABSA) system that needs only a few seed words to process texts from a given language or domain. Adapting it to another language or domain is thus reduced to translating the corresponding seed words.