Vicomtech at MEDDOCAN: Medical Document Anonymization

Abstract

This paper describes the participation of Vicomtech's team in the MEDDOCAN: Medical Document Anonymization challenge, which consisted of the recognition and classification of protected health information (PHI) in medical documents in Spanish. We tested different state-of-the-art classification algorithms, both deep and shallow, and rich sets of features, obtaining an F1-score of 0.960 in the strictest evaluation. The submitted models and the scripts for decoding will be available at https://snlt.vicomtech.org/meddocan2019.

BIB_text

@Article {
title = {Vicomtech at MEDDOCAN: Medical Document Anonymization},
pages = {696--703},
keywds = {
PHI De-identification Textual Anonymisation Machine Learning Spanish Corpus
},
abstract = {
This paper describes the participation of Vicomtech's team in the MEDDOCAN: Medical Document Anonymization challenge, which consisted of the recognition and classification of protected health information (PHI) in medical documents in Spanish. We tested different state-of-the-art classification algorithms, both deep and shallow, and rich sets of features, obtaining an F1-score of 0.960 in the strictest evaluation. The submitted models and the scripts for decoding will be available at https://snlt.vicomtech.org/meddocan2019.
},
date = {2019-08-01},
}