Ontolabeling: Re-Thinking Data Labeling For Computer Vision

Authors: Nicola Croce, Marcos Nieto Doncel

Date: 14.12.2021


Abstract

Over the last decade, developments in computer vision tasks have been driven by image, video, and multimodal benchmark datasets, fueling the growth of machine learning methods for object detection, classification, and scene understanding. Such advances have, however, created static, goal-specific, and heterogeneous datasets, with little to no emphasis on the taxonomies used and the semantics behind the class definitions, making them ill-defined and hard to map to one another. This approach hinders and limits the long-term usability of datasets, their intercompatibility, extensibility, and the ability to repurpose them. In this work, we propose a new methodology for data labeling, which we call Ontolabeling, that detaches data structure from semantics, creating two data model layers. The first layer organizes spatio-temporal labels for multi-sensor data, while the second layer makes use of ontologies to structure, organize, maintain, extend, and repurpose the semantics of the annotations. Our approach is supported by an open-source toolkit that enables label management (create, read, update, and delete) following the proposed Ontolabeling principles.

BIB_text

@Article{croce2021ontolabeling,
  author = {Croce, Nicola and Nieto Doncel, Marcos},
  title = {Ontolabeling: Re-Thinking Data Labeling For Computer Vision},
  keywords = {labeling, semantics, VCD, OpenLABEL, AGO},
  abstract = {Over the last decade, developments in computer vision tasks have been driven by image, video, and multimodal benchmark datasets, fueling the growth of machine learning methods for object detection, classification, and scene understanding. Such advances have, however, created static, goal-specific, and heterogeneous datasets, with little to no emphasis on the taxonomies used and the semantics behind the class definitions, making them ill-defined and hard to map to one another. This approach hinders and limits the long-term usability of datasets, their intercompatibility, extensibility, and the ability to repurpose them. In this work, we propose a new methodology for data labeling, which we call Ontolabeling, that detaches data structure from semantics, creating two data model layers. The first layer organizes spatio-temporal labels for multi-sensor data, while the second layer makes use of ontologies to structure, organize, maintain, extend, and repurpose the semantics of the annotations. Our approach is supported by an open-source toolkit that enables label management (create, read, update, and delete) following the proposed Ontolabeling principles.},
  date = {2021-12-14},
}