Ontolabeling: Re-Thinking Data Labeling For Computer Vision

Egileak: Nicola Croce Marcos Nieto Doncel

Data: 14.12.2021


Abstract

Over the last decade, developments in computer vision tasks have been driven by image, video, and multimodal benchmark datasets fueling the growth of machine learning methods for object detection, classification, and scene understanding. Such advances have, however, created static, goal-specific and heterogeneous datasets, with little to none emphasis on the used taxonomies and semantics behind the class definitions, making them ill-defined, and hardly mappable to each others. This approach hinders and limits the long-term usability of datasets, their intercompatibility, extensibility, and the ability to repurpose them. In this work we propose a new methodology for data labeling, which we call Ontolabeling, that detaches data structure from semantics, creating two data model layers. The first layer organizes spatio-temporal labels for multi-sensor data, while the second layer makes use of ontologies to structure, organize, maintain, extend and repurpose the semantics of the annotations. Our approach is supported by an open source toolkit that enables label management (create, read, update, and delete) following the proposed Ontolabeling principles.

BIB_text

@Article {
title = {Ontolabeling: Re-Thinking Data Labeling For Computer Vision},
keywds = {
labeling, semantics, VCD, OpenLABEL, AGO
}
abstract = {

Over the last decade, developments in computer vision tasks have been driven by image, video, and multimodal benchmark datasets fueling the growth of machine learning methods for object detection, classification, and scene understanding. Such advances have, however, created static, goal-specific and heterogeneous datasets, with little to none emphasis on the used taxonomies and semantics behind the class definitions, making them ill-defined, and hardly mappable to each others. This approach hinders and limits the long-term usability of datasets, their intercompatibility, extensibility, and the ability to repurpose them. In this work we propose a new methodology for data labeling, which we call Ontolabeling, that detaches data structure from semantics, creating two data model layers. The first layer organizes spatio-temporal labels for multi-sensor data, while the second layer makes use of ontologies to structure, organize, maintain, extend and repurpose the semantics of the annotations. Our approach is supported by an open source toolkit that enables label management (create, read, update, and delete) following the proposed Ontolabeling principles.


}
date = {2021-12-14},
}
Vicomtech

Gipuzkoako Zientzia eta Teknologia Parkea,
Mikeletegi Pasealekua 57,
20009 Donostia / San Sebasti√°n (Espainia)

+(34) 943 309 230

Ensanche eraikina,
Zabalgune Plaza 11,
48009 Bilbo (Espainia)

close overlay