Efficient multi-task based facial landmark and gesture detection in monocular images

Abstract

Communication between people uses several channels to exchange information. Non-verbal communication carries valuable information about the context of a conversation and is key to understanding the whole interaction. Facial expressions are a representative example of non-verbal communication and a valuable input for improving human-machine interaction interfaces. Using images captured by a monocular camera, automatic facial analysis systems can extract facial expressions for this purpose. However, several technical factors must be considered, including limited computational resources (e.g., autonomous robots) and data throughput constraints (e.g., a centralized computation server). With these limitations in mind, this work presents an efficient method that simultaneously detects a set of 68 facial feature points and a set of key facial gestures. The output of the method provides valuable information for understanding the context of communication and improving the response of automatic human-machine interaction systems.

BIB_text

@Article {
title = {Efficient multi-task based facial landmark and gesture detection in monocular images},
pages = {680--687},
keywds = {Facial feature point detection; Gesture recognition; Multi-task learning},
abstract = {Communication between people uses several channels to exchange information. Non-verbal communication carries valuable information about the context of a conversation and is key to understanding the whole interaction. Facial expressions are a representative example of non-verbal communication and a valuable input for improving human-machine interaction interfaces. Using images captured by a monocular camera, automatic facial analysis systems can extract facial expressions for this purpose. However, several technical factors must be considered, including limited computational resources (e.g., autonomous robots) and data throughput constraints (e.g., a centralized computation server). With these limitations in mind, this work presents an efficient method that simultaneously detects a set of 68 facial feature points and a set of key facial gestures. The output of the method provides valuable information for understanding the context of communication and improving the response of automatic human-machine interaction systems.},
isbn = {978-989758488-6},
date = {2021-02-08},
}
Vicomtech

Gipuzkoako Zientzia eta Teknologia Parkea,
Mikeletegi Pasealekua 57,
20009 Donostia / San Sebastián (Espainia)

+(34) 943 309 230

Ensanche eraikina,
Zabalgune Plaza 11,
48009 Bilbo (Espainia)
