Online Detection and Classification of Dynamic Hand Gestures with Recurrent 3D Convolutional Neural Networks

Automatic detection and classification of dynamic hand gestures in real-world systems intended for human-computer interaction is challenging because: 1) there is a large diversity in how people perform gestures, making detection and classification difficult; 2) the system must work online in order to avoid noticeable lag between performing a gesture and its classification; in fact, a negative lag (classification even before the gesture is finished) is desirable, as the feedback to the user can then be truly instantaneous. In this paper, we address these challenges with a recurrent three-dimensional convolutional neural network that performs simultaneous detection and classification of dynamic hand gestures from unsegmented multi-modal input streams. We employ connectionist temporal classification to train the network to predict class labels from in-progress gestures in unsegmented input streams. In order to validate our method, we introduce a new challenging multi-modal dynamic hand gesture dataset captured with depth, color and stereo-IR sensors. On this dataset, our gesture recognition system achieves an accuracy of 83.8%, outperforms competing state-of-the-art algorithms, and approaches human accuracy of 88.4%.
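As a rough illustration of how connectionist temporal classification turns per-frame predictions on an unsegmented stream into gesture labels, the sketch below applies the standard CTC best-path rule: take the most likely class per frame, merge consecutive repeats, and drop the blank ("no gesture") class. This is a minimal sketch and not the paper's inference procedure; the function name `ctc_best_path`, the blank index 0, and the toy probabilities are assumptions made for the example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank ("no gesture") class


def ctc_best_path(frame_probs, blank=BLANK):
    """Collapse per-frame class probabilities into a list of gesture labels.

    frame_probs: list of per-frame probability lists, one entry per class.
    """
    # Most likely class index at each time step.
    path = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    # Merge consecutive repeats, then drop the blank class.
    return [label for label, _ in groupby(path) if label != blank]


# Toy stream (hypothetical values): blank, blank, class 2, class 2, blank, class 1
stream = [
    [0.90, 0.05, 0.05],
    [0.80, 0.10, 0.10],
    [0.10, 0.20, 0.70],
    [0.20, 0.10, 0.70],
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
]
print(ctc_best_path(stream))  # prints [2, 1]
```

Because the rule only needs the frames seen so far, a label can be emitted as soon as a non-blank class wins a frame, which is one way an online system could respond before a gesture ends.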

Authors

Pavlo Molchanov

Kihwan Kim (NVIDIA)

Publication Date

Wednesday, June 1, 2016

Published in

IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016

Research Area

Artificial Intelligence and Machine Learning

Computer Vision

External Links

Supplementary Video (YouTube)

NVIDIA Dynamic Hand Gesture Dataset

Uploaded Files

camera_ready_final.mp4 (90.04 MB)

NVIDIA_R3DCNN_cvpr2016.pdf (3.27 MB)

Copyright

This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org.