 # Online Detection and Classification of Dynamic Hand Gestures with Recurrent 3D Convolutional Neural Networks 

  ![Network architecture](/sites/default/files/styles/wide/public/pubs/2016-06_Online-Detection-and/architecture.png?itok=RWwjJAzP)

Automatic detection and classification of dynamic hand gestures in real-world systems intended for human-computer interaction is challenging because: 1) there is a large diversity in how people perform gestures, making detection and classification difficult; 2) the system must work online in order to avoid noticeable lag between performing a gesture and its classification; in fact, a negative lag (classification even before the gesture is finished) is desirable, as the feedback to the user can then be truly instantaneous. In this paper, we address these challenges with a recurrent three-dimensional convolutional neural network that performs simultaneous detection and classification of dynamic hand gestures from unsegmented multi-modal input streams. We employ connectionist temporal classification to train the network to predict class labels from in-progress gestures in unsegmented input streams. In order to validate our method, we introduce a new challenging multi-modal dynamic hand gesture dataset captured with depth, color and stereo-IR sensors. On this dataset, our gesture recognition system achieves an accuracy of 83.8%, outperforms competing state-of-the-art algorithms, and approaches human accuracy of 88.4%.
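To make the role of connectionist temporal classification (CTC) more concrete, the following is a minimal pure-Python sketch of the two ideas the abstract relies on: scoring a gesture label sequence against per-frame class probabilities without any frame-level segmentation, and reading labels off a stream by collapsing repeats and dropping blanks. It is not the authors' implementation. The function names, the toy probabilities, and the choice of index 0 as the blank (standing in here for a "no gesture" output) are assumptions made for illustration.

```python
def ctc_sequence_probability(frame_probs, labels, blank=0):
    """Total probability of all frame-level paths that collapse to `labels`.

    frame_probs: list of per-frame probability lists (one entry per class).
    labels: target label sequence without blanks.
    blank: index of the blank class (assumed here to mean "no gesture").
    """
    # Extended target: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for lab in labels:
        ext += [lab, blank]
    num_states = len(ext)

    # Forward variables for the first frame: start in the leading blank
    # or in the first real label.
    alpha = [0.0] * num_states
    alpha[0] = frame_probs[0][ext[0]]
    if num_states > 1:
        alpha[1] = frame_probs[0][ext[1]]

    for probs in frame_probs[1:]:
        new_alpha = [0.0] * num_states
        for s in range(num_states):
            total = alpha[s]                      # stay in the same state
            if s >= 1:
                total += alpha[s - 1]             # advance by one state
            # Skip over a blank only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[ext[s]]
        alpha = new_alpha

    # Valid paths end in the final blank or the final label.
    return alpha[-1] + (alpha[-2] if num_states > 1 else 0.0)


def greedy_decode(frame_probs, blank=0):
    """Pick the best class per frame, collapse repeats, drop blanks."""
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    decoded, prev = [], blank
    for k in best:
        if k != blank and k != prev:
            decoded.append(k)
        prev = k
    return decoded


# Toy stream: 2 frames, classes {0: blank, 1: some gesture}.
uniform = [[0.5, 0.5], [0.5, 0.5]]
p_single = ctc_sequence_probability(uniform, [1])   # paths 1-1, 1-blank, blank-1
p_empty = ctc_sequence_probability(uniform, [])     # path blank-blank

# Toy stream: 4 frames, 3 classes; gesture 1, a pause, then gesture 2.
stream = [[0.1, 0.9, 0.0], [0.2, 0.8, 0.0], [0.9, 0.1, 0.0], [0.1, 0.2, 0.7]]
decoded = greedy_decode(stream)
```

Training with CTC minimizes the negative log of `ctc_sequence_probability` for the ground-truth label sequence, which is why no frame-level alignment is needed. In practice, a framework-provided, numerically stable log-space CTC loss would replace this sketch.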



 ## Authors



[Pavlo Molchanov](/person/pavlo-molchanov)

[Xiaodong Yang](/person/xiaodong-yang)

[Shalini Gupta](/person/shalini-de-mello)

Kihwan Kim (NVIDIA)

[Stephen Tyree](/person/stephen-tyree)

[Jan Kautz](/person/jan-kautz)

 

 

 ## Publication Date



Wednesday, June 1, 2016

 

 ## Published in



[IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016](http://cvpr2016.thecvf.com/)

 

 ## Research Area



[Artificial Intelligence and Machine Learning](/research-area/machine-learning-artificial-intelligence)

[Computer Vision](/research-area/computer-vision)

 

 

 ## External Links



[Supplementary Video (YouTube)](https://www.youtube.com/watch?v=NJmk1DUyyB8)

[NVIDIA Dynamic Hand Gesture Dataset](https://drive.google.com/drive/folders/0ByhYoRYACz9cMUk0QkRRMHM3enc?resourcekey=0-cJe9M3PZy2qCbfGmgpFrHQ&usp=sharing)

 

 

 ## Uploaded Files



[camera\_ready\_final.mp4](https://research.nvidia.com/sites/default/files/pubs/2016-06_Online-Detection-and/camera_ready_final.mp4) (90.04 MB)

[NVIDIA\_R3DCNN\_cvpr2016.pdf](https://research.nvidia.com/sites/default/files/pubs/2016-06_Online-Detection-and/NVIDIA_R3DCNN_cvpr2016.pdf) (3.27 MB)

 

 

 ## Copyright



This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to <pubs-permissions@ieee.org>.