Hand Gesture Recognition with 3D Convolutional Neural Networks

Touchless hand gesture recognition systems are becoming important in automotive user interfaces as they improve safety and comfort. Various computer vision algorithms have employed color and depth cameras for hand gesture recognition, but robust classification of gestures from different subjects performed under widely varying lighting conditions is still challenging.

Short-Range FMCW Monopulse Radar for Hand-Gesture Sensing

Intelligent driver assistance systems have become important in the automotive industry. One key element of such systems is a smart user interface that tracks and recognizes drivers' hand gestures. Hand gesture sensing using traditional computer vision techniques is challenging because of wide variations in lighting conditions, e.g., inside a car.

Multi-sensor System for Driver’s Hand-Gesture Recognition

We propose a novel multi-sensor system for accurate and power-efficient dynamic car-driver hand-gesture recognition, using a short-range radar, a color camera, and a depth camera, which together make the system robust against variable lighting conditions. We present a procedure to jointly calibrate the radar and depth sensors. We employ convolutional deep neural networks to fuse data from multiple sensors and to classify the gestures. Our algorithm accurately recognizes 10 different gestures acquired indoors and outdoors in a car during the day and at night.

Slim near eye display using pinhole aperture arrays

We report a new technique for building a wide-angle, lightweight, thin-form-factor, cost-effective, easy-to-manufacture near-eye Head-Mounted Display (HMD) for virtual reality applications. Our approach adopts an aperture mask containing an array of pinholes and a screen as a source of imagery. We demonstrate proof-of-concept HMD prototypes with a binocular field of view (FOV) of 70° × 45°, or a total diagonal FOV of 83°. The FOV increases with display panel size.
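The panel-size/FOV relationship follows from simple pinhole geometry: the FOV is the angle the panel subtends at the aperture plane. A minimal sketch, assuming an idealized pinhole model (the dimensions below are hypothetical, and the prototype's actual optics are more involved):

```python
import math

def diagonal_fov_deg(panel_w_mm, panel_h_mm, eye_relief_mm):
    # Idealized pinhole approximation: diagonal FOV is the angle the
    # panel diagonal subtends at the aperture plane.
    diag = math.hypot(panel_w_mm, panel_h_mm)
    return 2.0 * math.degrees(math.atan2(diag / 2.0, eye_relief_mm))

# Hypothetical panel size and aperture distance, for illustration only.
fov = diagonal_fov_deg(110.0, 65.0, 40.0)
```

At a fixed aperture distance, increasing either panel dimension widens the subtended angle, which is the trend the abstract notes.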

Adaptive Segmentation based on a Learned Quality Metric

We introduce a model to evaluate the segmentation quality of a color image. The model parameters were learned from a set of examples. To this aim, we first segmented a set of images using a traditional graph-cut algorithm for different values of the scale parameter. A human observer classified these images into three classes: under-, well-, and over-segmented. We used this classification to learn the parameters of the segmentation-quality model, which was then employed to automatically and adaptively optimize the scale parameter of the graph-cut segmentation algorithm.
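The adaptive loop this describes can be sketched as a search over candidate scales, keeping the one the learned model rates best. The names `segment` and `quality_score` are placeholders for the graph-cut algorithm and the learned quality model; neither comes from the paper:

```python
def best_scale(image, scales, segment, quality_score):
    # segment(image, s) -> a segmentation at scale s;
    # quality_score(seg) -> scalar, higher = closer to "well-segmented".
    best_s, best_q = None, float("-inf")
    for s in scales:
        q = quality_score(segment(image, s))
        if q > best_q:
            best_s, best_q = s, q
    return best_s
```

In the paper's setting, `quality_score` is the model trained on the human under-/well-/over-segmented labels rather than a hand-written rule.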

Machine Learning for Adaptive Bilateral Filtering

We describe a supervised learning procedure for estimating the relation between a set of local image features and the locally optimal parameters of an adaptive bilateral filter. A set of two entropy-based features represents the properties of the image at a local scale. Experimental results show that our entropy-based adaptive bilateral filter outperforms other extensions of the bilateral filter in which parameter tuning is based on empirical rules.
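A minimal sketch of the two pieces involved, under stated assumptions: a local entropy feature, and a mapping from that feature to the filter's range sigma. The mapping below is an illustrative hand-tuned rule, not the paper's learned regression:

```python
import numpy as np

def local_entropy(patch, bins=16):
    # Shannon entropy of the patch's gray-level histogram (values in [0, 1]).
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def adaptive_sigma_r(entropy, lo=0.05, hi=0.30, max_entropy=4.0):
    # Illustrative rule (not the learned mapping): flat, low-entropy
    # regions get a large range sigma (strong smoothing); textured,
    # high-entropy regions get a small one (detail preserved).
    t = min(entropy / max_entropy, 1.0)
    return hi - t * (hi - lo)
```

In the supervised setting, this hand-written mapping is replaced by a function fitted to per-patch optimal parameters found on training images.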

Preconditioned Block-Iterative Methods on GPUs

An implementation of incomplete-LU/Cholesky preconditioned block-iterative methods on Graphics Processing Units (GPUs) using the CUDA parallel programming model is presented. In particular, we focus on the tradeoffs associated with sparse matrix-vector multiplication with multiple vectors, sparse triangular solves with multiple right-hand sides (RHS), and incomplete factorization with 0 fill-in. We use these building blocks to implement the block-CG and block-BiCGStab iterative methods for symmetric positive definite (s.p.d.) and nonsymmetric linear systems, respectively.
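To show where these building blocks appear, here is a dense NumPy sketch of unpreconditioned block-CG for an s.p.d. matrix and a block of right-hand sides. The `A @ P` product is the role played by the sparse matrix-multiple-vector kernel on the GPU; the preconditioner and its triangular solves are omitted:

```python
import numpy as np

def block_cg(A, B, tol=1e-8, max_iter=200):
    # Unpreconditioned block conjugate gradient (O'Leary-style) for
    # s.p.d. A and a block of right-hand sides B of shape (n, s).
    X = np.zeros_like(B)
    R = B - A @ X
    P = R.copy()
    RtR = R.T @ R
    for _ in range(max_iter):
        Q = A @ P                              # matrix x multiple vectors
        alpha = np.linalg.solve(P.T @ Q, RtR)  # small s x s solve
        X += P @ alpha
        R -= Q @ alpha
        RtR_new = R.T @ R
        if np.sqrt(np.trace(RtR_new)) < tol:   # Frobenius norm of R
            break
        beta = np.linalg.solve(RtR, RtR_new)
        P = R + P @ beta
        RtR = RtR_new
    return X
```

Searching one Krylov space per right-hand side jointly is what makes the multi-vector kernels (SpMM, multi-RHS triangular solve) the performance-critical pieces.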

Frustum-Traced Raster Shadows: Revisiting Irregular Z-Buffers

We present a real-time system that renders antialiased hard shadows using irregular z-buffers (IZBs). For subpixel accuracy, we use 32 samples per pixel at roughly twice the cost of a single sample. Our system remains interactive on a variety of game assets and CAD models while running at 1080p and 2160p and imposes no constraints on light, camera or geometry, allowing fully dynamic scenes without precomputation. Unlike shadow maps we introduce no spatial or temporal aliasing, smoothly animating even subpixel shadows from grass or wires.

cuDNN: Efficient Primitives for Deep Learning

We present a library of efficient implementations of deep learning primitives. Deep learning workloads are computationally intensive, and optimizing their kernels is difficult and time-consuming. As parallel architectures evolve, kernels must be reoptimized, which makes maintaining codebases difficult over time. Similar issues have long been addressed in the HPC community by libraries such as the Basic Linear Algebra Subroutines (BLAS). However, there is no analogous library for deep learning.
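To make concrete what such a primitive computes, here is a naive single-channel 2D cross-correlation, the operation that a library like cuDNN ships in heavily tuned, hardware-specific form. This is a reference sketch, not cuDNN's API:

```python
import numpy as np

def conv2d_naive(x, w):
    # Reference (unoptimized) "valid" cross-correlation.
    # x: (H, W) input, w: (kH, kW) filter.
    H, W = x.shape
    kH, kW = w.shape
    out = np.empty((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (x[i:i + kH, j:j + kW] * w).sum()
    return out
```

The gap between this loop nest and a kernel tuned per GPU generation is exactly the maintenance burden the library is meant to absorb.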

Aggregate G-Buffer Anti-Aliasing

We present Aggregate G-Buffer Anti-Aliasing (AGAA), a new technique for efficient anti-aliased deferred rendering of complex geometry using modern graphics hardware. In geometrically complex situations, where many surfaces intersect a pixel, current rendering systems shade each contributing surface at least once per pixel. As the sample density and geometric complexity increase, the shading cost becomes prohibitive for real-time rendering. Under deferred shading, so does the required framebuffer memory.