Steerable application-adaptive near eye displays

The design challenges of see-through near-eye displays can be mitigated by specializing an augmented reality device for a particular application. We present a novel optical design for augmented reality near-eye displays exploiting 3D stereolithography printing techniques to achieve similar characteristics to progressive prescription binoculars. We propose to manufacture inter-changeable optical components using 3D printing, leading to arbitrary shaped static projection screen surfaces that are adaptive to the targeted applications.

Correlation-Aware Semi-Analytic Visibility for Antialiased Rendering

Geometric aliasing is a persistent challenge for real-time rendering. Hardware multisampling remains limited to 8 × , analytic coverage fails to capture correlated visibility samples, and spatial and temporal postfiltering primarily target edges of superpixel primitives. We describe a novel semi-analytic representation of coverage designed to make progress on geometric antialiasing for subpixel primitives and pixels containing many edges while handling correlated subpixel coverage.

EOE: Expected Overlap Estimation over Unstructured Point Cloud Data

We present an iterative overlap estimation technique to augment existing point cloud registration algorithms that can achieve high performance in difficult real-world situations where large pose displacement and non-overlapping geometry would otherwise cause traditional methods to fail. Our approach estimates overlapping regions through an iterative Expectation Maximization procedure that encodes the sensor field-of-view into the registration process.

Improving Landmark Localization with Semi-Supervised Learning

We present two techniques to improve landmark localization in images from partially annotated datasets. Our primary goal is to leverage the common situation where precise landmark locations are only provided for a small data subset, but where class labels for classification or regression tasks related to the landmarks are more abundantly available. First, we propose the framework of sequential multitasking and explore it here through an architecture for landmark localization where training with class labels acts as an auxiliary signal to guide the landmark localization on unlabeled data.

Learning Superpixels with Segmentation-Aware Affinity Losse

Superpixel segmentation has been widely used in many computer vision tasks. Existing superpixel algorithms are mainly based on hand-crafted features, which often fail to preserve weak object boundaries. In this work, we leverage deep neural networks to facilitate extracting superpixels from images. We show a simple integration of deep features with existing superpixel algorithms does not result in better performance as these features do not model segmentation. Instead, we propose a segmentation-aware affinity learning approach for superpixel segmentation.

Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals

In this paper, we strive to answer two questions: What is the current state of 3D hand pose estimation from depth images? And, what are the next challenges that need to be tackled? Following the successful Hands In the Million Challenge (HIM2017), we investigate the top 10 state-of-the-art methods on three tasks: single frame 3D pose estimation, 3D hand tracking, and hand pose estimation during object interaction. We analyze the performance of different CNN structures with regard to hand shape, joint visibility, view point and articulation distributions.

MoCoGAN: Decomposing Motion and Content for Video Generation

Visual signals in a video can be divided into content and motion. While content specifies which objects are in the video, motion describes their dynamics. Based on this prior, we propose the Motion and Content decomposed Generative Adversarial Network (MoCoGAN) framework for video generation. The proposed framework generates a video by mapping a sequence of random vectors to a sequence of video frames. Each random vector consists of a content part and a motion part. While the content part is kept fixed, the motion part is realized as a stochastic process.

IamNN: Iterative and Adaptive Mobile Neural Network for Efficient Image Classification

Deep residual networks (ResNets) made a recent breakthrough in deep learning. The core idea of ResNets is to have shortcut connections between layers that allow the network to be much deeper while still being easy to optimize avoiding vanishing gradients. These shortcut connections have interesting side-effects that make ResNets behave differently from other typical network architectures. In this work we use these properties to design a network based on a ResNet but with parameter sharing and with adaptive computation time.

Separating Reflection and Transmission Images in the Wild

The reflections caused by common semi-reflectors, such as glass windows, can impact the performance of computer vision algorithms. State-of-the-art methods can remove reflections on synthetic data and in controlled scenarios. However, they are based on strong assumptions and do not generalize well to real-world images. Contrary to a common misconception, real-world images are challenging even when polarization information is used.

Tackling 3D ToF Artifacts Through Learning and the FLAT Dataset

Scene motion, multiple reflections, and sensor noise introduce artifacts in the depth reconstruction performed by time-of-flight (ToF) cameras.
We propose a two-stage, deep-learning approach to address all of these sources of artifacts simultaneously.
We also introduce FLAT, a synthetic dataset of 2000 ToF measurements that capture all of these nonidealities, and allows to simulate different camera hardware. Using the Kinect 2 camera as a baseline, we show improved reconstruction errors over state-of-the-art methods, on both simulated and real data.


Subscribe to Research RSS