Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation

Given two consecutive frames, video interpolation aims at generating intermediate frame(s) to form both spatially and temporally coherent video sequences. While most existing methods focus on single-frame interpolation, we propose an end-to-end convolutional neural network for variable-length multi-frame video interpolation, where the motion interpretation and occlusion reasoning are jointly modeled. We start by computing bi-directional optical flow between the input images using a U-Net architecture.
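The flow-combination idea behind the intermediate frame synthesis can be sketched in NumPy. This is a hedged illustration, not the paper's implementation: the nearest-neighbor warp, the helper names, and the omission of the learned flow refinement and visibility maps are all simplifications.

```python
import numpy as np

def warp(img, flow):
    """Backward-warp img by a per-pixel flow field (nearest-neighbor
    sampling; the actual network uses differentiable bilinear warping)."""
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    sx = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    sy = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    return img[sy, sx]

def intermediate_frame(img0, img1, f01, f10, t):
    """Synthesize the frame at time t in (0, 1): linearly combine the
    bidirectional flows to approximate the flows from time t back to each
    input, then warp and blend (the full model also refines these flows
    and predicts per-pixel visibility maps for occlusion reasoning)."""
    ft0 = -(1.0 - t) * t * f01 + t * t * f10          # flow from t to frame 0
    ft1 = (1.0 - t) ** 2 * f01 - t * (1.0 - t) * f10  # flow from t to frame 1
    return (1.0 - t) * warp(img0, ft0) + t * warp(img1, ft1)
```

Because t is a free parameter, the same pair of input flows yields arbitrarily many intermediate frames, which is what makes the method variable-length.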

Charles Loop

Charles Loop is a principal research scientist in the visual computing research group at NVIDIA Research. He has spent most of his career as a research scientist working on computer graphics for companies such as Apple and Microsoft. He is best known as the inventor of Loop subdivision, an algorithm for creating smooth shapes that is used in areas such as medical imaging, special effects, and video games. Charles has been programming graphics processing units (GPUs) for many years, and his work appears in many academic publications and digital content creation applications, such as font and surface rendering, including Pixar's OpenSubdiv library. More recently, he initiated a project at Microsoft called Holoportation, which was well received in the computer vision and graphics communities. The project demonstrated a two-way telepresence system that allows users wearing augmented reality (AR) display devices such as HoloLens to interact with each other's photo-realistic 3D holograms in real time while physically separated by thousands of miles. Most recently, Charles was Chief Scientist at 8i, a startup whose mission is to deliver photo-realistic human holograms. Charles holds an M.S. in Mathematics from the University of Utah and a Ph.D. in Computer Science from the University of Washington. He is located in Redmond.




HGMR: Hierarchical Gaussian Mixtures for Adaptive 3D Registration

Point cloud registration sits at the core of many important and challenging 3D perception problems including autonomous navigation, SLAM, object/scene recognition, and augmented reality. In this paper, we present a new registration algorithm that is able to achieve state-of-the-art speed and accuracy through its use of a hierarchical Gaussian Mixture Model (GMM) representation. Our method constructs a top-down multi-scale representation of point cloud data by recursively running many small-scale data likelihood segmentations in parallel on a GPU.
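The top-down recursion can be illustrated with a toy version: fit a small GMM at each node, then recurse into the points each component claims. The sketch below is a serial CPU stand-in under simplifying assumptions of my own (isotropic covariances, farthest-point initialization, a fixed recursion depth), not the paper's anisotropic, GPU-parallel construction.

```python
import numpy as np

def fit_gmm(points, k=2, iters=25):
    """Tiny EM for an isotropic k-component GMM with farthest-point
    initialization (a simplified stand-in for the paper's mixtures)."""
    n, d = points.shape
    # Farthest-point initialization of the component means.
    idx = [0]
    for _ in range(k - 1):
        dist = ((points[:, None] - points[idx]) ** 2).sum(-1).min(1)
        idx.append(int(dist.argmax()))
    mu = points[idx].astype(float)
    var = np.full(k, points.var() + 1e-6)
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibilities under isotropic Gaussians.
        d2 = ((points[:, None, :] - mu[None]) ** 2).sum(-1)
        logp = -0.5 * d2 / var - 0.5 * d * np.log(var) + np.log(pi)
        logp -= logp.max(1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(1, keepdims=True)
        # M-step: update weights, means, and (isotropic) variances.
        nk = r.sum(0) + 1e-9
        pi = nk / n
        mu = (r[:, :, None] * points[:, None, :]).sum(0) / nk[:, None]
        d2 = ((points[:, None, :] - mu[None]) ** 2).sum(-1)
        var = (r * d2).sum(0) / (d * nk) + 1e-6
    return mu, var, r.argmax(1)

def build_hierarchy(points, max_depth=3, min_points=20, depth=0):
    """Top-down recursion: segment this node's points with a small GMM,
    then recurse into the subset each component claims. In the paper these
    many small segmentations run in parallel on a GPU."""
    node = {"mean": points.mean(0), "count": len(points), "children": []}
    if depth >= max_depth or len(points) < min_points:
        return node
    _, _, labels = fit_gmm(points)
    for c in np.unique(labels):
        node["children"].append(
            build_hierarchy(points[labels == c], max_depth, min_points, depth + 1))
    return node
```

The resulting tree is the multi-scale representation: coarse components near the root, finer ones at the leaves, so registration can adapt the scale it works at.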

Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation

Estimation of 3D motion in a dynamic scene from a temporal pair of images is a core task in many scene understanding problems. In real world applications, a dynamic scene is commonly captured by a moving camera (i.e., panning, tilting or hand-held), increasing the task complexity because the scene is observed from different view points. The main challenge is the disambiguation of the camera motion from scene motion, which becomes more difficult as the amount of rigidity observed decreases, even with successful estimation of 2D image correspondences.
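One way to make the camera/scene ambiguity concrete: if per-pixel depth and the camera's ego-motion were known, the 2D flow induced by the camera alone could be synthesized and subtracted from the observed flow, leaving only scene motion. The sketch below is an illustrative NumPy construction of that rigid-background flow, not the paper's method; the intrinsics K and relative pose (R, t) are assumed given.

```python
import numpy as np

def rigid_flow(depth, K, R, t):
    """2D flow induced purely by camera motion (R, t) on a static scene,
    given per-pixel depth and camera intrinsics K. Subtracting this from
    the observed optical flow leaves the non-rigid (scene-motion) part."""
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    pix = np.stack([xs, ys, np.ones_like(xs)], 0).reshape(3, -1)
    # Back-project each pixel to 3D, apply the camera motion, re-project.
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)
    pts2 = R @ pts + t.reshape(3, 1)
    proj = K @ pts2
    proj = proj[:2] / proj[2:]
    flow = (proj - pix[:2]).reshape(2, h, w)
    return np.moveaxis(flow, 0, -1)  # (h, w, 2)
```

The residual after subtraction shrinks to zero exactly on the rigid background, which is why a per-pixel rigidity estimate is so useful for isolating true scene motion.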

Making Convolutional Networks Recurrent for Visual Sequence Learning

Recurrent neural networks (RNNs) have emerged as a powerful model for a broad range of machine learning problems that involve sequential data. While an abundance of work exists to understand and improve RNNs in the context of language and audio signals, such as language modeling and speech recognition, relatively little attention has been paid to analyzing or modifying RNNs for visual sequences, which by nature have distinct properties. In this paper, we aim to bridge this gap and present the first large-scale exploration of RNNs for visual sequence learning.

Adaptive Temporal Antialiasing

We introduce a pragmatic algorithm for real-time adaptive supersampling in games. It extends temporal antialiasing of rasterized images with adaptive ray tracing, and conforms to the constraints of a commercial game engine and today's GPU ray tracing APIs. The algorithm removes blurring and ghosting artifacts associated with standard temporal antialiasing and achieves quality approaching 8X supersampling of geometry, shading, and materials while staying within the 33ms frame budget required of most games.
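The adaptive idea can be illustrated with a toy single-channel resolve: blend with history where a local color-bounds test accepts it, and substitute a freshly computed high-quality sample where it fails. This is my own simplified stand-in; the paper forms its failure mask differently and fills failed pixels by ray tracing inside the engine rather than reading from a precomputed image.

```python
import numpy as np

def neighborhood_bounds(img):
    """Per-pixel min/max over a 3x3 neighborhood (edge-padded)."""
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    shifts = np.stack([p[i:i + h, j:j + w] for i in range(3) for j in range(3)])
    return shifts.min(0), shifts.max(0)

def adaptive_resolve(current, history, fresh, alpha=0.1):
    """Where the reprojected history lies inside the current frame's local
    color bounds, apply the usual exponential TAA blend; elsewhere treat
    the history as failed (ghosting/disocclusion) and take the freshly
    computed sample, standing in for the adaptively ray-traced pixels."""
    lo, hi = neighborhood_bounds(current)
    failed = (history < lo) | (history > hi)
    blended = alpha * current + (1.0 - alpha) * history
    return np.where(failed, fresh, blended)
```

Because the expensive fallback runs only where the mask fires, the cost scales with the area of failed pixels, which is what lets the technique stay inside a fixed frame budget.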

