Video-to-Video Synthesis

We study the problem of video-to-video synthesis, whose goal is to learn a mapping function from an input source video (e.g., a sequence of semantic segmentation masks) to an output photorealistic video that precisely depicts the content of the source video. While its image counterpart, the image-to-image synthesis problem, is a popular topic, the video-to-video synthesis problem is less explored in the literature.

Machine Learning and Rendering

Machine learning techniques have recently enabled dramatic improvements in both real-time and offline rendering. In this course, we introduce the basic principles of machine learning and review their relation to rendering. Besides fundamental results, such as the mathematical identity between reinforcement learning and the rendering equation, we cover efficient and surprisingly elegant solutions to light transport simulation, participating media, noise removal, and anti-aliasing.
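The identity between reinforcement learning and the rendering equation can be sketched as follows (a hedged outline using the standard textbook forms of both equations, not the course notes themselves):

```latex
% Rendering equation: radiance L leaving point x in direction \omega
L(x,\omega) = L_e(x,\omega)
  + \int_{S^2} f_r(\omega_i, x, \omega)\,
      L\big(h(x,\omega_i), -\omega_i\big)\, \cos\theta_i \, d\omega_i

% Fixed point of (expected-SARSA-style) Q-learning
Q(s,a) = r(s,a) + \gamma \int \pi(a' \mid s')\, Q(s',a')\, da'
```

Identifying the state s with the position x, the action a with the direction omega, the reward r with the emitted radiance L_e, and the discounted policy gamma*pi with the cosine-weighted BSDF f_r*cos(theta_i) makes the two fixed-point equations match term by term.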

Gal Chechik

Gal Chechik is a Senior Director of AI Research, leading NVIDIA's research in Israel.

Gal is also a Professor of computer science at Bar-Ilan University. Before joining NVIDIA, he was a Staff Research Scientist at Google, a postdoctoral research associate at Stanford University, and received his PhD from the Hebrew University of Jerusalem. Gal has published roughly 140 papers, including publications in Nature Biotechnology, Cell, and PNAS, and holds 50 issued patents. His work has won outstanding-paper awards at NeurIPS and ICML.

Jan Novák

Jan is a senior research scientist working on topics at the intersection of rendering and machine learning. He joined NVIDIA Research in 2018 after five years at Disney Research, where he led the Rendering Group and worked on various techniques for efficient rendering and physically based simulation of light transport. He initiated and led several projects employing deep learning for image denoising, radiance estimation, and importance sampling. He also developed a number of techniques for rendering scenes with participating media.

Statistical Nearest Neighbors for Image Denoising

Non-local-means image denoising is based on processing a set of neighbors for a given reference patch. A few nearest neighbors (NN) can be used to limit the computational burden of the algorithm. Using a toy problem, we show analytically that sampling neighbors with the NN approach introduces a bias in the denoised patch. To alleviate this issue, we propose a different criterion for collecting neighbors, which we name statistical NN (SNN). Our approach outperforms the traditional one in the case of both white and colored noise: fewer SNNs can be used to generate images of superior quality, at a lower computational cost. A detailed investigation of our toy problem explains the differences between NN and SNN from a grounded point of view. The intuition behind SNN is quite general, and it also improves image quality in the case of bilateral filtering. The MATLAB code to replicate the results presented in the paper is freely available at https://github.com/NVlabs/SNN.
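The selection criterion can be sketched as follows. This is an illustrative NumPy sketch, not the authors' MATLAB code: classic NN keeps the candidates with the smallest distance to the reference patch, while SNN keeps the candidates whose distance is closest to the distance one would expect between two noisy realizations of the same clean patch (2*sigma^2 per pixel under i.i.d. Gaussian noise), which counteracts the bias toward patches whose noise happens to match the reference's noise.

```python
import numpy as np

def collect_neighbors_snn(ref_patch, candidate_patches, k, sigma):
    """Pick k neighbors of ref_patch for non-local-means averaging.

    Illustrative sketch of the statistical-NN idea (not the paper's code):
    instead of minimizing the patch distance itself, minimize its deviation
    from the distance expected between two noisy copies of the same clean
    patch, i.e. 2 * sigma^2 per pixel for i.i.d. Gaussian noise.
    """
    n = ref_patch.size
    # squared L2 distance of each candidate to the reference patch
    d2 = np.sum((candidate_patches - ref_patch) ** 2, axis=1)
    # expected squared distance between two noisy realizations of one patch
    expected = 2.0 * sigma**2 * n
    # SNN: rank candidates by how close their distance is to the expectation
    order = np.argsort(np.abs(d2 - expected))
    return order[:k]
```

For comparison, the classic NN criterion would simply be `np.argsort(d2)[:k]`; the only change is which statistic of the distance is minimized.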

Kayotee: A Fault Injection-based System to Assess the Safety and Reliability of Autonomous Vehicles to Faults and Errors

Fully autonomous vehicles (AVs), i.e., AVs with autonomy level 5, are expected to dominate road transportation in the near future and contribute trillions of dollars to the global economy. The general public, government organizations, and manufacturers all have significant concerns regarding the resiliency and safety standards of the autonomous driving system (ADS) of AVs.

Deep Object Pose Estimation for Semantic Robotic Grasping of Household Objects

Using synthetic data for training deep neural networks for robotic manipulation holds the promise of an almost unlimited amount of pre-labeled training data, generated safely out of harm's way. One of the key challenges of synthetic data, to date, has been to bridge the so-called "reality gap", so that networks trained on synthetic data operate correctly when exposed to real-world data. We explore the reality gap in the context of 6-DoF pose estimation of known objects from a single RGB image.

Superpixel Sampling Networks

Superpixels provide an efficient low/mid-level representation of image data, which greatly reduces the number of image primitives for subsequent vision tasks. While various superpixel computation models exist, they are not differentiable, making them difficult to integrate into otherwise end-to-end trainable deep neural networks. In this work, we develop a new differentiable model for superpixel sampling that better leverages deep networks for learning superpixel segmentation.
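The core obstacle to differentiability is that hard superpixel methods assign each pixel to its single nearest cluster center, a non-differentiable argmin. A minimal sketch of the standard workaround, replacing the argmin with a softmax over negative distances so that gradients can flow through the pixel-to-superpixel association (illustrative only, with made-up shapes and a hypothetical temperature parameter `tau`; not the paper's implementation):

```python
import numpy as np

def soft_superpixel_assignment(features, centers, tau=1.0):
    """Differentiable pixel-to-superpixel association.

    features: (P, D) per-pixel feature vectors.
    centers:  (S, D) superpixel center features.
    Returns a (P, S) matrix of soft assignment weights, one row per pixel,
    each row summing to 1. A hard method would take argmin over distances;
    the softmax relaxation below is what makes the assignment trainable.
    """
    # squared distance from every pixel to every center: (P, S)
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2 / tau
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)
```

Given the soft weights `w`, updated centers follow as the weighted feature means, `(w.T @ features) / w.sum(axis=0)[:, None]`, so the whole clustering step stays differentiable end to end.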