Yashraj Narang

I am a robotics research manager at NVIDIA Research. I lead the Simulation and Behavior Generation (SBG) team within the Seattle Robotics Lab (SRL), which is directed by Dieter Fox. My team currently focuses on learned simulators, automated data generation, reinforcement learning, imitation learning, force and tactile sensing, high-performance control, and sim-to-real transfer. We are exploring applications to multi-step tabletop manipulation, robotic assembly, dexterous manipulation with anthropomorphic hands, and humanoids.

Interactive Stable Ray Tracing

Interactive ray tracing applications running on commodity hardware can suffer from objectionable temporal artifacts due to a low sample count. We introduce stable ray tracing, a technique that improves temporal stability without the over-blurring and ghosting artifacts typical of temporal post-processing filters.
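
To make the failure mode concrete, here is a minimal sketch of the usual baseline, a temporal accumulation filter that blends each new low-sample frame into an exponential moving average of reprojected history; the blend factor and interface are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of the baseline this work improves on: temporal
# accumulation of noisy 1-sample-per-pixel frames. `alpha` is an
# illustrative assumption, not a parameter from the paper.
import numpy as np

def temporal_accumulate(history: np.ndarray,
                        new_frame: np.ndarray,
                        alpha: float = 0.1) -> np.ndarray:
    """Blend a noisy new frame into the accumulated history.

    A low alpha suppresses flicker, but reusing stale history this way
    is what causes the ghosting and over-blurring artifacts that stable
    ray tracing aims to avoid.
    """
    return (1.0 - alpha) * history + alpha * new_frame
```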

Light-Weight Protocols for Wire-Speed Ordering

We describe light-weight protocols for selective packet ordering in out-of-order networks that carry memory traffic.
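
As a hedged illustration of what selective ordering can look like, the sketch below reorders only packets that carry a sequence number and delivers all other traffic as it arrives; the class and field names are hypothetical, not the paper's protocol.

```python
# Hypothetical sketch of selective packet ordering on an out-of-order
# network: sequenced packets are held in a small reorder buffer, while
# unordered traffic bypasses it entirely.
class ReorderBuffer:
    def __init__(self):
        self.expected = 0   # next in-order sequence number
        self.pending = {}   # seq -> packet, held out of order

    def receive(self, seq, packet):
        """Return the list of packets deliverable in order."""
        if seq is None:     # unordered traffic is delivered immediately
            return [packet]
        self.pending[seq] = packet
        delivered = []
        while self.expected in self.pending:
            delivered.append(self.pending.pop(self.expected))
            self.expected += 1
        return delivered
```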

Exploiting Idle Resources in a High-Radix Switch for Supplemental Storage

A general-purpose switch for a high-performance network is usually designed with symmetric ports providing credit-based flow control and error recovery via link-level retransmission. Because port buffers must be sized for the longest links and modern asymmetric network topologies have a wide range of link lengths, we observe that there can be a significant amount of unused buffer memory, particularly in edge switches. We also observe that the tiled architecture used in many high-radix switches contains an abundance of internal bandwidth.
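
A back-of-the-envelope calculation shows why short links strand buffer capacity under credit-based flow control: each port buffer must cover the round-trip bandwidth-delay product of the longest supported link. The numbers below are assumptions for illustration, not measurements from the paper.

```python
# Illustrative buffer-sizing arithmetic (assumed figures, not the
# paper's). Credit-based flow control needs enough buffer to cover
# the bytes in flight over a round trip on the link.
LINK_BW = 50e9 / 8     # 50 Gb/s link, in bytes/s (assumed)
PROP_SPEED = 2e8       # ~2/3 c propagation speed in fiber, m/s

def required_buffer(link_m: float) -> float:
    rtt = 2 * link_m / PROP_SPEED   # round-trip propagation delay
    return LINK_BW * rtt            # bytes in flight to cover

longest = required_buffer(100.0)    # buffer sized for a 100 m link
short = required_buffer(2.0)        # what a 2 m edge link actually needs
print(f"idle fraction: {1 - short / longest:.0%}")   # -> 98%
```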

Phantom Ray-Hair Intersector

We present a new approach to ray tracing swept volumes along trajectories defined by cubic Bézier curves. It performs at about two-thirds of the speed of ray-triangle intersection, allowing such primitives to be treated essentially on par with triangles in ray tracing applications that require hair, fur, or yarn rendering.
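
For context, the primitive in question can be evaluated with the standard De Casteljau construction shown below; this is only the curve evaluation, not the paper's intersection algorithm.

```python
# Standard De Casteljau evaluation of a cubic Bezier curve, the
# center curve of the swept hair/fur primitive.
import numpy as np

def bezier_point(p0, p1, p2, p3, t: float) -> np.ndarray:
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    p0, p1, p2, p3 = map(np.asarray, (p0, p1, p2, p3))
    a = (1 - t) * p0 + t * p1
    b = (1 - t) * p1 + t * p2
    c = (1 - t) * p2 + t * p3
    d = (1 - t) * a + t * b
    e = (1 - t) * b + t * c
    return (1 - t) * d + t * e   # point on the curve
```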

FocusAR: Auto-focus Augmented Reality Eyeglasses for both Real World and Virtual Imagery

We describe a system that dynamically corrects the focus of the real world surrounding the user's near-eye display and, simultaneously, the focus of the internal display for augmented synthetic imagery, with the aim of completely replacing the user's prescription eyeglasses. The ability to adjust focus for both real and virtual stimuli will be useful for a wide variety of users, but especially for users over 40 years of age, who have a limited accommodation range.
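
A small, illustrative accommodation calculation (the figures below are assumptions, not measurements from the paper) shows the kind of correction such a system must supply.

```python
# Illustrative accommodation arithmetic: focusing at distance d meters
# demands 1/d diopters. A presbyopic user with ~1 D of remaining
# accommodation (assumed) cannot focus a 0.4 m object unaided; the
# tunable eyeglasses must supply the difference.
def accommodation_demand(distance_m: float) -> float:
    return 1.0 / distance_m            # demand in diopters

demand = accommodation_demand(0.4)     # reading distance -> 2.5 D
remaining = 1.0                        # assumed presbyopic range
lens_power = max(0.0, demand - remaining)
print(f"tunable lens must add {lens_power:.1f} D")   # -> 1.5 D
```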

A Closed-form Solution to Photorealistic Image Stylization

Photorealistic image stylization concerns transferring the style of a reference photo to a content photo, with the constraint that the stylized photo remain photorealistic. While several photorealistic image stylization methods exist, they tend to generate spatially inconsistent stylizations with noticeable artifacts. In this paper, we propose a method to address these issues. The proposed method consists of a stylization step and a smoothing step.
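
A minimal sketch of the two-step structure is given below. The stylization step is left as a placeholder, and the smoothing step shown is a generic closed-form graph smoother in the manifold-ranking style, offered as an illustration rather than the paper's exact formulation.

```python
# Generic closed-form smoothing over a pixel-affinity graph, standing
# in for the paper's smoothing step.
import numpy as np

def smooth(y: np.ndarray, affinity: np.ndarray, alpha: float = 0.99):
    """Closed-form smoothing: r = (1 - alpha) * (I - alpha * S)^-1 y.

    y:        (n_pixels, 3) colors produced by the stylization step
    affinity: (n_pixels, n_pixels) pixel-similarity matrix built from
              the content photo (assumed symmetric, nonnegative)
    """
    d = affinity.sum(axis=1)
    s = affinity / np.sqrt(np.outer(d, d))   # normalized affinity
    n = s.shape[0]
    return (1 - alpha) * np.linalg.solve(np.eye(n) - alpha * s, y)
```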

Multimodal Unsupervised Image-to-Image Translation

Unsupervised image-to-image translation is an important and challenging problem in computer vision. Given an image in the source domain, the goal is to learn the conditional distribution of corresponding images in the target domain, without seeing any pairs of corresponding images. While this conditional distribution is inherently multimodal, existing approaches make an overly simplified assumption, modeling it as a deterministic one-to-one mapping. As a result, they fail to generate diverse outputs from a given source domain image. To address this limitation, we propose a Multimodal Unsupervised Image-to-image Translation (MUNIT) framework. We assume that the image representation can be decomposed into a content code that is domain-invariant and a style code that captures domain-specific properties. To translate an image to another domain, we recombine its content code with a random style code sampled from the style space of the target domain. We analyze the proposed framework and establish several theoretical results. Extensive experiments with comparisons to state-of-the-art approaches further demonstrate the advantage of the proposed framework. Moreover, our framework allows users to control the style of translation outputs by providing an example style image.
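
The translation rule itself is simple to sketch. Below, tiny linear modules stand in for the paper's encoders and decoders purely to make the data flow concrete; the dimensions and module shapes are assumptions.

```python
# Sketch of the MUNIT translation rule: encode the source image's
# content, sample a style code from the target domain's prior, decode
# the pair. Linear layers are stand-ins for the real networks.
import torch
import torch.nn as nn

D = 64                                  # flattened "image" size, assumed
content_enc = nn.Linear(D, 16)          # domain-invariant content code
decoder_b   = nn.Linear(16 + 8, D)      # target-domain decoder

def translate_a_to_b(x_a: torch.Tensor) -> torch.Tensor:
    c = content_enc(x_a)                # content code from source image
    s_b = torch.randn(x_a.shape[0], 8)  # random style code ~ target prior
    return decoder_b(torch.cat([c, s_b], dim=1))

x_a = torch.randn(1, D)                 # a (flattened) source image
out1, out2 = translate_a_to_b(x_a), translate_a_to_b(x_a)
# different sampled styles -> diverse translations of the same content
```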

Localization-Aware Active Learning for Object Detection

Active learning, a class of algorithms that iteratively searches for the most informative samples to include in a training dataset, has been shown to be effective at annotating data for image classification. However, the use of active learning for object detection remains largely unexplored, as determining the informativeness of an object-location hypothesis is more difficult.
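
For readers unfamiliar with the setting, one round of generic pool-based active learning looks like the sketch below. The entropy scorer and the sklearn-style predict_proba interface are placeholders; the paper's localization-aware score is not reproduced here.

```python
# Generic pool-based active learning round. The informativeness
# measure here is plain predictive entropy, a common classification
# baseline, not the paper's localization-aware score.
import numpy as np

def entropy(probs: np.ndarray) -> float:
    """Placeholder informativeness: predictive entropy."""
    p = np.clip(probs, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def select_queries(model, unlabeled, batch: int = 100):
    """Score every unlabeled sample with the current model and return
    the most informative ones to send for human annotation.
    Assumes a sklearn-style model.predict_proba(x) interface."""
    scored = sorted(unlabeled,
                    key=lambda x: entropy(model.predict_proba(x)),
                    reverse=True)
    return scored[:batch]
```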

Context-aware Synthesis and Placement of Object Instances

Learning to insert an object instance into an image in a semantically coherent manner is a challenging and interesting problem. Solving it requires (a) determining a location at which to place the object in the scene and (b) determining its appearance at that location. Such an object insertion model could facilitate numerous image editing and scene parsing applications. In this paper, we propose an end-to-end trainable neural network for the task of inserting an object instance mask of a specified class into the semantic label map of an image.
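
The (a)/(b) decomposition can be sketched as two cooperating modules, as below; the interfaces and latent size are hypothetical, and the paper trains the full network end to end rather than as separate stages.

```python
# Sketch of the where/what decomposition with hypothetical module
# interfaces: one submodule predicts a plausible location, the other
# synthesizes the instance mask at that location.
import torch
import torch.nn as nn

class ObjectInserter(nn.Module):
    def __init__(self, where_net: nn.Module, what_net: nn.Module):
        super().__init__()
        self.where_net = where_net   # (a) predicts a plausible location
        self.what_net = what_net     # (b) synthesizes the instance mask

    def forward(self, label_map: torch.Tensor, cls: torch.Tensor):
        z = torch.randn(label_map.shape[0], 64)       # assumed latent size
        location = self.where_net(label_map, cls, z)  # placement parameters
        mask = self.what_net(label_map, cls, location, z)
        return location, mask
```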