Novel View Synthesis of Dynamic Scenes with Globally Coherent Depths

This paper presents a new method to synthesize an image from an arbitrary view and time given a collection of images of a dynamic scene. A key challenge for the synthesis arises from dynamic scene reconstruction, where epipolar geometry does not apply to the local motion of dynamic contents. Our insight is that although its scale and quality are inconsistent with other views, the depth estimation from a single view can be used to reason about the geometry of the local motion.
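
The scale inconsistency mentioned above can be illustrated with a minimal sketch. It assumes a few pixels where both a single-view depth and a trusted reference depth (for example, from static regions) are available, and fits a global scale and shift by least squares. This is a common alignment technique, not necessarily the procedure used in the paper; the function name `fit_scale_shift` and the sample values are hypothetical.

```python
def fit_scale_shift(mono, ref):
    """Least-squares fit of ref ~= s * mono + t over paired depth samples.

    mono: single-view depth values (arbitrary scale), hypothetical input
    ref:  reference depth values at the same pixels, hypothetical input
    """
    if len(mono) != len(ref) or len(mono) < 2:
        raise ValueError("need at least two paired samples")
    n = len(mono)
    mean_m = sum(mono) / n
    mean_r = sum(ref) / n
    var_m = sum((m - mean_m) ** 2 for m in mono)
    if var_m == 0:
        raise ValueError("mono depths are constant; scale is undetermined")
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(mono, ref))
    s = cov / var_m           # global scale
    t = mean_r - s * mean_m   # global shift
    return s, t


def apply_scale_shift(mono, s, t):
    """Map single-view depths into the reference depth frame."""
    return [s * m + t for m in mono]


# Illustrative values only: the reference is an exact affine map of mono.
mono_samples = [1.0, 2.0, 4.0, 8.0]
ref_samples = [2.5 * m + 0.3 for m in mono_samples]
scale, shift = fit_scale_shift(mono_samples, ref_samples)
aligned = apply_scale_shift(mono_samples, scale, shift)
```

A single global scale and shift only addresses the scale mismatch; it does not correct the per-pixel quality differences the abstract also mentions.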

Prescription AR: a fully-customized prescription-embedded augmented reality display

In this paper, we present a fully-customized AR display design that considers the user’s prescription, interpupillary distance, and taste in fashion. A free-form image combiner embedded inside the prescription lens provides augmented images onto the vision-corrected real world. The optics were optimized for each prescription level, which can reduce the mass production cost while satisfying the user’s taste. A foveated optimization method was applied, which distributes the pixels in accordance with human visual acuity.

Patch scanning displays: spatiotemporal enhancement for displays

Emerging fields of mixed reality and electronic sports necessitate greater spatial and temporal resolutions in displays. We introduce a novel scanning display method that enhances spatiotemporal qualities of displays. Specifically, we demonstrate that scanning multiple image patches that represent basis functions of each block in a target image can help synthesize spatiotemporally enhanced visuals. To discover the right image patches, we introduce an optimization framework tailored to our hardware.

ML-based Fault Injection for Autonomous Vehicles: A Case for Bayesian Fault Injection

The safety and resilience of fully autonomous vehicles (AVs) are of significant concern, as exemplified by several headline-making accidents. While AV development today involves verification, validation, and testing, end-to-end assessment of AV systems under accidental faults in realistic driving scenarios has been largely unexplored. This paper presents DriveFI, a machine learning-based fault injection engine, which can mine situations and faults that maximally impact AV safety, as demonstrated on two industry-grade AV technology stacks (from NVIDIA and Baidu).

MAGNet: A Modular Accelerator Generator for Neural Networks

Deep neural networks have been adopted in a wide range of application domains, leading to high demand for inference accelerators. However, the high cost associated with ASIC hardware design makes it challenging to build custom accelerators for different targets. To lower design cost, we propose MAGNet, a modular accelerator generator for neural networks.

Joint Optimization for Cooperative Image Captioning

When describing images with natural language, descriptions can be made more informative if tuned for downstream tasks. This can be achieved by training two networks: a “speaker” that generates sentences given an image and a “listener” that uses them to perform a task. Unfortunately, training multiple networks jointly to communicate faces two major challenges. First, the descriptions generated by a speaker network are discrete and stochastic, making optimization very hard and inefficient.

The Even/Odd Synchronizer: A Fast, All-Digital Periodic Synchronizer

We describe an all-digital synchronizer that moves multi-bit signals between two periodic clock domains with an average delay of slightly more than a half cycle and an arbitrarily small probability of synchronization failure. The synchronizer operates by measuring the relative frequency of the two periodic clocks and using this frequency measurement, along with a phase detection, to compute a phase estimate. Interval arithmetic is used for the phase estimate to account for uncertainty.
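
The phase-estimate idea above can be sketched in a few lines. The sketch assumes phase is measured in transmitter-clock cycles, that the frequency measurement yields a per-receiver-cycle phase step known only to within an interval, and that a keep-out window of half-width `guard` surrounds the transmitter clock edge. The names `Interval`, `advance`, and `may_be_unsafe` and all numeric values are hypothetical; this illustrates interval arithmetic on a phase estimate and is not the paper's circuit.

```python
import math
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] bounding an uncertain quantity."""
    lo: Fraction
    hi: Fraction

    def __add__(self, other):
        # Interval addition: the bounds add, so uncertainty accumulates.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    @property
    def width(self):
        return self.hi - self.lo


def advance(phase, step):
    """Advance the phase estimate by one receiver cycle.

    The result is wrapped so that lo lies in [0, 1); hi may exceed 1
    when the interval straddles a transmitter clock edge.
    """
    s = phase + step
    wraps = math.floor(s.lo)
    return Interval(s.lo - wraps, s.hi - wraps)


def may_be_unsafe(phase, guard):
    """True if any phase in the interval lies within guard of a clock edge.

    Assumes phase.lo is in [0, 1), as produced by advance().
    """
    return phase.lo < guard or phase.hi > 1 - guard


# Hypothetical numbers: the step is known to lie in [0.30, 0.31] cycles.
step = Interval(Fraction(3, 10), Fraction(31, 100))
# A phase detection pins the phase to within 0.01 cycle of the edge.
detected = Interval(Fraction(0), Fraction(1, 100))
guard = Fraction(1, 20)

phase = detected
for _ in range(4):
    phase = advance(phase, step)
```

Each call to `advance` widens the estimate by the width of `step`, which is why a fresh phase detection is needed periodically to narrow the interval again.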

NRMVS: Non-Rigid Multi-view Stereo

Multi-view Stereo (MVS) is a common solution in photogrammetry applications for the dense reconstruction of a static scene from images. The static scene assumption, however, limits the general applicability of MVS algorithms, as many day-to-day scenes undergo non-rigid motion, e.g., clothes, faces, or human bodies. In this paper, we open up a new challenging direction: dense 3D reconstruction of scenes with non-rigid changes observed from a small number of images sparsely captured from different views with a single monocular camera, which we call the non-rigid multi-view stereo (NRMVS) problem.