Learning Linear Transformations for Fast Image and Video Style Transfer

Given a random pair of images, a universal style transfer method extracts the feel from a reference image to synthesize an output based on the look of a content image. Recent algorithms based on second-order statistics, however, are either computationally expensive or prone to generate artifacts due to the trade-off between image quality and run-time performance. In this work, we present an approach for universal style transfer that learns the transformation matrix in a data-driven fashion.

Throughput-oriented GPU memory allocation

Throughput-oriented architectures, such as GPUs, can sustain three orders of magnitude more concurrent threads than multicore architectures. This level of concurrency pushes typical synchronization primitives (e.g., mutexes) over their scalability limits, creating significant performance bottlenecks in modules, such as memory allocators, that use them.

A Style-Based Generator Architecture for Generative Adversarial Networks

We propose an alternative generator architecture for generative adversarial networks, borrowing from style transfer literature. The new architecture leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images (e.g., freckles, hair), and it enables intuitive, scale-specific control of the synthesis.

Manufacturing Application-Driven Foveated Near-Eye Displays

Traditional optical manufacturing poses a great challenge to near-eye display designers because lead times are on the order of multiple weeks, limiting the ability of optical designers to iterate quickly and explore beyond conventional designs. We present a complete near-eye display manufacturing pipeline with a one-day lead time using commodity hardware.

Improving Temporal Antialiasing with Adaptive Ray Tracing

In this chapter, we discuss a pragmatic approach to real-time supersampling that extends commonly used temporal antialiasing techniques with adaptive ray tracing. The algorithm conforms to the constraints of a commercial game engine, removes blurring and ghosting artifacts associated with standard temporal antialiasing, and achieves quality approaching 16× supersampling of geometry, shading, and materials within the 16 ms frame budget required of most games.

Cool Patches: A Geometric Approach to Ray/Bilinear Patch Intersections

We find intersections between a ray and a nonplanar bilinear patch using simple geometrical constructs. The new algorithm improves on state-of-the-art performance by over 6× and is faster than approximating a patch with two triangles.
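
To make the terms concrete, the following minimal Python sketch evaluates a bilinear patch and tests whether its four corners are coplanar. It is not the intersection algorithm from the chapter. It only illustrates why a two-triangle approximation of a nonplanar patch is inexact. The function names and the corner ordering (`p00`, `p10`, `p01`, `p11`) are assumptions made for this example.

```python
def bilinear_point(p00, p10, p01, p11, u, v):
    """Evaluate the bilinear patch at (u, v), with u and v in [0, 1]."""
    return tuple(
        (1 - u) * (1 - v) * a + u * (1 - v) * b + (1 - u) * v * c + u * v * d
        for a, b, c, d in zip(p00, p10, p01, p11)
    )


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def is_planar(p00, p10, p01, p11, eps=1e-9):
    """The corners are coplanar when the scalar triple product of the
    three edges leaving p00 is (numerically) zero."""
    normal = _cross(_sub(p10, p00), _sub(p01, p00))
    return abs(_dot(normal, _sub(p11, p00))) <= eps


# Hypothetical patch: a unit square with one corner lifted out of the plane.
corners = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 1.0))
center = bilinear_point(*corners, 0.5, 0.5)
```

For this patch the center lies at height 0.25, while the midpoints of the two diagonals (where any two-triangle split would place it) lie at heights 0.5 and 0.0, so neither triangulation reproduces the surface.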

Ray Tracing Gems

This book is a collection of articles focused on ray tracing techniques for serious practitioners. Like other "gems" books, it focuses on subjects commonly considered too advanced for introductory texts, yet rarely addressed by research papers.

Probabilistic AND-OR Attribute Grouping for Zero-Shot Learning

In zero-shot learning (ZSL), a classifier is trained to recognize visual classes without any image samples. Instead, it is given semantic information about the class, like a textual description or a set of attributes. Learning from attributes could benefit from explicitly modeling the structure of the attribute space. Unfortunately, learning general structure from empirical samples is hard with typical dataset sizes. Here we describe LAGO, a probabilistic model designed to capture natural soft and-or relations across groups of attributes.
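
One way to picture a soft and-or relation is sketched below in Python. This is not the LAGO model itself: it assumes, purely for illustration, that a soft OR within a group is a sum of per-attribute match probabilities (treating the attributes in a group as mutually exclusive alternatives) and that a soft AND across groups is a product. The function name, the attribute names, and all numbers are hypothetical.

```python
def and_or_score(attr_probs, class_attrs, groups):
    """Score one class against one image.

    attr_probs:  attribute -> P(attribute | image), from some attribute predictor
    class_attrs: attribute -> P(attribute | class), from the class description
    groups:      list of attribute-name lists that partition the attributes
    """
    score = 1.0
    for group in groups:
        # Soft OR (assumed form): any attribute in the group may explain the match.
        group_match = sum(attr_probs[a] * class_attrs[a] for a in group)
        # Soft AND (assumed form): every group has to match.
        score *= group_match
    return score


# Hypothetical example with two attribute groups: color and beak shape.
groups = [["red", "blue"], ["long_beak", "short_beak"]]
image = {"red": 0.9, "blue": 0.1, "long_beak": 0.2, "short_beak": 0.8}
cardinal = {"red": 1.0, "blue": 0.0, "long_beak": 0.0, "short_beak": 1.0}
blue_jay = {"red": 0.0, "blue": 1.0, "long_beak": 0.0, "short_beak": 1.0}

cardinal_score = and_or_score(image, cardinal, groups)
blue_jay_score = and_or_score(image, blue_jay, groups)
```

Here the image matches both classes on beak shape but only the first on color, so the first class receives the higher score.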

Adaptive Confidence Smoothing for Generalized Zero-Shot Learning

Generalized zero-shot learning (GZSL) is the problem of learning a classifier where some classes have samples and others are learned from side information, like semantic attributes or a text description, in a zero-shot learning (ZSL) fashion. Training a single model that operates in these two regimes simultaneously is challenging. Here we describe a probabilistic approach that breaks the model into three modular components and then combines them in a consistent way. Specifically, our model consists of three classifiers: a "gating" model that makes soft decisions about whether a sample is from a "seen" class, and two experts: a ZSL expert and an expert model for seen classes.

We address two main difficulties in this approach: how to provide an accurate estimate of the gating probability without any training samples for unseen classes, and how to use an expert's predictions when it observes samples outside of its domain. The key insight of our approach is to pass information between the three models to improve each one's accuracy, while maintaining the modular structure. We test our approach, adaptive confidence smoothing (COSMO), on four standard GZSL benchmark datasets and find that it largely outperforms state-of-the-art GZSL models. COSMO is also the first model that closes the gap and surpasses the performance of generative models for GZSL, even though it is a lightweight model that is much easier to train and tune.

Notably, COSMO offers a new view for developing zero-shot models. Thanks to COSMO's modular structure, instead of trying to perform well both on seen and on unseen classes, models can focus on accurate classification of unseen classes, and later consider seen class models.
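
The abstract does not state how the gate and the two experts are combined. One natural reading, assumed here, is the law of total probability: each expert's class distribution is weighted by the gate's soft decision. The Python sketch below shows only that combination step. It does not model the confidence smoothing or the information passing between components, and the function name, class labels, and probabilities are hypothetical.

```python
def combine_experts(p_seen, seen_probs, unseen_probs):
    """Mix two expert distributions using a soft gate.

    p_seen:       gate output, P(sample is from a seen class | x)
    seen_probs:   seen-class label -> P(label | x, seen), from the seen expert
    unseen_probs: unseen-class label -> P(label | x, unseen), from the ZSL expert

    Assumes the seen and unseen label sets are disjoint.
    """
    if not 0.0 <= p_seen <= 1.0:
        raise ValueError("p_seen must be a probability")
    if set(seen_probs) & set(unseen_probs):
        raise ValueError("seen and unseen label sets must be disjoint")

    combined = {}
    for label, p in seen_probs.items():
        combined[label] = p_seen * p
    for label, p in unseen_probs.items():
        combined[label] = (1.0 - p_seen) * p
    return combined


# Hypothetical sample that the gate considers unlikely to be from a seen class.
posterior = combine_experts(
    p_seen=0.25,
    seen_probs={"cat": 0.8, "dog": 0.2},
    unseen_probs={"zebra": 1.0},
)
prediction = max(posterior, key=posterior.get)
```

Because each expert only needs to be accurate within its own domain, either expert can be swapped out without retraining the other, which is the modularity the abstract emphasizes.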