Research

Guided Deep Kernel Learning

Combining Gaussian processes with the expressive power of deep neural networks is now commonly done through deep kernel learning (DKL). Unfortunately, the kernel optimization process often causes the resulting models to lose the Bayesian benefits of Gaussian processes. In this study, we present a novel approach for learning deep kernels by utilizing infinite-width neural networks. We propose to use the Neural Network Gaussian Process (NNGP) model as a guide to the DKL model in the optimization process.

Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models

Text-to-image personalization aims to teach a pre-trained diffusion model to reason about novel, user-provided concepts, embedding them into new scenes guided by natural language prompts. However, current personalization approaches struggle with lengthy training times, high storage requirements, or loss of identity. To overcome these limitations, we propose an encoder-based domain-tuning approach.

Graph Positional Encoding via Random Feature Propagation

Two main families of node feature augmentation schemes have been explored for enhancing GNNs: random features and spectral positional encoding. Surprisingly, however, there is still no clear understanding of the relation between these two augmentation schemes. Here we propose a novel family of positional encoding schemes which draws a link between the above two approaches and improves over both. The new approach, named Random Feature Propagation (RFP), is inspired by the power iteration method and its generalizations.
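
As a rough illustration of the power-iteration idea mentioned above, the following sketch propagates random node features through a normalized adjacency matrix and re-normalizes after every step. It is not the authors' RFP implementation: the function name `propagate_random_features`, the symmetric normalization, the column-wise re-normalization, and the choice to concatenate every intermediate step are all assumptions made for this example.

```python
import numpy as np


def propagate_random_features(adj, num_features=4, num_steps=3, seed=0):
    """Hypothetical sketch: power-iteration-style propagation of random node features."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    rng = np.random.default_rng(seed)

    # Assumed propagation operator: symmetric normalization D^{-1/2} A D^{-1/2}.
    # Isolated nodes (degree 0) keep a zero scaling factor.
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros(n)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    op = inv_sqrt[:, None] * adj * inv_sqrt[None, :]

    # Start from random Gaussian node features.
    x = rng.standard_normal((n, num_features))

    steps = []
    for _ in range(num_steps):
        # One power-iteration step: apply the operator, then rescale each column
        # to unit norm so the features neither blow up nor vanish.
        x = op @ x
        norms = np.linalg.norm(x, axis=0, keepdims=True)
        x = x / np.maximum(norms, 1e-12)
        steps.append(x)

    # Assumption: use all intermediate steps as the positional encoding.
    return np.concatenate(steps, axis=1)


# Usage: a path graph on four nodes, 0 - 1 - 2 - 3.
path_adj = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
)
encoding = propagate_random_features(path_adj)
```

With the defaults, each node receives `num_features * num_steps` values. In plain power iteration, repeated steps pull each column toward a dominant eigenvector of the operator, which is one intuitive way to see how random features and spectral encodings might be connected.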

Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models

Text-to-image (T2I) personalization allows users to guide the creative image generation process by combining their own visual concepts in natural language prompts. Recently, encoder-based techniques have emerged as a new effective approach for T2I personalization, reducing the need for multiple images and long training times. However, most existing encoders are limited to a single-class domain, which hinders their ability to handle diverse concepts.

Breathing Life Into Sketches Using Text-to-Video Priors

A sketch is one of the most intuitive and versatile tools humans use to convey their ideas visually. An animated sketch opens another dimension to the expression of ideas and is widely used by designers for a variety of purposes. Animating sketches is a laborious process, requiring extensive experience and professional design skills. In this work, we present a method that automatically adds motion to a single-subject sketch (hence, "breathing life into it"), merely by providing a text prompt indicating the desired motion.

Xin Dong

Xin Dong received his Ph.D. from Harvard University. He is a recipient of the Harvard James Miller Peirce Fellowship.

He has general research interests in deep learning, with a focus on designing accurate, efficient, and trustworthy systems for autonomous machines, LLMs, and GenAI.

Michael Isaev

Evaluating and Improving Rendered Visual Experiences: Metrics, Compression, Higher Frame Rates & Recoloring

Rendered imagery is presented to us daily. Special effects in movies, video games, scientific visualizations, and marketing catalogs all often rely on images generated through computer graphics. However, all the possibilities that rendering offers also come with a plethora of challenges. This thesis proposes novel ways of evaluating the visual errors caused when some of those challenges are not completely overcome. The thesis also suggests ways to improve the visual experience observers have when viewing rendered content.

Anqi Li

Anqi Li completed her Ph.D. in Computer Science & Engineering at the University of Washington in 2023. Before that, she received her Master's degree from Carnegie Mellon University and her Bachelor's degree from Zhejiang University. Her research focuses on bringing sample efficiency and formal performance guarantees to robot learning. Specific research topics include offline reinforcement learning, safe reinforcement learning, learning from human demonstrations, and planning & control with guarantees.

Reena Elangovan
