FewSOL: A Dataset for Few-Shot Object Learning in Robotic Environments

We introduce the Few-Shot Object Learning (FewSOL) dataset for object recognition with a few images per object. We captured 336 real-world objects with 9 RGBD images per object from different views. FewSOL has object segmentation masks, poses, and attributes. In addition, synthetic images generated using 330 3D object models are used to augment the dataset. We investigated (i) few-shot object classification and (ii) joint object segmentation and few-shot classification with state-of-the-art methods for few-shot learning and meta-learning using our dataset.

Key-Locked Rank One Editing for Text-to-Image Personalization

Text-to-image models (T2I) offer a new level of flexibility by allowing users to guide the creative process through natural language. However, personalizing these models to align with user-provided visual concepts remains a challenging problem. The task of T2I personalization poses multiple hard challenges, such as maintaining high visual fidelity while allowing creative control, combining multiple personalized concepts in a single image, and keeping a small model size.

Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models

Text-to-image personalization aims to teach a pre-trained diffusion model to reason about novel, user provided concepts, embedding them into new scenes guided by natural language prompts. However, current personalization approaches struggle with lengthy training times, high storage requirements or loss of identity. To overcome these limitations, we propose an encoder-based domain-tuning approach.

CALM: Conditional Adversarial Latent Models for Directable Virtual Characters

in this work, we present Conditional Adversarial Latent Models (CALM), an approach for generating diverse and directable behaviors for user-controlled interactive virtual characters. Using imitation learning, CALM learns a representation of movement that captures the complexity and diversity of human motion, and enables direct control over character movements. The approach jointly learns a control policy and a motion encoder that reconstructs key characteristics of a given motion without merely replicating it.

Data Free Learning of Reduced-Order Kinematics

Physical systems ranging from elastic bodies to kinematic linkages are defined on a high-dimensional configuration spaces, yet their typical low-energy configurations are concentrated on much lower-dimensional subspaces. This work addresses the challenge of identifying such subspaces automatically: given as input an energy function for a high-dimensional system, we produce a low-dimensional map whose image parameterizes a diverse yet low-energy submanifold of configurations.

Parallel Inversion of Neural Radiance Fields for Robust Pose Estimation

We present a parallelized optimization method based on fast Neural Radiance Fields (NeRF) for estimating 6-DoF target poses. Given a single observed RGB image of the target, we can predict the translation and rotation of the camera by minimizing the residual between pixels rendered from a fastNeRF model and pixels in the observed image. We integrate a momentum-based camera extrinsic optimization procedure intoInstant Neural Graphics Primitives, a recent exceptionally fastNeRF implementation.

Random-Access Neural Compression of Material Textures

The continuous advancement of photorealism in rendering is accompanied by a growth in texture data and, consequently, increasing storage and memory demands. To address this issue, we propose a novel neural compression technique specifically designed for material textures. We unlock two more levels of detail, i.e., 16× more texels, using low bitrate compression, with image quality that is better than advanced image compression techniques, such as AVIF and JPEG XL.

Walk on Stars: A Grid-Free Monte Carlo Method for PDEs with Neumann Boundary Conditions

Grid-free Monte Carlo methods based on the walk on spheres (WoS) algorithm solve fundamental partial differential equations (PDEs) like the Poisson equation without discretizing the problem domain, nor approximating functions a finite basis. Such methods hence avoid aliasing in the solution, and evade the many challenges of mesh generation. Yet for problems with complex geometry, practical grid-free methods have been largely limited to basic Dirichlet boundary conditions.

Synthesizing Physical Character-Scene Interactions

In this work, we present a system that uses adversarial imitation learning and reinforcement learning to train physically-simulated characters that perform scene interaction tasks in a natural and life-like manner. Our method learns scene interaction behaviors from large unstructured motion datasets, without manual annotation of the motion data. These scene interactions are learned using an adversarial discriminator that evaluates the realism of a motion within the context of a scene.

Interactive Hair Simulation on the GPU Using ADMM

We devise a local–global solver dedicated to the simulation of Discrete Elastic Rods (DER) with Coulomb friction that can fully leverage the massively parallel compute capabilities of moderns GPUs. We verify that our simulator can reproduce analytical results on recently published cantilever, bend–twist, and stick–slip experiments, while drastically decreasing iteration times for high-resolution hair simulations.