Research

Learning Physically Simulated Tennis Players from Broadcast Videos

Motion capture (mocap) data has been the most popular data source for computer animation techniques that combine deep reinforcement learning and motion imitation to produce lifelike motions and perform diverse skills. However, mocap data for specialized skills can be costly to acquire at scale, while an enormous corpus of athletic motion data exists in the form of video recordings.

Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion

We introduce a method for generating realistic pedestrian trajectories and full-body animations that can be controlled to meet user-defined goals. We draw on recent advances in guided diffusion modeling to achieve test-time controllability of trajectories, which is normally only associated with rule-based systems. Our guided diffusion model allows users to constrain trajectories through target waypoints, speed, and specified social groups while accounting for the surrounding environment context.

A 0.297-pJ/Bit 50.4-Gb/s/Wire Inverter-Based Short-Reach Simultaneous Bi-Directional Transceiver for Die-to-Die Interface in 5-nm CMOS

This article presents a clock-forwarded, inverter-based short-reach simultaneous bi-directional (ISR-SBD) physical layer (PHY) targeted for die-to-die communication over silicon interposers or similar high-density interconnect. Short-reach links of this type are increasingly important to support larger systems built with chiplets and multiple die and to facilitate the shift to medium- and long-range optical communication based on silicon photonics. This project explores the advantages of simultaneous bi-directional signaling (SBD) over other bandwidth-doubling techniques (e.g., PAM4).

Luminance-Preserving and Temporally Stable Daltonization

We propose a novel, real-time algorithm for recoloring images to improve the experience for a color vision deficient observer. The output is temporally stable and preserves luminance, the most important visual cue. It runs in 0.2 ms per frame on a GPU.

Object Pose Estimation with Statistical Guarantees: Conformal Keypoint Detection and Geometric Uncertainty Propagation

The two-stage object pose estimation paradigm first detects semantic keypoints on the image and then estimates the 6D pose by minimizing reprojection errors. Despite performing well on standard benchmarks, existing techniques offer no provable guarantees on the quality and uncertainty of the estimation. In this paper, we inject two fundamental changes, namely conformal keypoint detection and geometric uncertainty propagation, into the two-stage paradigm and propose the first pose estimator that endows an estimation with provable and computable worst-case error bounds.

FreeNeRF: Improving Few-shot Neural Rendering with Free Frequency Regularization

Novel view synthesis with sparse inputs is a challenging problem for neural radiance fields (NeRF). Recent efforts alleviate this challenge by introducing external supervision, such as pre-trained models and extra depth signals, and by non-trivial patch-based rendering. In this paper, we present Frequency regularized NeRF (FreeNeRF), a surprisingly simple baseline that outperforms previous methods with minimal modifications to the plain NeRF. We analyze the key challenges in few-shot neural rendering and find that frequency plays an important role in NeRF’s training.

A 95.6-TOPS/W Deep Learning Inference Accelerator With Per-Vector Scaled 4-bit Quantization in 5 nm

The energy efficiency of deep neural network (DNN) inference can be improved with custom accelerators. DNN inference accelerators often employ specialized hardware techniques to improve energy efficiency, but many of these techniques result in catastrophic accuracy loss on transformer-based DNNs, which have become ubiquitous for natural language processing (NLP) tasks. This article presents a DNN accelerator designed for efficient execution of transformers.

Heterogeneous-Agent Trajectory Forecasting Incorporating Class Uncertainty

Reasoning about the future behavior of other agents is critical to safe robot navigation. The multiplicity of plausible futures is further amplified by the uncertainty inherent to agent state estimation from data, including positions, velocities, and semantic class. Forecasting methods, however, typically neglect class uncertainty, conditioning instead only on the agent's most likely class, even though perception models often return full class distributions.

On Legalization of Die Bonding Bumps and Pads for 3D ICs

State-of-the-art 3D IC Place-and-Route flows were designed with older technology nodes and aggressive bonding pitch assumptions. As a result, these flows fail to honor the width and spacing rules for the 3D vias with realistic pitch values. We propose a critical new 3D via legalization stage during routing to reduce such violations. A force-based solver and bipartite-matching algorithm with Bayesian optimization are presented as viable legalizers and are compatible with various process nodes, bonding technologies, and partitioning types.