Reinforcement Learning

Non-rectangular Robust MDPs with Normed Uncertainty Sets

On the Convergence of Single-Timescale Actor-Critic

Policy Optimized Text-to-Image Pipeline Design

State Entropy Regularization for Robust Reinforcement Learning

RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression

Video encoders optimize compression for human perception by minimizing reconstruction error under bit-rate constraints. In many modern applications such as autonomous driving, an overwhelming majority of videos serve as input for AI systems …

Gradient Boosting Reinforcement Learning

Policy Gradient via Tree Expansion

Real-Time Rate Control for Task-Aware Video Compression Using Reinforcement Learning

Global Convergence of Policy Gradient in Average Reward MDPs

MaskedMimic: Unified Physics-Based Character Control Through Masked Motion Inpainting

We introduce MaskedMimic, a single unified controller for physically simulated humanoids. Our system is capable of generating a wide range of motions across diverse terrains from intuitive user-defined intents. In this work, we show several …