Auto Discover /skills for Robots

Method: How ASPIRE Works

ASPIRE is a self-improving continual learning system for robotics that writes and refines code-as-policy robot control programs from execution feedback. It inspects rollout traces, diagnoses failures, repairs programs, validates corrected behaviors, and saves reusable skills for future tasks. ASPIRE operates in an open-ended learning loop with three key components: a closed-loop robot execution engine, a continually expanding skill library, and an evolutionary search procedure.

Closed-Loop Robot Execution Engine

For each perception, planning, grasping, and control call, the engine records the observations, inputs, outputs, and, where available, visual evidence. These rich multimodal traces allow the agent to selectively inspect salient primitive logs, progressively localize failures, and validate repairs through re-execution.
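
As a rough illustration of per-call trace recording, the sketch below wraps a primitive in a decorator that logs its inputs, output, any error, and an optional image path. It is not ASPIRE's engine: `TraceEntry`, `Trace`, `traced`, the `snapshot` hook, and the toy `grasp` primitive are all hypothetical names chosen for this example.

```python
import functools
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class TraceEntry:
    """One logged primitive call (hypothetical schema, not ASPIRE's)."""
    primitive: str
    inputs: Dict[str, Any]
    output: Any
    error: Optional[str]
    image_path: Optional[str]  # visual evidence, if a snapshot hook was given
    duration_s: float


@dataclass
class Trace:
    entries: List[TraceEntry] = field(default_factory=list)

    def failures(self) -> List[TraceEntry]:
        """Return only the calls that raised, to help localize a failure."""
        return [e for e in self.entries if e.error is not None]


def traced(trace: Trace, name: str,
           snapshot: Optional[Callable[[], str]] = None) -> Callable:
    """Wrap a keyword-argument primitive so every call is appended to `trace`."""
    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(**kwargs: Any) -> Any:
            start = time.perf_counter()
            output, error = None, None
            try:
                output = fn(**kwargs)
                return output
            except Exception as exc:
                error = f"{type(exc).__name__}: {exc}"
                raise
            finally:
                # Runs on success and on failure, so no call goes unlogged.
                trace.entries.append(TraceEntry(
                    primitive=name,
                    inputs=dict(kwargs),
                    output=output,
                    error=error,
                    image_path=snapshot() if snapshot else None,
                    duration_s=time.perf_counter() - start,
                ))
        return wrapper
    return decorator


trace = Trace()


@traced(trace, "grasp")
def grasp(object_name: str, width_m: float) -> bool:
    """Toy primitive: fails when asked for a non-positive gripper width."""
    if width_m <= 0:
        raise ValueError("gripper width must be positive")
    return True


grasp(object_name="mug", width_m=0.08)
try:
    grasp(object_name="mug", width_m=0.0)
except ValueError:
    pass  # the failure is still recorded in the trace
```

After the two calls, `trace.failures()` returns only the failed grasp, which is the kind of selective inspection the paragraph above describes.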

Continually Expanding Skill Library

ASPIRE maintains a growing skill library that distills validated fixes into modular, transferable robotic knowledge retrievable as in-context guidance for future tasks.
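
To make the store-then-retrieve idea concrete, here is a minimal sketch of a skill library that accepts only validated entries and returns the best matches for a new task description. The `Skill` and `SkillLibrary` classes and the word-overlap ranking are assumptions made for illustration; the source does not say how ASPIRE stores or retrieves skills.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Skill:
    name: str
    guidance: str  # text that would be placed in the agent's context


def _tokens(text: str) -> Set[str]:
    return set(text.lower().split())


class SkillLibrary:
    def __init__(self) -> None:
        self._skills: List[Skill] = []

    def __len__(self) -> int:
        return len(self._skills)

    def add(self, skill: Skill, validated: bool) -> bool:
        """Store a skill only if its fix was validated and its name is new."""
        if not validated or any(s.name == skill.name for s in self._skills):
            return False
        self._skills.append(skill)
        return True

    def retrieve(self, task: str, k: int = 2) -> List[Skill]:
        """Rank skills by word overlap with the task (a simple stand-in)."""
        query = _tokens(task)
        scored = [
            (len(query & _tokens(s.name + " " + s.guidance)), s)
            for s in self._skills
        ]
        scored = [(score, s) for score, s in scored if score > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [s for _, s in scored[:k]]


library = SkillLibrary()
library.add(Skill("align gripper before drawer pull",
                  "approach the drawer handle along its normal"), validated=True)
library.add(Skill("retry grasp with wider opening",
                  "open the gripper wider when the grasp slips"), validated=True)
library.add(Skill("unverified idea", "spin the drawer"), validated=False)

hits = library.retrieve("pull open the top drawer", k=1)
```

Here the unvalidated entry is rejected, and the drawer-related skill ranks first for the drawer task. A real system would likely use a stronger retriever, such as embeddings or an LLM, in place of word overlap.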

Evolutionary Search Over Programs

ASPIRE employs an evolutionary search procedure that generates diverse task sequences and control programs, exploring beyond single-trajectory self-improvement through iterative debugging and parallel refinement.
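
The general shape of such a search can be sketched with a toy evolutionary loop. This is an illustration under strong assumptions, not ASPIRE's procedure: a "program" is reduced to a pair of numbers, `toy_mutate` stands in for an LLM proposing a program edit, and `toy_fitness` stands in for rollout success. All names are hypothetical.

```python
import random
from typing import Callable, List, Tuple

Program = Tuple[float, float]  # toy stand-in for a control program


def evolve(
    fitness: Callable[[Program], float],
    mutate: Callable[[Program, random.Random], Program],
    seeds: List[Program],
    generations: int = 20,
    population_size: int = 8,
    rng_seed: int = 0,
) -> Tuple[Program, float]:
    """Each generation: mutate random parents, then keep the fittest half."""
    rng = random.Random(rng_seed)
    population = list(seeds)
    for _ in range(generations):
        children = [
            mutate(rng.choice(population), rng) for _ in range(population_size)
        ]
        # Elitism: parents compete with children, so the best score never drops.
        pool = population + children
        pool.sort(key=fitness, reverse=True)
        population = pool[: max(1, population_size // 2)]
    best = population[0]
    return best, fitness(best)


TARGET: Program = (0.3, 0.1)


def toy_fitness(p: Program) -> float:
    """Higher is better: negative squared distance to a target setting."""
    return -((p[0] - TARGET[0]) ** 2 + (p[1] - TARGET[1]) ** 2)


def toy_mutate(p: Program, rng: random.Random) -> Program:
    """Perturb each parameter with small Gaussian noise."""
    return (p[0] + rng.gauss(0.0, 0.05), p[1] + rng.gauss(0.0, 0.05))


start: Program = (0.0, 0.0)
best, best_score = evolve(toy_fitness, toy_mutate, seeds=[start])
```

Because several candidates survive each generation, the search explores more than a single repair trajectory. The children in one generation are independent of each other, so they could be evaluated in parallel.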

Skill Library Gallery

Explore the expansive skill library learned by ASPIRE. The gallery is organized into 18 categories, each containing multiple skills. This is a representative sample, not an exhaustive list.

Benchmark Results

Main Evaluation Results

Across all benchmarks, an environment seed fixes each task instance, including object poses, distractors, and initial robot/object states. We use disjoint debug and evaluation seeds: ASPIRE learns on a small debug split, then reports success on a larger set of held-out evaluation seeds. On LIBERO-Pro and Robosuite, ASPIRE uses one generated program per task, whereas CaP-Agent0 regenerates a separate program for each seed with test-time reasoning and retries. For BEHAVIOR-1K evaluation, ASPIRE uses incremental block execution, generating each next code block from the current multimodal trace.
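
The seed protocol can be sketched as follows: split the seeds into disjoint debug and evaluation sets, then score a single fixed program on the held-out seeds only. The function names, the split sizes, and the toy `run_episode` are assumptions for illustration, not the evaluation harness used in the paper.

```python
from typing import Callable, List, Sequence, Tuple


def split_seeds(seeds: Sequence[int], n_debug: int) -> Tuple[List[int], List[int]]:
    """Use the first n_debug unique seeds for debugging; hold out the rest."""
    unique = list(dict.fromkeys(seeds))  # drop duplicates, keep order
    if not 0 < n_debug < len(unique):
        raise ValueError("need at least one debug seed and one evaluation seed")
    return unique[:n_debug], unique[n_debug:]


def success_rate(run_episode: Callable[[int], bool], seeds: Sequence[int]) -> float:
    """Fraction of seeds on which the fixed program succeeds."""
    if not seeds:
        raise ValueError("no seeds to evaluate")
    return sum(1 for s in seeds if run_episode(s)) / len(seeds)


def run_episode(seed: int) -> bool:
    """Toy stand-in for running one program on one seeded task instance."""
    return seed % 5 != 0


debug_seeds, eval_seeds = split_seeds(list(range(25)), n_debug=5)
held_out_rate = success_rate(run_episode, eval_seeds)
```

In this toy setup the program fails on 4 of the 20 held-out seeds, so `held_out_rate` is 0.8. The debug seeds never enter the reported number.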

LIBERO-PRO

ASPIRE writes control programs that generalize across object types, object positions, and task details.

Robosuite

ASPIRE handles contact-heavy tabletop manipulation tasks. Most notably, iterative debugging raised the two-arm handover success rate from a 20% baseline to 92%.

BEHAVIOR-1K

For long-horizon mobile manipulation, we show navigation separately from task completion.

Zero-Shot Transfer of Skill Library

Scaling with Skill-Library Size · LIBERO-Pro Long

Skills learned across LIBERO-90 carry over to held-out long-horizon tasks. As the skill library grows, coding agents achieve a higher success rate on LIBERO-Long.

Real-Robot Cross-Embodiment Skill Transfer

For each task, we compare real-robot debugging with and without retrieving a corresponding simulation-discovered skill from ASPIRE’s library. Token counts are measured until the first successful real-robot program. Success Rate reports held-out evaluation trials of the code that achieved that first success. When equipped with ASPIRE skills, the coding agent reaches its first success with fewer tokens, and the generated code achieves a higher success rate.
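
The token metric can be read as a running sum that stops at the first successful attempt. The sketch below assumes that "total tokens" means input plus output tokens, which the source does not state; the `Attempt` record and the function name are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Attempt:
    input_tokens: int
    output_tokens: int
    succeeded: bool


def tokens_to_first_success(
    attempts: Iterable[Attempt],
) -> Optional[Tuple[int, int]]:
    """Return (output_tokens, total_tokens) spent up to and including the
    first successful attempt, or None if no attempt succeeded.

    Assumption: total = input + output tokens.
    """
    output = total = 0
    for attempt in attempts:
        output += attempt.output_tokens
        total += attempt.input_tokens + attempt.output_tokens
        if attempt.succeeded:
            return output, total
    return None


attempts = [
    Attempt(input_tokens=1000, output_tokens=200, succeeded=False),
    Attempt(input_tokens=1500, output_tokens=300, succeeded=True),
    Attempt(input_tokens=9999, output_tokens=9999, succeeded=False),  # not counted
]
spent = tokens_to_first_success(attempts)
```

Here `spent` is `(500, 3000)`: the third attempt comes after the first success and is not counted.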

[Figure: per-task comparison with and without ASPIRE skills. Panels: output tokens (M), total tokens (M), and held-out success rate (%).]

Limitations

  • Not yet a fully autonomous real-world learner. Real-world deployment still needs robust success detection, safe resets, safety monitoring, and calibration.
  • Dependence on a frozen frontier LLM. ASPIRE relies on a frontier model (Claude Opus 4.6); we have not verified that smaller or weaker LLMs can sustain the debugging loop.
  • Bounded by a predefined primitive API. The fixed set of perception, planning, and control primitives keeps debugging safe but limits expressible behaviors.
  • Incomplete long-term memory management. The skill library currently prioritizes validated reusable repairs but does not fully solve long-term memory management. As the library grows, some entries may become stale, overly specific, redundant, or misleading for a new task.
  • Compute-intensive search loop. The debug and evolutionary-search loop costs many LLM calls and rollouts per task, so scaling up will require cheaper inference or more sample-efficient search.

Conclusion

We present ASPIRE, a self-improving continual learning robotic system that autonomously writes and refines robot control programs while compounding experience into a reusable skill library. ASPIRE operates in an open-ended learning loop with three components: a closed-loop robot execution engine that exposes fine-grained multimodal traces, a continually expanding skill library that distills validated fixes into transferable knowledge, and an evolutionary search procedure that explores diverse task sequences and control programs. Across diverse benchmarks, ASPIRE substantially outperforms existing VLA and coding-agent baselines and demonstrates strong zero-shot transfer to unseen long-horizon tasks. It also provides initial evidence that skills discovered in simulation transfer across embodiments, significantly reducing the token cost of real-robot programming despite differing robot APIs.

Team

Runyu Lu*†
Yubo Wu*
Ethan Kou*
Max Fu
Wenli Xiao
Ajay Mandlekar
Yinzhen Xu
Guanya Shi
Ken Goldberg
Ang Chen
Mosharaf Chowdhury
Yuke Zhu
Linxi “Jim” Fan
Guanzhi Wang

*Equal contribution · †Project leads

BibTeX

@article{lu2026aspire,
  title   = {ASPIRE: Agentic /Skills Discovery for Robotics},
  author  = {Runyu Lu and Yubo Wu and Ethan Kou and Max Fu and Wenli Xiao and
             Ajay Mandlekar and Yinzhen Xu and Guanya Shi and Ken Goldberg and
             Ang Chen and Mosharaf Chowdhury and Yuke Zhu and Linxi Fan and Guanzhi Wang},
  year    = {2026},
  journal = {arXiv preprint}
}

Acknowledgments

We thank Nadun Ranawaka, Jimmy Wu, Matin Furutan, Haotian Lin, Abhi Maddukuri, Yulu Gan, Matin Nikoui, and Yuqi Xie for their help with open-source support, real robot infrastructure, advice and guidance on the paper, website release, and experimental equipment. We are also grateful to the members of NVIDIA GEAR, UMich SymbioticLab, UsesysLab, UC Berkeley AUTOLab, and CMU LeCAR Lab for their kind support.