Train Hard, Fight Easy: Robust Meta Reinforcement Learning

Publication
Advances in Neural Information Processing Systems, 2023.

We introduce RoML, a meta-algorithm that takes any baseline meta-learning algorithm and generates a robust version of it.

Teaser Figure
A test task corresponding to high body mass, which is typically more difficult to control. RoML (right) learns to handle the high mass by leaning forward and letting gravity do the hard work, leading to higher velocities than the baseline VariBAD (left).

Abstract

A major challenge of reinforcement learning (RL) in real-world applications is the variation between environments, tasks or clients. Meta-RL (MRL) addresses this issue by learning a meta-policy that adapts to new tasks. Standard MRL methods optimize the average return over tasks, but often suffer from poor results in tasks of high risk or difficulty. This limits system reliability whenever test tasks are not known in advance. In this work, we propose a robust MRL objective with a controlled robustness level. Optimization of analogous robust objectives in RL often leads to both biased gradients and data inefficiency. We prove that the former disappears in MRL, and address the latter via the novel Robust Meta RL algorithm (RoML). RoML is a meta-algorithm that generates a robust version of any given MRL algorithm, by identifying and over-sampling harder tasks throughout training. We demonstrate that RoML learns substantially different meta-policies and achieves robust returns on several navigation and continuous control benchmarks.
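
To make the core idea concrete, below is a minimal Python sketch of the over-sampling mechanism described above: tasks whose recent returns are low are drawn more often during meta-training, while the underlying meta-learner (e.g., VariBAD) is left unchanged. This is an illustrative sketch only, not the authors' implementation; the class name RobustTaskSampler and the parameters alpha and hard_prob are hypothetical.

    import random
    from collections import defaultdict

    class RobustTaskSampler:
        """Sketch of a robustness-oriented task sampler: over-sample the hardest tasks.

        Illustrative only -- not the official RoML implementation.
        """

        def __init__(self, tasks, alpha=0.2, hard_prob=0.5):
            self.tasks = list(tasks)
            self.alpha = alpha          # fraction of tasks treated as the "hard" tail
            self.hard_prob = hard_prob  # how often we draw from that tail
            # Unseen tasks start at -inf so they rank as hardest and get visited early.
            self.avg_return = defaultdict(lambda: float("-inf"))

        def update(self, task, episode_return):
            # Exponential moving average of each task's return.
            prev = self.avg_return[task]
            if prev == float("-inf"):
                self.avg_return[task] = episode_return
            else:
                self.avg_return[task] = 0.9 * prev + 0.1 * episode_return

        def sample(self):
            # With probability hard_prob, draw from the alpha-fraction of tasks
            # with the lowest average return; otherwise sample uniformly,
            # as a standard (non-robust) MRL baseline would.
            if random.random() < self.hard_prob:
                ranked = sorted(self.tasks, key=lambda t: self.avg_return[t])
                n_hard = max(1, int(self.alpha * len(ranked)))
                return random.choice(ranked[:n_hard])
            return random.choice(self.tasks)

In a meta-training loop, sample() would replace the uniform task draw and update() would be called with each episode's return, so the robustness level is controlled by how aggressively hard tasks are over-sampled.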

Cite the paper

@InProceedings{roml,
  title     = {Train Hard, Fight Easy: Robust Meta Reinforcement Learning},
  author    = {Greenberg, Ido and Mannor, Shie and Chechik, Gal and Meirom, Eli},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2023},
  publisher = {Curran Associates, Inc.},
}

Related