Toronto AI Lab

Meta-Sim
Learning to Generate Synthetic Datasets

Amlan Kar1,2,3
Aayush Prakash1
Ming-Yu Liu1
Eric Cameracci1
Justin Yuan1

Matt Rusiniak1
David Acuna1,2,3
Antonio Torralba4
Sanja Fidler1,2,3

1NVIDIA
2University of Toronto
3Vector Institute
ICCV 2019 (Oral)

Training models to high-end performance requires the availability of large labeled datasets, which are expensive to obtain. The goal of our work is to automatically synthesize labeled datasets that are relevant for a downstream task. We propose Meta-Sim, which learns a generative model of synthetic scenes and obtains images, as well as their corresponding ground truth, via a graphics engine. We parametrize our dataset generator with a neural network, which learns to modify attributes of scene graphs obtained from probabilistic scene grammars so as to minimize the distribution gap between its rendered outputs and target data. If the real dataset comes with a small labeled validation set, we additionally aim to optimize a meta-objective, i.e., downstream task performance. Experiments show that the proposed method can greatly improve content generation quality over a human-engineered probabilistic scene grammar, both qualitatively and quantitatively, as measured by performance on a downstream task.
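To give a rough feel for the distribution-matching idea described above, the following is a deliberately minimal toy sketch, not the paper's method: the actual Meta-Sim transforms scene-graph attributes with a learned network and matches distributions of rendered image features (with a task-performance meta-objective on top). Here the "scene graph" is collapsed to a 2-D attribute vector, the generator is a learned shift `theta`, and the distribution gap is squared MMD with a linear kernel (i.e., mean matching). All function names and constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_prior(n):
    # Attributes sampled from a hand-crafted probabilistic grammar (prior).
    return rng.normal(loc=0.0, scale=1.0, size=(n, 2))

def sample_target(n):
    # Attributes as they occur in the target data (unknown to the model);
    # the true shift between prior and target is [2, 2] in this toy setup.
    return rng.normal(loc=2.0, scale=1.0, size=(n, 2))

def transform(attrs, theta):
    # Generator: modifies grammar-sampled attributes; here just a learned shift.
    return attrs + theta

def mmd_linear(x, y):
    # Squared MMD with a linear kernel = squared distance between sample means.
    d = x.mean(axis=0) - y.mean(axis=0)
    return float(d @ d)

theta = np.zeros(2)
lr = 0.5
for step in range(200):
    prior = sample_prior(256)
    target = sample_target(256)
    # Analytic gradient of the mean-matching loss w.r.t. the shift theta.
    grad = 2.0 * (transform(prior, theta).mean(axis=0) - target.mean(axis=0))
    theta -= lr * grad

print(theta)  # should approach the true shift [2, 2]
```

In the real system the rendered outputs are images, so the kernel is computed on image features and the non-differentiable renderer and task metric require gradient estimators; none of that machinery is captured by this shift-only sketch.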



Amlan Kar, Aayush Prakash, Ming-Yu Liu, Eric Cameracci,
Justin Yuan, Matt Rusiniak, David Acuna, Antonio Torralba,
Sanja Fidler

Meta-Sim: Learning to Generate Synthetic Datasets

ICCV, 2019 (Oral)


Training Stages


(Left) Samples from our probabilistic grammar; (middle) Meta-Sim's corresponding samples; (right) random samples from the KITTI dataset

Visualization of Meta-Sim's outputs over the course of training

(Left) Detection results after training on our probabilistic grammar vs. (right) detection results after training with Meta-Sim

This webpage template was borrowed from Richard Zhang.