 # BigDatasetGAN: Synthesizing ImageNet with Pixel-wise Annotations


 Annotating images with pixel-wise labels is a time-consuming and costly process. Recently, DatasetGAN showcased a promising alternative: synthesizing a large labeled dataset with a generative adversarial network (GAN) by exploiting a small set of manually labeled, GAN-generated images. Here, we scale DatasetGAN to the class diversity of ImageNet. We take image samples from the class-conditional generative model BigGAN trained on ImageNet, and manually annotate 5 images per class, for all 1k classes. By training an effective feature segmentation architecture on top of BigGAN, we turn BigGAN into a labeled dataset generator. We further show that VQGAN can similarly serve as a dataset generator, leveraging the already annotated data. We create a new ImageNet benchmark by labeling an additional set of 8k real images and evaluate segmentation performance in a variety of settings. Through an extensive ablation study, we show large gains from leveraging a large generated dataset to train different supervised and self-supervised backbone models on pixel-wise tasks. Furthermore, we demonstrate that using our synthesized datasets for pre-training leads to improvements over standard ImageNet pre-training on several downstream datasets, such as PASCAL-VOC, MS-COCO, Cityscapes and chest X-ray, and across tasks (detection, segmentation). We will make our benchmark public and maintain a leaderboard for this challenging task. Project Page: <https://research.nvidia.com/labs/toronto-ai/big-datasetgan/>
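
To make the idea of a segmentation head "on top of" a generator more concrete, the following is a minimal pure-Python sketch. It assumes the generator exposes intermediate feature maps at several resolutions, upsamples them to a common size, concatenates them per pixel, and classifies each pixel. The nearest-neighbour upsampling and the linear per-pixel classifier are deliberately simplified stand-ins, not the feature segmentation architecture trained in the paper, and all names (`upsample_nearest`, `pixel_features`, `segment`) and the toy values are hypothetical.

```python
from typing import List

# A feature map is indexed as [channel][row][column].
FeatureMap = List[List[List[float]]]


def upsample_nearest(fmap: FeatureMap, out_h: int, out_w: int) -> FeatureMap:
    """Nearest-neighbour upsampling of a [C][H][W] map to [C][out_h][out_w]."""
    in_h, in_w = len(fmap[0]), len(fmap[0][0])
    return [
        [[chan[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
         for y in range(out_h)]
        for chan in fmap
    ]


def pixel_features(fmaps: List[FeatureMap], out_h: int, out_w: int) -> FeatureMap:
    """Upsample every layer's features and concatenate along the channel axis."""
    stacked: FeatureMap = []
    for fmap in fmaps:
        stacked.extend(upsample_nearest(fmap, out_h, out_w))
    return stacked


def segment(fmaps: List[FeatureMap], weights: List[List[float]],
            biases: List[float], out_h: int, out_w: int) -> List[List[int]]:
    """Linear per-pixel classifier: label = argmax_k (w_k . f + b_k)."""
    feats = pixel_features(fmaps, out_h, out_w)
    mask = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            f = [chan[y][x] for chan in feats]  # feature vector of this pixel
            scores = [sum(w * v for w, v in zip(wk, f)) + bk
                      for wk, bk in zip(weights, biases)]
            row.append(max(range(len(scores)), key=scores.__getitem__))
        mask.append(row)
    return mask


# Toy stand-ins for generator features (hypothetical values, not BigGAN output).
coarse = [[[1.0, -1.0]]]                                  # 1 channel, 1x2
fine = [[[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]]     # 1 channel, 2x4
weights = [[0.0, 0.0], [1.0, 0.0]]  # class 0 = background, class 1 = foreground
biases = [0.0, 0.0]

mask = segment([coarse, fine], weights, biases, out_h=2, out_w=4)
print(mask)  # [[1, 1, 0, 0], [1, 1, 0, 0]]
```

In such a setup, the classifier parameters would be fit on the small set of manually annotated generated images; afterwards every newly sampled image comes with a predicted mask at no extra annotation cost.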


 ## Authors


Daiqing Li (NVIDIA)

Huan Ling (NVIDIA, University of Toronto, Vector Institute)

Seung Wook Kim (NVIDIA, University of Toronto, Vector Institute)

[Karsten Kreis](/person/karsten-kreis)

Adela Barriuso

Sanja Fidler (NVIDIA, University of Toronto, Vector Institute)

Antonio Torralba (MIT)

 
 ## Publication Date


Sunday, June 19, 2022

 
 ## Published in


[IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022](https://arxiv.org/abs/2201.04684)

 
 ## Research Area


[Artificial Intelligence and Machine Learning](/research-area/machine-learning-artificial-intelligence)

[Computer Vision](/research-area/computer-vision)

[Generative AI](/research-area/generative-ai)

 
 ## External Links


[Project Website](https://research.nvidia.com/labs/toronto-ai/big-datasetgan/)