We aim to generate fine-grained 3D geometry from large-scale sparse LiDAR scans, which are abundantly captured by autonomous vehicles (AVs). In contrast to prior work on AV scene completion, we aim to extrapolate fine geometry from unlabeled LiDAR scans and beyond their spatial limits, taking a step towards generating realistic, high-resolution, simulation-ready 3D street environments. We propose hierarchical Generative Cellular Automata (hGCA), a spatially scalable conditional 3D generative model that grows geometry recursively with local kernels, following GCAs, in a coarse-to-fine manner, and is equipped with a lightweight planner to induce global consistency. Experiments on synthetic scenes show that hGCA generates plausible scene geometry with higher fidelity and completeness than state-of-the-art baselines. Our model generalizes strongly from simulation to real data, qualitatively outperforming baselines on the Waymo Open Dataset. We also show anecdotal evidence of its ability to create novel objects from real-world geometric cues, even when trained on limited synthetic content.
The proposed model, hGCA, generates realistic large outdoor scenes with high fidelity from accumulated LiDAR scans. Below, we show extrapolation results on synthetic datasets from CARLA (top 3 rows) and Karton City (bottom 3 rows), with models trained on a mixture of the two datasets. Our method completes input scans with high fidelity and extrapolates beyond the field of view better than prior methods. Hover over each image to see a full-size zoom-in.
hGCA generalizes well to real-world LiDAR scans. Below, we demonstrate completion on real-world LiDAR scans from the Waymo Open Dataset, where the model is trained on the synthetic datasets shown above. hGCA generates more complete geometry than the accumulated LiDAR scans, which have a limited height range and suffer from occlusion. Hover over each image to see a full-size zoom-in.
Our model is spatially scalable: it can complete a 100-meter scene on a single 24 GB GPU. hGCA can even extrapolate hills (bottom) from real-world scans. Slide to compare the input (yellow) and the generation (blue).
We visualize the two-stage coarse-to-fine generation process of hGCA.
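For readers who want intuition for the idea of growing geometry recursively with local kernels in a coarse-to-fine manner, the following is a minimal toy sketch in Python. It is not the hGCA implementation: the function `toy_occupancy_prob` is a hand-written, hypothetical stand-in for the learned local kernel, the names `gca_step`, `upsample`, and `coarse_to_fine` are illustrative, the 2x upsampling factor and step counts are assumptions, and the lightweight planner is omitted entirely. The sketch only shows the general pattern: occupied voxels are stored sparsely, each step samples new occupancies only in the neighborhood of currently occupied cells, and a coarse result seeds a finer grid.

```python
import random
from itertools import product


def neighborhood(cell, radius=1):
    """Return all integer cells within an L-infinity ball around `cell` (cell included)."""
    x, y, z = cell
    offsets = range(-radius, radius + 1)
    return [(x + dx, y + dy, z + dz) for dx, dy, dz in product(offsets, repeat=3)]


def toy_occupancy_prob(cell, state):
    """Hypothetical stand-in for a learned local kernel (assumption, not from the paper).

    The probability grows with the number of occupied face neighbors.
    """
    x, y, z = cell
    faces = [(x + 1, y, z), (x - 1, y, z), (x, y + 1, z),
             (x, y - 1, z), (x, y, z + 1), (x, y, z - 1)]
    support = sum(n in state for n in faces)
    return min(1.0, 0.3 * support)


def gca_step(state, rng, radius=1):
    """One local growth step: sample occupancy only near currently occupied cells."""
    candidates = set()
    for cell in state:
        candidates.update(neighborhood(cell, radius))
    # Sorted iteration keeps the sampling order, and thus the result, reproducible.
    return {c for c in sorted(candidates)
            if rng.random() < toy_occupancy_prob(c, state)}


def upsample(state, factor=2):
    """Split every coarse voxel into factor^3 fine voxels."""
    return {(x * factor + dx, y * factor + dy, z * factor + dz)
            for (x, y, z) in state
            for dx, dy, dz in product(range(factor), repeat=3)}


def coarse_to_fine(seed, coarse_steps=3, fine_steps=2, rng_seed=0):
    """Grow a coarse shape from `seed`, then refine it on a 2x finer grid."""
    rng = random.Random(rng_seed)
    seed = set(seed)
    coarse = set(seed)
    for _ in range(coarse_steps):
        # Re-inserting the seed keeps the observed input in every coarse state.
        coarse = gca_step(coarse, rng) | seed
    fine = upsample(coarse)
    for _ in range(fine_steps):
        fine = gca_step(fine, rng)
    return coarse, fine


if __name__ == "__main__":
    # A flat 4x4 patch stands in for a sparse "scan" of the ground.
    scan = {(x, y, 0) for x in range(4) for y in range(4)}
    coarse, fine = coarse_to_fine(scan)
    print(len(scan), len(coarse), len(fine))
```

Because only cells near existing geometry are ever visited, the cost of a step in this toy scales with the number of occupied voxels rather than with the volume of the scene, which is the property that makes this family of methods attractive for large outdoor scenes.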
Please also check out Generative Cellular Automata (GCAs), which we use for our scalable scene generation, and X-Cube, a concurrent work on 3D scene-level generative modeling.
@InProceedings{Zhang_2024_CVPR,
author={Zhang, Dongsu and Williams, Francis and Gojcic, Zan and Kreis, Karsten and
Fidler, Sanja and Kim, Young Min and Kar, Amlan},
title={Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern
Recognition (CVPR)},
month={June},
year={2024},
pages={20145-20154}
}
This work was supported in part by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) [NO.2021-0-01343, Artificial Intelligence Graduate School Program (Seoul National University)] and the Creative-Pioneering Researchers Program through Seoul National University.