Complete Anything in Lidar (CAL): Given a sparse Lidar point cloud, CAL localizes, reconstructs, and, optionally, recognizes objects in a zero-shot fashion. By providing a semantic class vocabulary of specific object classes at test time, CAL can be prompted to perform Semantic Scene Completion (SSC), Panoptic Scene Completion (PSC), or (amodal) 3D Object Detection.
Please note that CAL only takes a single Lidar scan as input; RGB images are shown for visualization purposes only.
Abstract
We propose CAL (Complete Anything in Lidar) for Lidar-based shape-completion in-the-wild. This is closely related to Lidar-based semantic/panoptic scene completion. However, contemporary methods can only complete and recognize objects from a closed vocabulary labeled in existing Lidar datasets. Different to that, our zero-shot approach leverages the temporal context from multi-modal sensor sequences to mine object shapes and semantic features of observed objects. These are then distilled into a Lidar-only instance-level completion and recognition model. Although we only mine partial shape completions, we find that our distilled model learns to infer full object shapes from multiple such partial observations across the dataset. We show that our model can be prompted on standard benchmarks for Semantic and Panoptic Scene Completion, localize objects as (amodal) 3D bounding boxes, and recognize objects beyond fixed class vocabularies.