LoRA3D: Low-Rank Self-Calibration of 3D Geometric Foundation Models

Abstract

Emerging 3D geometric foundation models, such as DUSt3R, offer a promising approach for in-the-wild 3D vision tasks. However, due to the high-dimensional nature of the problem space and the scarcity of high-quality 3D data, these pre-trained models still struggle to generalize to many challenging circumstances, such as limited view overlap or low lighting. In this work, we propose LoRA3D, an efficient self-calibration pipeline that specializes pre-trained models to target scenes using their own multi-view predictions. Taking sparse RGB images as input, we leverage robust optimization techniques to refine multi-view predictions and align them into a global coordinate frame. Our method incorporates the prediction confidence into the geometric optimization process, automatically re-weighting the confidence to better reflect point estimation accuracy. We use the calibrated confidence to generate high-quality pseudo-labels for the calibrating views, and then fine-tune the models with low-rank adaptation (LoRA) on the pseudo-labeled data, without requiring any external priors or manual labels. Our self-calibration process completes on a single standard GPU within just 5 minutes, and each low-rank adapter requires only 18MB of storage. We evaluate our method on more than 160 scenes from the Replica, TUM and Waymo Open datasets, achieving up to 88% performance improvement on 3D reconstruction, multi-view pose estimation and novel-view rendering. The code and data are available at our project page.
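To make the low-rank adaptation step concrete, the following is a minimal, self-contained sketch of a LoRA-style update to a single frozen weight matrix. It is not the paper's implementation; the dimensions, rank, and `alpha` scaling are illustrative assumptions, and NumPy stands in for a deep-learning framework. The key points it demonstrates are that the adapter adds only r*(d_in + d_out) trainable parameters per layer (hence the small per-adapter storage reported in the abstract) and that zero-initializing the up-projection makes the adapter a no-op before fine-tuning begins.

```python
import numpy as np

# Hypothetical sketch of low-rank adaptation (LoRA): a frozen weight W is
# augmented with a trainable low-rank update B @ A. Only A and B would be
# updated during fine-tuning on pseudo-labeled data; W stays frozen.

rng = np.random.default_rng(0)
d_in, d_out, r = 1024, 1024, 16            # r << d_in is the low-rank bottleneck
alpha = 32.0                               # scaling factor (common LoRA convention)

W = rng.standard_normal((d_out, d_in))     # frozen pre-trained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection, zero-initialized

def adapted_forward(x):
    # Base path plus scaled low-rank update: W x + (alpha / r) * B (A x)
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B = 0 the adapter contributes nothing, so the adapted model
# reproduces the base model exactly at the start of fine-tuning.
assert np.allclose(adapted_forward(x), W @ x)

# Adapter storage vs. full weight storage for this layer
full_params = W.size                       # 1024 * 1024 = 1,048,576
lora_params = A.size + B.size              # 2 * 16 * 1024 = 32,768
print(f"adapter params: {lora_params} ({100 * lora_params / full_params:.1f}% of full)")
```

Because only `A` and `B` need to be saved per adapted layer, a full set of adapters stays small, which is consistent with the lightweight per-scene adapters described above.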

Publication
ICLR 2025
