CAPA: Depth Completion as Parameter-Efficient Test-Time Adaptation

TL;DR: CAPA is a framework for depth completion that adapts pre-trained depth models at test time. Given sparse geometric cues, it freezes the model backbone and uses parameter-efficient fine-tuning (such as LoRA and VPT) to adapt to each individual sample (an image or a video). It works with any ViT-based model and achieves state-of-the-art depth accuracy and temporal consistency.
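To make the idea concrete, here is a minimal sketch of LoRA-style test-time adaptation on a toy model. It is not CAPA's implementation: a frozen two-layer linear map stands in for a ViT backbone, the mean squared error on the sparse points is an assumed loss, and every name (adapt_lora, predict_depth, W1, w2, the rank and learning rate) is hypothetical. It only illustrates the pattern described above: the pre-trained weights stay fixed while a small low-rank update is fitted to the sparse depth values of a single sample.

```python
import numpy as np

def predict_depth(X, W1, w2, A, B):
    """Depth from per-pixel features X using frozen W1, w2 plus the LoRA update B @ A."""
    return X @ (W1 + B @ A).T @ w2

def adapt_lora(X, sparse_idx, sparse_depth, W1, w2, rank=2, lr=0.05, steps=300, seed=0):
    """Fit low-rank factors (A, B) to the sparse depth of one sample; W1 and w2 are never updated."""
    rng = np.random.default_rng(seed)
    hidden, d_in = W1.shape
    A = rng.normal(size=(rank, d_in)) / np.sqrt(d_in)  # trainable down-projection
    B = np.zeros((hidden, rank))                       # trainable up-projection, zero init
    Xs = X[sparse_idx]                                 # only pixels with sparse depth supervise
    losses = []
    for _ in range(steps):
        resid = predict_depth(Xs, W1, w2, A, B) - sparse_depth
        losses.append(float(np.mean(resid ** 2)))
        g_pred = 2.0 * resid / resid.size              # dL/dpred
        g_W = np.outer(w2, g_pred @ Xs)                # dL/d(W1 + B @ A), shape (hidden, d_in)
        g_B, g_A = g_W @ A.T, B.T @ g_W                # chain rule through the product B @ A
        B -= lr * g_B
        A -= lr * g_A
    resid = predict_depth(Xs, W1, w2, A, B) - sparse_depth
    losses.append(float(np.mean(resid ** 2)))          # loss after the final update
    return A, B, losses

# Synthetic sample: the "true" depth differs slightly from what the frozen model predicts.
rng = np.random.default_rng(0)
n_pixels, d_in, hidden = 256, 8, 8
X = rng.normal(size=(n_pixels, d_in))                  # stand-in for per-pixel features
W1 = rng.normal(size=(hidden, d_in)) / np.sqrt(d_in)   # frozen "backbone" weights
w2 = rng.normal(size=hidden) / np.sqrt(hidden)         # frozen "head" weights
W1_before = W1.copy()
gt_depth = X @ (W1.T @ w2 + 0.2 * rng.normal(size=d_in))
sparse_idx = rng.choice(n_pixels, size=32, replace=False)  # sparse geometric cues

A, B, losses = adapt_lora(X, sparse_idx, gt_depth[sparse_idx], W1, w2)
print("sparse-point loss before and after adaptation:", losses[0], losses[-1])
```

In this toy setup only the rank x (d_in + hidden) entries of A and B are trained, and because B starts at zero the adapted model initially reproduces the frozen model's prediction exactly. A real setup would attach such updates to layers of the ViT and use an automatic-differentiation framework instead of hand-written gradients.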

CAPA Method Overview

Comparison with Baseline Methods

[Video comparison. Panels: RGB + Condition, GT Depth, GT Points, Baseline Depth, Baseline Points, Ours Depth, Ours Points]

Optimization Process


Applied to MoGe-2

CAPA improves both accuracy and (temporal) consistency beyond the base model.


Quantitative Results


Citation

TBD