Learning and Perception Research Group
ZeroMSF

Zero-shot Monocular Scene Flow Estimation in the Wild

1 NVIDIA Research
2 Brown University
* Indicates Equal Contribution

Abstract


Large models have shown generalization across datasets for many low-level vision tasks, such as depth estimation, but no such general models exist for scene flow. Even though scene flow has wide potential use, it is not used in practice because current predictive models do not generalize well. We identify three key challenges and propose solutions for each. First, we create a method that jointly estimates geometry and motion for accurate prediction. Second, we alleviate scene flow data scarcity with a data recipe that affords us 1M annotated training samples across diverse synthetic scenes. Third, we evaluate different parameterizations for scene flow prediction and adopt a natural and effective parameterization. Our resulting model outperforms existing methods as well as baselines built on large-scale models in terms of 3D end-point error, and shows zero-shot generalization to casually captured videos from DAVIS and robotic manipulation scenes from RoboTAP. Overall, our approach makes scene flow prediction practical in the wild.
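The abstract reports accuracy as 3D end-point error. As a rough illustration only, the sketch below computes that metric under its common definition: the mean Euclidean distance between predicted and ground-truth 3D flow vectors. The function name `end_point_error_3d`, the list-of-tuples input format, and the toy values are assumptions made for this example; they are not taken from the paper's code, and the paper's evaluation protocol (masking, units, alignment) may differ.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def end_point_error_3d(pred_flow: Sequence[Vec3], gt_flow: Sequence[Vec3]) -> float:
    """Mean Euclidean distance between predicted and ground-truth 3D flow vectors.

    Hypothetical helper for illustration; each entry is one point's (dx, dy, dz).
    """
    if len(pred_flow) != len(gt_flow):
        raise ValueError("pred_flow and gt_flow must have the same length")
    if not pred_flow:
        raise ValueError("at least one flow vector is required")
    # math.dist gives the Euclidean distance between two points of equal dimension.
    total = sum(math.dist(p, g) for p, g in zip(pred_flow, gt_flow))
    return total / len(pred_flow)


# Toy data (made up): the first vector is predicted exactly,
# the second is off by (0, 3, 4), i.e. a distance of 5.
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0)]
epe = end_point_error_3d(pred, gt)
print(f"3D EPE: {epe:.2f}")
```

Lower values are better; a per-point error list, rather than only the mean, can be useful when inspecting where a prediction fails.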

Experimental Results

Results on the casual DAVIS Dataset


Sequences shown: dog, gold-fish, judo, kite-surf


Results on the robotics RoboTAP Dataset


Sequences shown: gear1-basket_front_right_rgb_img, gear_v2_3-basket_front_right_rgb_img-0, gearsblocks1-basket_back_left_rgb_img, rgb2-obs_basket_back_left_pixels


Results on the driving KITTI Dataset


Sequences shown: 000012, 000058, 000136, 000137


Paper


Zero-shot Monocular Scene Flow Estimation in the Wild

Yiqing Liang, Abhishek Badki*, Hang Su*, James Tompkin, Orazio Gallo


Citation



@misc{liang2025zeroshotmonocularsceneflow,
  title={Zero-Shot Monocular Scene Flow Estimation in the Wild},
  author={Yiqing Liang and Abhishek Badki and Hang Su and James Tompkin and Orazio Gallo},
  year={2025},
  eprint={2501.10357},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2501.10357},
}