Results
[1] Perazzi et al., A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[2] Mehl et al., Spring: A High-Resolution High-Detail Dataset and Benchmark for Scene Flow, Optical Flow and Stereo, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Here are some related works that also explore the use of video foundation models for low-level 4D perception:
[1] Carreira et al., Scaling 4D Representations, arXiv, 2024.
[2] Hu et al., DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025.