Compared to baseline methods, GeoMan achieves both high quality and temporal consistency.
Compared to baselines, GeoMan produces more temporally stable, higher-quality depth.
* To ensure consistency, the predicted depth maps in all visualizations are renormalized using sequence-wise min-max scaling within the human mask.
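
The renormalization described in the note above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `renormalize_depth_sequence`, the `eps` guard, and the choice to set background pixels to zero are all assumptions. It computes a single minimum and maximum over every masked pixel in the whole sequence, so the same scale is applied to all frames.

```python
import numpy as np


def renormalize_depth_sequence(depth, mask, eps=1e-8):
    """Min-max scale a depth sequence using statistics from masked pixels only.

    depth: array of shape (T, H, W), one depth map per frame.
    mask:  boolean array of shape (T, H, W), True on human pixels.
    Both the name and the signature are hypothetical, for illustration only.
    """
    depth = np.asarray(depth, dtype=np.float64)
    mask = np.asarray(mask, dtype=bool)
    if depth.shape != mask.shape:
        raise ValueError("depth and mask must have the same shape")
    if not mask.any():
        raise ValueError("mask selects no pixels")

    # One min and one max for the whole sequence (not per frame),
    # taken only over pixels inside the human mask.
    values = depth[mask]
    d_min = values.min()
    d_max = values.max()

    # eps guards against division by zero when the masked depth is constant.
    scaled = (depth - d_min) / max(d_max - d_min, eps)

    # Assumption: background pixels are set to 0 for visualization.
    return np.where(mask, np.clip(scaled, 0.0, 1.0), 0.0)
```

Because the statistics are shared across frames, a pixel at a fixed depth maps to the same value in every frame, which per-frame scaling would not guarantee.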
@misc{kim2025geoman,
title={GeoMan: Temporally Consistent Human Geometry Estimation using Image-to-Video Diffusion},
author={Gwanghyun Kim and Xueting Li and Ye Yuan and Koki Nagano and Tianye Li and Jan Kautz and Se Young Chun and Umar Iqbal},
year={2025},
eprint={2505.23085},
archivePrefix={arXiv},
primaryClass={cs.CV}
}