 
 # FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

  ![](/sites/default/files/styles/wide/public/publications/intro_0.jpg?itok=27O5RJop)

 We present FoundationPose, a unified foundation model for 6D object pose estimation and tracking, supporting both model-based and model-free setups. Our approach can be instantly applied at test time to a novel object without fine-tuning, as long as its CAD model is given or a small number of reference images are captured. We bridge the gap between these two setups with a neural implicit representation that allows for effective novel view synthesis, keeping the downstream pose estimation modules invariant under the same unified framework. Strong generalizability is achieved via large-scale synthetic training, aided by a large language model (LLM), a novel transformer-based architecture, and a contrastive learning formulation. Extensive evaluation on multiple public datasets involving challenging scenarios and objects indicates that our unified approach outperforms existing methods specialized for each task by a large margin. In addition, it achieves results comparable to instance-level methods despite the reduced assumptions.



 ## Authors



[Bowen Wen](/person/bowen-wen)

[Wei Yang](/person/wei-yang)

[Jan Kautz](/person/jan-kautz)

[Stan Birchfield](/person/stan-birchfield)

 

 

 ## Publication Date



Saturday, June 1, 2024

 

 ## Published in



[CVPR 2024](https://cvpr.thecvf.com/Conferences/2024)

 

 ## Research Area



[Applied Perception](/research-area/applied-perception)

[Computer Graphics](/research-area/computer-graphics)

[Computer Vision](/research-area/computer-vision)

[Robotics](/research-area/robotics)

[VR, AR and Display Technology](/research-area/virtual-augmented-reality)

 

 

 ## External Links



[project page](https://nvlabs.github.io/FoundationPose/)

[paper](https://arxiv.org/abs/2312.08344)