 # StructDiffusion: Language-Guided Creation of Physically-Valid Structures using Unseen Objects


Robots operating in human environments must be able to rearrange objects into semantically meaningful configurations, even if these objects are previously unseen. We focus on the problem of building physically-valid structures without step-by-step instructions.

We propose StructDiffusion, which combines a diffusion model and an object-centric transformer to construct structures given partial-view point clouds and high-level language goals, such as *"set the table"* and *"make a line"*.

StructDiffusion improves the success rate of assembling physically-valid structures out of unseen objects by 16% on average over an existing multi-modal transformer model, while allowing a single multi-task model to produce a wider range of structures. We show experiments on held-out objects both in simulation and on real-world rearrangement tasks.



 ## Authors



 Weiyu Liu (Georgia Tech)

Yilun Du (MIT)

[Tucker Hermans](/person/tucker-hermans)

Sonia Chernova (Georgia Tech)

Chris Paxton (Meta AI)

 

 

 ## Publication Date



Saturday, July 1, 2023

 

 ## Published in



[Robotics: Science and Systems (RSS) 2023](https://roboticsconference.org/)

 

 ## Research Area



[Generative AI](/research-area/generative-ai)

[Robotics](/research-area/robotics)

 

 

 ## External Links



[StructDiffusion website](https://structdiffusion.github.io/)

 

 

 ## Uploaded Files



[StructDiffusion.pdf](https://d1qx31qr3h6wln.cloudfront.net/publications/StructDiffusion.pdf) (7.43 MB)