Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models

Yoad Tewel, Rinon Gal, Yuval Atzmon, Gal Chechik, Dvir Samuel, Lior Wolf

Nov 12, 2024 Research Project


Add-it is a training-free approach for inserting objects into images from a simple text prompt. It extends the attention mechanism of a pretrained diffusion model to incorporate information from three sources: the source image, the text prompt, and the generated image itself. It does so with a weighted attention scheme, combined with subject-guided latent blending and a noise structure transfer step.
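To make the idea of attending over three weighted sources concrete, here is a minimal sketch for a single query vector. It is an illustration under assumptions, not the Add-it implementation: the function name `weighted_extended_attention`, the source names, and the choice to apply each weight as a log-bias on the attention logits are all hypothetical, and the way Add-it actually weights the sources may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_extended_attention(query, sources, weights):
    """Attend from one query over the tokens of several sources.

    query:   list of floats (one query vector from the generated image).
    sources: dict mapping a source name to a (keys, values) pair, where
             keys and values are lists of vectors of equal count.
    weights: dict mapping a source name to a positive float. Assumption:
             adding log(weight) to a logit multiplies that token's
             unnormalized attention mass by the weight.
    """
    scale = 1.0 / math.sqrt(len(query))
    logits, values = [], []
    for name, (keys, vals) in sources.items():
        bias = math.log(weights[name])
        for key, val in zip(keys, vals):
            dot = sum(q * k for q, k in zip(query, key))
            logits.append(scale * dot + bias)
            values.append(val)
    probs = softmax(logits)
    dim = len(values[0])
    # Weighted sum of value vectors across all sources.
    return [sum(p * v[i] for p, v in zip(probs, values)) for i in range(dim)]
```

In this sketch, raising the weight of one source shifts attention mass toward its tokens without changing the keys or values themselves, which is one simple way to balance the source image, the prompt, and the generated image against each other.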

Add-it achieves state-of-the-art results on insertion benchmarks for both real and generated images, and introduces the “Additing Affordance Benchmark” to evaluate the plausibility of object placement. The method produces realistic placements while preserving scene structure and fine details.

Diffusion Models, Text-to-Image, Image Editing
