Toronto AI Lab
3DStyleNet: Creating 3D Shapes with Geometric and Texture Style Variations


1 NVIDIA
2 University of Toronto
3 Vector Institute
ICCV 2021 (Oral)

We propose 3DStyleNet, a neural stylization method for 3D textured shapes. Our method creates novel geometric and texture variations of 3D objects by transferring the shape and texture style from one 3D object (target) to another (source).

Abstract


We propose a method to create plausible geometric and texture style variations of 3D objects in the quest to democratize 3D content creation. Given a pair of textured source and target objects, our method predicts a part-aware affine transformation field that naturally warps the source shape to imitate the overall geometric style of the target. In addition, the texture style of the target is transferred to the warped source object with the help of a multi-view differentiable renderer. Our model, 3DStyleNet, is composed of two sub-networks trained in two stages. First, the geometric style network is trained on a large set of untextured 3D shapes. Second, we jointly optimize our geometric style network and a pre-trained image style transfer network with losses defined over both the geometry and the rendering of the result. Given a small set of high-quality textured objects, our method can create many novel stylized shapes, resulting in effortless 3D content creation and style-aware data augmentation. We showcase our approach qualitatively on 3D content stylization, and provide user studies to validate the quality of our results. In addition, our method can serve as a valuable tool to create 3D data augmentations for computer vision tasks. Extensive quantitative analysis shows that 3DStyleNet outperforms alternative data augmentation techniques for the downstream task of single-image 3D reconstruction.
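To make the idea of a part-aware affine transformation field concrete, here is a minimal NumPy sketch of how such a field could warp a mesh once it is known. It is an illustration under assumptions, not the 3DStyleNet implementation: the function name `warp_with_affine_field`, the soft per-vertex part weights, and the linear blending of per-part transforms are all hypothetical choices, and the network that predicts the transforms is not shown.

```python
import numpy as np

def warp_with_affine_field(vertices, weights, linear_parts, translations):
    """Warp vertices with a blend of per-part affine transforms.

    Hypothetical helper (not from the paper):
      vertices:     (V, 3) source vertex positions
      weights:      (V, P) soft part assignments, each row sums to 1
      linear_parts: (P, 3, 3) linear part of each part's affine transform
      translations: (P, 3) translation of each part's affine transform
    """
    # Apply every part's transform to every vertex: shape (P, V, 3).
    per_part = np.einsum("pij,vj->pvi", linear_parts, vertices)
    per_part = per_part + translations[:, None, :]
    # Blend the P candidate positions of each vertex with its part weights.
    return np.einsum("vp,pvi->vi", weights, per_part)

# Toy example with two parts: part 0 is left unchanged, part 1 is
# stretched by a factor of 2 along the x axis.
vertices = np.array([[1.0, 0.0, 0.0],
                     [1.0, 1.0, 0.0],
                     [1.0, 2.0, 0.0]])
weights = np.array([[1.0, 0.0],    # belongs entirely to part 0
                    [0.5, 0.5],    # sits on the boundary between parts
                    [0.0, 1.0]])   # belongs entirely to part 1
linear_parts = np.stack([np.eye(3), np.diag([2.0, 1.0, 1.0])])
translations = np.zeros((2, 3))

warped = warp_with_affine_field(vertices, weights, linear_parts, translations)
```

Because the weights vary smoothly across part boundaries in this sketch, the boundary vertex receives an average of the two transforms rather than an abrupt jump.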

3D Content Creation


Our method vs. a strong baseline that combines NeuralCage + Linear Image Style Transfer. Notice that our method better captures the style in both geometry and texture. See, for example, the 5th column of the animal subset: while the baseline simply enlarges the dog's head, our method jointly stylizes both geometry and texture to achieve the cartoon look of the target object.



Our method supports linear shape interpolation between the source shape and the stylized output. For the geometry, we linearly interpolate the vertex displacements produced by the part-aware affine transformation. For the texture, we linearly interpolate the VGG features and decode them to get the interpolated texture images. The interpolation is fast and can be done in real time.
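The interpolation described here can be sketched in a few lines of NumPy. This is a minimal illustration under assumptions: the function names, the toy arrays, and the interpolation weight `t` are hypothetical, and the decoder that turns interpolated features back into a texture image is omitted.

```python
import numpy as np

def interpolate_geometry(source_vertices, displacements, t):
    """Move each vertex a fraction t of the way along its displacement.

    t = 0 gives the source shape, t = 1 gives the stylized shape.
    """
    return source_vertices + t * displacements

def interpolate_features(source_features, stylized_features, t):
    """Linearly blend two feature arrays of the same shape."""
    return (1.0 - t) * source_features + t * stylized_features

# Toy geometry: three vertices and their stylized counterparts.
source_vertices = np.array([[0.0, 0.0, 0.0],
                            [1.0, 0.0, 0.0],
                            [0.0, 1.0, 0.0]])
stylized_vertices = np.array([[0.0, 0.0, 0.5],
                              [1.5, 0.0, 0.0],
                              [0.0, 2.0, 0.0]])
displacements = stylized_vertices - source_vertices

# Toy stand-ins for feature maps (channels x height x width).
source_features = np.zeros((4, 2, 2))
stylized_features = np.ones((4, 2, 2))

halfway_vertices = interpolate_geometry(source_vertices, displacements, 0.5)
halfway_features = interpolate_features(source_features, stylized_features, 0.5)
```

Both operations are simple element-wise arithmetic on arrays, which is consistent with the interpolation being cheap to evaluate.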

3D Data Augmentation


3DStyleNet can be used as a 3D data augmentation tool for downstream computer vision tasks. Here we show quantitative results on single-image 3D reconstruction, using DISN as the 3D reconstruction method and comparing 3DStyleNet with baseline 3D data augmentation strategies.

Qualitative results on single-image 3D reconstruction using DISN as the 3D reconstruction method and various 3D data augmentation strategies. While none of the results are perfect, some are clearly worse than others. Affine randomization hurts performance. No augmentation produces worse results than the remaining augmentation strategies. Ours produces the most plausible and smooth shapes.

Qualitative results on single-image 3D reconstruction on the SMAL dataset, using DISN as the 3D reconstruction method and 3DStyleNet as the 3D data augmentation strategy. Note that the background is masked out with the segmentation provided in the dataset.

Citation


@inproceedings{yin2021_3DStyleNet,
    title = {3DStyleNet: Creating 3D Shapes with Geometric and Texture Style Variations},
    author = {Kangxue Yin and Jun Gao and Maria Shugrina and Sameh Khamis and Sanja Fidler},
    booktitle = {Proceedings of International Conference on Computer Vision (ICCV)},
    year = {2021}
}

Paper


3DStyleNet: Creating 3D Shapes with Geometric and Texture Style Variations

Kangxue Yin, Jun Gao, Maria Shugrina, Sameh Khamis, Sanja Fidler

Paper camera-ready
arXiv version
Supplement
BibTeX