We introduce DMTet, a deep 3D conditional generative model that can synthesize high-resolution 3D shapes using simple user guides such as coarse voxels or noisy point clouds.
We introduce DMTet, a deep 3D conditional generative model that can synthesize high-resolution 3D shapes using simple user guides such as coarse voxels. It marries the merits of implicit and explicit 3D representations by leveraging a novel hybrid 3D representation. Compared to the current implicit approaches, which are trained to regress the signed distance values, DMTet directly optimizes for the reconstructed surface, which enables us to synthesize finer geometric details with fewer artifacts. Unlike deep 3D generative models that directly generate explicit representations such as meshes, our model can synthesize shapes with arbitrary topology. The core of DMTet includes a deformable tetrahedral grid that encodes a discretized signed distance function and a differentiable marching tetrahedra layer that converts the implicit signed distance representation to the explicit surface mesh representation. This combination allows joint optimization of the surface geometry and topology as well as generation of the hierarchy of subdivisions using reconstruction and adversarial losses defined explicitly on the surface mesh. Our approach significantly outperforms existing work on conditional shape synthesis from coarse voxel inputs, trained on a dataset of complex 3D animal shapes. Project page: https://nv-tlabs.github.io/DMTet/.
DMTet predicts the underlying surface parameterized by an implicit function encoded via a deformable tetrahedral grid. The underlying surface is converted into an explicit mesh with the Marching Tetrahedra (MT) algorithm, which we show is differentiable. Therefore, DMTet can jointly optimize the surface geometry and topology using losses defined explicitly on the surface mesh.
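To illustrate why the MT step is differentiable, here is a minimal sketch (not the authors' implementation) of the key operation on a single tetrahedron: surface vertices are placed on sign-change edges by linearly interpolating the two endpoints with weights given by their signed distance values. Because this expression is smooth in both the SDF values and the grid vertex positions, gradients of a loss on the extracted surface can flow back to both.

```python
import numpy as np

# The six edges of a tetrahedron, as index pairs into its 4 corners.
TET_EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def crossing_points(verts, sdf):
    """verts: (4, 3) corner positions; sdf: (4,) signed distance values.
    Returns the surface points on edges whose endpoints change sign."""
    points = []
    for a, b in TET_EDGES:
        sa, sb = sdf[a], sdf[b]
        if sa * sb < 0:  # sign change: the zero level set crosses this edge
            # Linear interpolation of the zero crossing:
            #   v = (sb * v_a - sa * v_b) / (sb - sa)
            # This is differentiable w.r.t. sa, sb, v_a, and v_b.
            points.append((sb * verts[a] - sa * verts[b]) / (sb - sa))
    return np.array(points)

# Example: a unit tetrahedron with one corner inside the shape (sdf < 0).
verts = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
sdf = np.array([-0.5, 0.5, 0.5, 0.5])
pts = crossing_points(verts, sdf)  # three crossings, one per sign-change edge
```

With symmetric SDF magnitudes as above, the crossings land at the edge midpoints; shifting any SDF value slides the corresponding surface point continuously along its edge, which is what makes end-to-end optimization possible.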
Here we demonstrate this with a 2D example, where the loss is defined as the distance between the extracted surface (shown in red) and the ground-truth point cloud (shown in purple).
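A loss of this kind can be sketched as a symmetric Chamfer distance between points sampled on the extracted surface and the ground-truth point cloud. This is an assumed formulation for the illustration; the full method combines several surface losses.

```python
import numpy as np

def chamfer_distance(pred, gt):
    """Symmetric Chamfer distance.
    pred: (N, d) points sampled on the extracted surface.
    gt:   (M, d) ground-truth point cloud."""
    # Pairwise squared distances, shape (N, M), via broadcasting.
    d2 = np.sum((pred[:, None, :] - gt[None, :, :]) ** 2, axis=-1)
    # Average nearest-neighbour distance in both directions.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

# Toy 2D example: two surface samples vs. two nearby target points.
pred = np.array([[0.0, 0.0], [1.0, 0.0]])
gt = np.array([[0.0, 0.1], [1.0, -0.1]])
loss = chamfer_distance(pred, gt)  # 0.1^2 in each direction -> 0.02
```

Since the sampled surface points depend differentiably on the SDF values and grid deformations (via the MT interpolation), minimizing this loss pulls the extracted surface toward the target points.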
3D Shape Synthesis from Coarse Voxels
DMTet generalizes to human-created low-resolution voxels collected online. Although these human-created shapes (in yellow) differ noticeably from the coarse voxels used in training, e.g., different body-part proportions (larger heads, thinner legs, longer necks), our model faithfully generates high-quality 3D details (in blue) conditioned on each coarse voxel input, an exciting result.
Point Cloud 3D Reconstruction
Citation
@inproceedings{shen2021dmtet,
title = {Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis},
author = {Tianchang Shen and Jun Gao and Kangxue Yin and Ming-Yu Liu and Sanja Fidler},
year = {2021},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS)}
}
Paper
Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis
Tianchang Shen, Jun Gao, Kangxue Yin, Ming-Yu Liu, Sanja Fidler