fVDB: A Deep-Learning Framework for Sparse, Large-Scale, and High-Performance Spatial Intelligence

We present fVDB, a novel GPU-optimized framework for deep learning on large-scale 3D data. fVDB provides a complete set of differentiable primitives for building deep-learning architectures for common tasks in 3D learning, such as convolution, pooling, attention, ray tracing, and meshing. fVDB simultaneously provides a much larger feature set (primitives and operators) than established frameworks with no loss in efficiency: our operators match or exceed the performance of other frameworks with narrower scope. Furthermore, fVDB can process datasets with a much larger footprint and spatial resolution than prior works, while maintaining a competitive memory footprint on small inputs. To achieve this combination of versatility and performance, fVDB relies on a single novel VDB index-grid acceleration structure paired with several key innovations, including GPU-accelerated sparse grid construction, convolution using tensor cores, fast ray-tracing kernels using HDDA, and jagged tensors. Our framework is fully integrated with PyTorch, enabling interoperability with existing pipelines, and we demonstrate its effectiveness on a number of representative tasks such as large-scale point-cloud segmentation, high-resolution 3D generative modeling, unbounded-scale Neural Radiance Fields, and large-scale point-cloud reconstruction.
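The jagged tensors mentioned in the abstract address batches of variable-size 3D inputs (e.g. point clouds with different point counts per scene). A minimal, framework-agnostic sketch of the underlying idea, a single flat buffer plus per-element offsets, is shown below; the class and method names are illustrative assumptions, not fVDB's actual API.

```python
class JaggedArray:
    """Sketch of a jagged batch: variable-length lists stored contiguously."""

    def __init__(self, lists):
        # Flatten all input lists into one buffer and record where each
        # list starts and ends via cumulative offsets.
        self.data = [x for lst in lists for x in lst]
        self.offsets = [0]
        for lst in lists:
            self.offsets.append(self.offsets[-1] + len(lst))

    def __len__(self):
        # Number of lists in the batch, not number of flat elements.
        return len(self.offsets) - 1

    def __getitem__(self, i):
        # Slice the flat buffer to recover the i-th list.
        return self.data[self.offsets[i]:self.offsets[i + 1]]


# A batch of three "point clouds" with 2, 1, and 3 points respectively.
batch = JaggedArray([[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]])
```

Storing the batch contiguously lets kernels process all elements in one launch while the offsets recover per-item boundaries, which is why this layout suits GPU processing of irregular 3D data.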

Authors

Francis Williams (NVIDIA)
Jiahui Huang (NVIDIA)
Jonathan Swartz (NVIDIA)
Gergely Klar (NVIDIA)
Vijay Thakkar (NVIDIA)
Matthew Cong (NVIDIA)
Xuanchi Ren (NVIDIA)
Ruilong Li (NVIDIA)
Clement Fuji-Tsang (NVIDIA)
Sanja Fidler (NVIDIA)
Eftychios Sifakis (NVIDIA)
Ken Museth (NVIDIA)

Publication Date