Reconstruction and intrinsic decomposition of scenes from captured imagery would enable many applications such as relighting and virtual object insertion. Recent NeRF-based methods achieve impressive fidelity of 3D reconstruction, but bake the lighting and shadows into the radiance field, while mesh-based methods that facilitate intrinsic decomposition through differentiable rendering have not yet scaled to the complexity and scale of outdoor scenes. We present a novel inverse rendering framework for large urban scenes capable of jointly reconstructing the scene geometry, spatially varying materials, and HDR lighting from a set of posed RGB images with optional depth. Specifically, we use a neural field to account for the primary rays, and an explicit mesh (reconstructed from the underlying neural field) to model secondary rays that produce higher-order lighting effects such as cast shadows. By faithfully disentangling complex geometry and materials from lighting effects, our method enables photorealistic relighting with specular and shadow effects on several outdoor datasets. Moreover, it supports physics-based scene manipulations such as virtual object insertion with ray-traced shadow casting.
Combined with other NVIDIA technology, FEGR is one component of the Neural Reconstruction Engine announced in the GTC September 2022 keynote.
Motivation.
Neural radiance fields (NeRFs) have recently emerged as a powerful neural reconstruction approach that enables photorealistic novel-view synthesis.
To be compatible with the modern graphics pipeline and support applications such as relighting and object insertion, recent works also build on top of neural fields and explore a full inverse rendering formulation.
However, due to the volumetric nature of neural fields, representing higher-order lighting effects such as ray-traced cast shadows remains an open challenge for them.
In contrast to NeRFs, explicit mesh representations are compatible with the graphics pipeline and can effectively leverage classic graphics techniques such as ray tracing. While these mesh-based methods demonstrate remarkable performance in the single-object setting, they are limited in resolution when scaled up to encompass larger scenes.
Overview. In this work, we combine the advantages of neural fields and explicit mesh representations and propose FEGR, a new hybrid rendering pipeline for inverse rendering of large urban scenes.
Specifically, we first use the neural field to perform volumetric rendering of primary rays into a G-buffer that includes the surface normal, base color, and material parameters for each pixel. We then extract a mesh from the underlying signed distance field and perform the shading pass, in which we compute illumination by integrating over the hemisphere at each shading point using Monte Carlo ray tracing.
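As a rough illustration of the first pass, the sketch below volume-renders primary rays from a neural SDF field into a per-pixel G-buffer. The field interface (query_sdf, sdf_to_density, query_material) is a hypothetical placeholder for exposition and not the released FEGR implementation.

# Sketch of the primary-ray pass: volume-render per-pixel geometry and
# material attributes into a G-buffer. `field` is a hypothetical neural
# field object (not the released FEGR code).
import torch

def render_gbuffer(field, rays_o, rays_d, n_samples=64, near=0.05, far=100.0):
    # Uniformly sample depths along each primary ray (a real implementation
    # would use importance sampling).
    t = torch.linspace(near, far, n_samples, device=rays_o.device)        # (S,)
    pts = rays_o[:, None, :] + t[None, :, None] * rays_d[:, None, :]      # (R, S, 3)

    sdf = field.query_sdf(pts)                                            # (R, S) signed distance
    sigma = field.sdf_to_density(sdf)                                     # e.g. a Laplace CDF of the SDF
    normal, base_color, material = field.query_material(pts)              # per-sample attributes

    # Standard volume-rendering weights from the density.
    delta = t[1:] - t[:-1]
    delta = torch.cat([delta, delta[-1:]])                                # (S,)
    alpha = 1.0 - torch.exp(-sigma * delta[None, :])                      # (R, S)
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10], dim=1),
        dim=1)[:, :-1]
    w = alpha * trans                                                     # (R, S)

    # Composite each attribute along the ray to form the per-pixel G-buffer.
    return {
        "position":   (w[..., None] * pts).sum(-2),
        "normal":     (w[..., None] * normal).sum(-2),
        "base_color": (w[..., None] * base_color).sum(-2),
        "material":   (w[..., None] * material).sum(-2),  # e.g. roughness / metallic
    }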
The two-step hybrid rendering corresponds to the two passes of deferred shading: it alleviates the rendering cost by leveraging mesh-based ray tracing while maximally preserving the high-fidelity rendering of the volumetric neural field. Click on the two videos below to see an animated illustration of the hybrid rendering process.
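The second pass can be sketched in a similar spirit: Monte Carlo integration over the hemisphere at each shading point, with shadow rays traced against the extracted mesh. The example below assumes a Lambertian BRDF and an HDR environment map callable envmap(dirs), and uses trimesh's ray intersector as a stand-in ray tracer; it is a simplified schematic, not the paper's renderer.

# Sketch of the secondary-ray shading pass: Monte Carlo integration over the
# hemisphere at each G-buffer pixel, with visibility (shadow) rays traced
# against the extracted mesh. trimesh is used purely as an example ray tracer.
import numpy as np
import trimesh

def shade_gbuffer(gbuffer, mesh, envmap, n_dirs=128, eps=1e-3):
    pos    = gbuffer["position"]    # (P, 3) world-space shading points
    normal = gbuffer["normal"]      # (P, 3) unit normals
    albedo = gbuffer["base_color"]  # (P, 3)

    intersector = trimesh.ray.ray_triangle.RayMeshIntersector(mesh)

    # Cosine-weighted hemisphere samples (shared across pixels for brevity).
    u1, u2 = np.random.rand(n_dirs), np.random.rand(n_dirs)
    r, phi = np.sqrt(u1), 2.0 * np.pi * u2
    local = np.stack([r * np.cos(phi), r * np.sin(phi), np.sqrt(1.0 - u1)], axis=-1)

    colors = np.zeros_like(albedo)
    for i in range(pos.shape[0]):
        # Build a tangent frame around the normal and rotate samples into it.
        n = normal[i]
        t = np.cross(n, [0.0, 0.0, 1.0] if abs(n[2]) < 0.9 else [1.0, 0.0, 0.0])
        t /= np.linalg.norm(t)
        b = np.cross(n, t)
        dirs = local @ np.stack([t, b, n])                    # (n_dirs, 3)

        # Shadow rays: a hit means the environment light is occluded.
        origins = np.repeat(pos[i][None] + eps * n[None], n_dirs, axis=0)
        occluded = intersector.intersects_any(origins, dirs)  # (n_dirs,) bool

        # Lambertian estimate: the cosine term cancels with the cosine-weighted
        # pdf up to the 1/pi factor, leaving albedo * mean(L_env * visibility).
        radiance = envmap(dirs) * (~occluded)[:, None]        # (n_dirs, 3)
        colors[i] = albedo[i] * radiance.mean(axis=0)
    return colors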
FEGR decomposes the scene into multiple passes of physics-based material properties. Assets reconstructed with FEGR are compatible with modern graphics pipelines: they can be exported to standard 3D file formats (such as glTF and USD) and loaded into graphics engines for further editing.
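As a minimal example of the export path, the snippet below writes a reconstructed, textured mesh to glTF with trimesh; the mesh, UVs, and baked base-color texture are placeholders standing in for FEGR's actual outputs.

# Sketch: export a reconstructed, textured mesh to glTF (.glb) with trimesh.
# `load_reconstruction` and "basecolor.png" are hypothetical placeholders.
import trimesh
from PIL import Image

vertices, faces, uvs = load_reconstruction()   # hypothetical loader

material = trimesh.visual.material.PBRMaterial(
    baseColorTexture=Image.open("basecolor.png"),
    roughnessFactor=0.8,
    metallicFactor=0.0,
)
mesh = trimesh.Trimesh(
    vertices=vertices,
    faces=faces,
    visual=trimesh.visual.TextureVisuals(uv=uvs, material=material),
)
mesh.export("scene.glb")   # glTF binary, loadable in most graphics engines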
FEGR can be used to reconstruct challenging scenes captured for autonomous driving. The reconstructed sequences can be relit under diverse lighting conditions, generating an abundance of training and testing data for perception models.
Reconstructed scenes can also be populated with synthetic or AI-generated objects to which physics can be applied. Photorealistic object insertion into reconstructed scenes provides a way to generate rarely observed but safety-critical scenarios.
@inproceedings{wang2023fegr,
title = {Neural Fields meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes},
author = {Zian Wang and Tianchang Shen and Jun Gao and Shengyu Huang and Jacob Munkberg
and Jon Hasselgren and Zan Gojcic and Wenzheng Chen and Sanja Fidler},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2023}
}
The authors appreciate the support from Janick Martinez Esturo, Evgeny Toropov, and Chen Chen on the data processing pipeline, and the help from Lina Halper, Zhengyi Luo, Kelly Guo, and Gavriel State in creating the edited scenes. We would also like to thank the authors of NeRF-OSR for discussions on dataset and experiment details.