Toronto AI Lab
Neural Fields meet Explicit Geometric Representations (FEGR)

Neural Fields meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes

1 NVIDIA
2 University of Toronto
3 Vector Institute
4 ETH Zurich
CVPR 2023

FEGR enables Novel View Relighting and Virtual Object Insertion for a diverse range of scenes. "Neural Fields meet Explicit Geometric Representations", abbreviated as FEGR, is an approach for reconstructing scene geometry and recovering the intrinsic properties of a scene from posed camera images. Our approach works for both single- and multi-illumination captured data. FEGR enables various downstream applications such as VR and AR, where users may want to control the lighting of the environment and insert desired 3D objects into the scene.

Abstract


Reconstruction and intrinsic decomposition of scenes from captured imagery would enable many applications such as relighting and virtual object insertion. Recent NeRF-based methods achieve impressive fidelity of 3D reconstruction, but bake the lighting and shadows into the radiance field, while mesh-based methods that facilitate intrinsic decomposition through differentiable rendering have not yet scaled to the complexity and scale of outdoor scenes. We present a novel inverse rendering framework for large urban scenes capable of jointly reconstructing the scene geometry, spatially-varying materials, and HDR lighting from a set of posed RGB images with optional depth. Specifically, we use a neural field to account for the primary rays, and use an explicit mesh (reconstructed from the underlying neural field) for modeling secondary rays that produce higher-order lighting effects such as cast shadows. By faithfully disentangling complex geometry and materials from lighting effects, our method enables photorealistic relighting with specular and shadow effects on several outdoor datasets. Moreover, it supports physics-based scene manipulations such as virtual object insertion with ray-traced shadow casting.

Video (2 minutes)


Combined with other NVIDIA technology, FEGR is one component of the Neural Reconstruction Engine announced in the GTC September 2022 keynote.

Hybrid Rendering


Motivation. Neural radiance fields (NeRFs) have recently emerged as a powerful neural reconstruction approach that enables photo-realistic novel-view synthesis. To be compatible with modern graphics pipelines and support applications such as relighting and object insertion, recent works also build on top of neural fields and explore a full inverse rendering formulation. However, due to the volumetric nature of neural fields, it remains an open challenge for them to represent higher-order lighting effects such as cast shadows via ray tracing. In contrast to NeRF, explicit mesh representations are compatible with the graphics pipeline and can effectively leverage classic graphics techniques for ray tracing. While such mesh-based methods demonstrate remarkable performance in single-object settings, they are limited in resolution when scaling up to encompass larger scenes.

Overview. In this work, we combine the advantages of the neural fields and explicit mesh representations and propose FEGR, a new hybrid rendering pipeline for inverse rendering of large urban scenes. Specifically, we first use the neural field to perform volumetric rendering of primary rays into a G-buffer that includes the surface normal, base color, and material parameters for each pixel. We then extract the mesh from the underlying signed distance field, and perform the shading pass in which we compute illumination by integrating over the hemisphere at the shading point using Monte Carlo ray tracing.
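For reference, the shading pass described above evaluates a Monte Carlo estimate of the rendering equation at each shading point. The notation below is generic rather than the paper's exact symbols: f_r is the spatially-varying BRDF, L_i the incident HDR radiance, V the visibility obtained by tracing secondary rays against the extracted mesh, n the surface normal, and p the sampling density over the hemisphere Ω.

L_o(\mathbf{x}, \omega_o) = \int_{\Omega} f_r(\mathbf{x}, \omega_i, \omega_o)\, L_i(\mathbf{x}, \omega_i)\, V(\mathbf{x}, \omega_i)\, (\omega_i \cdot \mathbf{n})\, \mathrm{d}\omega_i
\;\approx\; \frac{1}{N} \sum_{k=1}^{N} \frac{f_r(\mathbf{x}, \omega_k, \omega_o)\, L_i(\mathbf{x}, \omega_k)\, V(\mathbf{x}, \omega_k)\, (\omega_k \cdot \mathbf{n})}{p(\omega_k)}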

The two-step hybrid rendering corresponds to the two passes of deferred shading: it alleviates the rendering cost by leveraging mesh-based ray tracing while maximally preserving the high-fidelity rendering of the volumetric neural field. Click on the two videos below to see an animated illustration of the hybrid rendering process.

Step 1: G-buffer with volume rendering.

Step 2: Shading with mesh proxy.
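As a concrete (and heavily simplified) illustration of the two passes, the Python sketch below mimics the pipeline with NumPy. It is not the authors' implementation: neural_field (returning per-point density, normal, base color, and material), mesh.ray_hit (shadow-ray queries against the extracted mesh), and envmap.eval (HDR sky lookup) are hypothetical stand-ins, and the BRDF is reduced to a Lambertian term for brevity.

# Simplified sketch of FEGR-style hybrid rendering (illustration only, not the authors' code).
import numpy as np

def render_gbuffer(neural_field, ray_o, ray_d, n_samples=64):
    # Pass 1: volume-render primary rays into a per-pixel G-buffer of
    # depth, normal, base color, and material parameters.
    t = np.linspace(0.1, 50.0, n_samples)                          # sample depths along each ray
    pts = ray_o[:, None, :] + t[None, :, None] * ray_d[:, None, :]
    density, normal, base_color, material = neural_field(pts)      # hypothetical field query
    delta = np.diff(t, append=t[-1] + (t[-1] - t[-2]))
    alpha = 1.0 - np.exp(-density * delta[None, :])                 # per-sample opacity
    trans = np.cumprod(np.concatenate(
        [np.ones_like(alpha[:, :1]), 1.0 - alpha[:, :-1]], axis=1), axis=1)
    w = alpha * trans                                               # volume-rendering weights
    return {
        "depth":      (w * t[None, :]).sum(1),
        "normal":     (w[..., None] * normal).sum(1),
        "base_color": (w[..., None] * base_color).sum(1),
        "material":   (w[..., None] * material).sum(1),
    }

def cosine_sample(n, rng):
    # Cosine-weighted hemisphere directions about per-pixel normals n of shape (N, 3).
    u1, u2 = rng.random(len(n)), rng.random(len(n))
    r, phi = np.sqrt(u1), 2.0 * np.pi * u2
    local = np.stack([r * np.cos(phi), r * np.sin(phi), np.sqrt(1.0 - u1)], axis=-1)
    t = np.cross(n, np.where(np.abs(n[:, :1]) < 0.9, [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))
    t /= np.linalg.norm(t, axis=-1, keepdims=True)
    b = np.cross(n, t)
    return local[:, :1] * t + local[:, 1:2] * b + local[:, 2:3] * n

def shade(gbuf, ray_o, ray_d, mesh, envmap, n_rays=32, rng=np.random.default_rng(0)):
    # Pass 2: Monte Carlo shading; the explicit mesh is used only for secondary rays.
    x = ray_o + gbuf["depth"][:, None] * ray_d                      # shading points from G-buffer
    n, albedo = gbuf["normal"], gbuf["base_color"]
    radiance = np.zeros_like(x)
    for _ in range(n_rays):
        wi = cosine_sample(n, rng)                                  # sampled hemisphere direction
        visible = ~mesh.ray_hit(x, wi)                              # hypothetical shadow-ray query
        li = envmap.eval(wi)                                        # hypothetical HDR sky lookup
        # With cosine-weighted sampling, the (cos / pi) pdf cancels the Lambertian 1/pi
        # and the cosine term, leaving albedo * L_i * V per sample.
        radiance += albedo * li * visible[:, None]
    return radiance / n_rays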

Content Digitization


FEGR decomposes the scene into multiple passes of physics-based material properties. Assets reconstructed with FEGR are compatible with modern graphics pipelines and can be exported to standard 3D file formats (such as glTF and USD) and loaded into graphics engines for further editing.
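For example, a mesh extracted from the signed distance field could be written out with an off-the-shelf library such as trimesh; the geometry below is placeholder data and the snippet is a minimal sketch, not FEGR's actual export tooling.

import numpy as np
import trimesh

# Placeholder geometry standing in for a mesh extracted from the reconstructed scene.
vertices = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
faces = np.array([[0, 1, 2]])
vertex_colors = np.array([[200, 180, 160, 255]] * 3, dtype=np.uint8)  # baked base color

mesh = trimesh.Trimesh(vertices=vertices, faces=faces, vertex_colors=vertex_colors)
mesh.export("scene.glb")  # binary glTF; USD export would go through a separate toolchain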

Relighting


FEGR can be used to reconstruct challenging scenes captured for autonomous driving. The reconstructed sequences can be relit under diverse lighting conditions, generating an abundance of training and testing data for perception models.

Virtual Object Insertion


Reconstructed scenes can also be populated with synthetic or AI-generated objects to which physics can be applied. Photorealistic object insertion into the reconstructed scenes provides a way to generate rarely observed but safety-critical scenarios.

Citation


@inproceedings{wang2023fegr,
  title     = {Neural Fields meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes},
  author    = {Zian Wang and Tianchang Shen and Jun Gao and Shengyu Huang and Jacob Munkberg and Jon Hasselgren and Zan Gojcic and Wenzheng Chen and Sanja Fidler},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2023}
}

Paper


Neural Fields meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes

Zian Wang, Tianchang Shen, Jun Gao, Shengyu Huang, Jacob Munkberg, Jon Hasselgren, Zan Gojcic, Wenzheng Chen, Sanja Fidler

Paper
Supp PDF
BibTeX

Acknowledgment


The authors appreciate the support from Janick Martinez Esturo, Evgeny Toropov, and Chen Chen on the data processing pipeline, and the help from Lina Halper, Zhengyi Luo, Kelly Guo, and Gavriel State in creating the edited scenes. We would also like to thank the authors of NeRF-OSR for discussions on dataset and experiment details.