TRON: Tracing Rays to Orchestrate a Neural Renderer for 3D Gaussian Reconstructions

arXiv preprint, 2026

Or Perel^1,2,3 Hassan Abu Alhaija¹ Zian Wang^1,2,3 Jacob Munkberg¹ Matan Atzmon¹ Sanja Fidler^1,2,3 Masha Shugrina¹

¹NVIDIA ²University of Toronto ³Vector Institute

TL;DR TRON is a relightable 3D-Gaussian pipeline that pairs the reconstructed scene with a single-step neural renderer adapted from a pretrained video-diffusion model, achieving photoreal rendering quality while preserving full 3D, material, and lighting control at interactive frame rates.

A reconstructed object harmonized into a relit scene — geometry, materials, and lighting all remain user-controllable.

Method at a glance

PBR + global light on top of 3DGRT

We extend the 3D Gaussian Ray Tracing (3DGRT) pipeline with per-particle PBR material parameters and an illumination model, adding direct control over materials and lighting on top of the original geometry & radiance representation.

Diffusion priors for material decomposition

We leverage diffusion-based priors to obtain an initial material decomposition from the input views, then bake those estimates into the Gaussian scene during reconstruction.

Diffusion priors: gbuffer decomposition of a bedroom

A new role for ray tracing

Rather than chasing photorealism with more samples and longer integration, our Slang ray tracer produces compact, physically-meaningful conditions (deferred PBR + MIS irradiance) that drive a learned renderer, respecting highlights and shadows.

Single-step neural renderer

We adapt a video-diffusion backbone into a single-step architecture to reach interactive frame rates. The key idea: the rendered PBR and irradiance conditions already carry enough physical signal — the neural renderer doesn't need iterative refinement to translate them into a photoreal frame.

Single-step DiT-based neural renderer architecture

Multi-phase training

The neural renderer is trained with a three-stage curriculum — synthetic data for shadow sensitivity, followed by domain adaptation to real scenes, and finally temporal consistency via short frame-stack inputs.

Fast, controllable & photoreal

With the neural renderer enabled, the full pipeline runs interactively while retaining native 3D, material, and lighting control — opening the door to relighting, harmonization, and dynamic-shadow editing as practical, real-time applications.

Applications: relight, harmonize, dynamic shadows, material edit

Method

Relightable 3D reconstruction has long faced a hard trade-off: classical PBR-Gaussian pipelines are fast and 3D-consistent but visually limited, while recent video-diffusion relighters are photoreal but slow, expensive to run, and inconsistent over long time horizons. TRON resolves that trade-off by pairing the two: a 3D Gaussian Ray Tracing based scene with explicit materials and lighting carries the physics and the user-facing controls, and a paired single-step neural renderer closes the gap to photoreal video - fast enough to be interactive, faithful enough to compete with full diffusion models. The key insight enabling this is that the ray-traced PBR and irradiance images, while not photoreal on their own, are physically rich enough to act as conditions the neural renderer can translate into a final frame in a single step. Training that translator demands a careful curriculum, and our multi-phase strategy is what makes it work in practice. The result is a system that is simultaneously fast, controllable, and photoreal - enabling downstream edits (relighting, intensity, materials, dynamic-object harmonization) that neither classical nor diffusion-only baselines can offer together.

TRON architecture. During training, we apply an intrinsic decomposition model on all input views to get G-buffers and then (1) reconstruct a multi-view scene as a set of 2D Gaussians, and (3) extract G-buffer diffusion priors per view, which we lift and bake in 3D. At inference time, given an envmap and camera view, we (1) render a pair of buffers: PBR shaded and irradiance. (2) Then the Neural Renderer maps these pairs of buffers to a high-quality RGB output image.

From reconstruction to photoreal video

Visualizing an NVS trajectory under two different environment maps. Use the buttons to switch between channels — from raw geometry properties (G-buffers) to ray-traced PBR and irradiance conditions to the final Neural Renderer output.

SH RGB (Original) Albedo Normal Metallic Roughness PBR Shading Irradiance Neural Renderer

Input

Baked G-buffer

Intermediate Output

Final Output

Environment A

Environment B

Applications

Because the scene retains explicit Gaussian geometry and PBR materials — and the renderer is conditioned on physical signals — the same reconstruction supports a broad range of edits: relighting under novel HDR envmaps, light-intensity edits, material editing, and harmonizing dynamic objects into the scene with consistent shadows that track their motion.

Dynamic shadows. A reconstructed armadillo falls into the scene — cast shadows track the object's motion.

Harmonized dynamic NVS. The same object remains photoreal across novel views.

Albedo editing. The armadillo's albedo is re-painted to a vivid blue — geometry, shading, and contact shadows are preserved as the renderer propagates the new diffuse colour.

Albedo + metallic editing. Same geometry, deeper blue albedo with a metallic finish — specular highlights emerge as the metallic parameter is increased.

Interactive light-direction editing in our GUI. Showing a user interactively dragging a slider to change the global light azimuth direction in real time.

Comparisons to baselines

Drag the handle to swipe between TRON (left) and a baseline (right). Switch scene and baseline with the two pill rows below.

Scene:

Indoor museum Sculpture park Bedroom interior

Baseline:

UniRelight DR (Cosmos) DR (SVD) GS-IR GI-GS GaussianShader

◀▶

TRON (Ours)

UniRelight

Flicker test — multi-seed consistency

Each video below shows the same camera position rendered with 16 different diffusion seeds, played in sequence. TRON conditions on physically-grounded ray-traced buffers and stays stable across seeds; the video-diffusion baselines flicker between seeds because they have no 3D anchor.

TRON (Ours)

UniRelight

DiffusionRenderer (Cosmos)

DiffusionRenderer (SVD)

Related work

Citation

@article{perel2026tron,
  title   = {{TRON}: Tracing Rays to Orchestrate a Neural Renderer for {3D} Gaussian Reconstructions},
  author  = {Perel, Or and Abu Alhaija, Hassan and Wang, Zian and Munkberg, Jacob and Atzmon, Matan and Fidler, Sanja and Shugrina, Masha},
  journal = {arXiv preprint arXiv:2606.11314},
  year    = {2026}
}

Acknowledgements

We thank Nicolas Moenne-Loccoz, Yuxuan (Alex) Zhang, Miloš Hašan, Zheng Zeng, Jon Hasselgren, Vismay Modi, Sai Bangaru, Zan Gojcic, and Nicholas Sharp for valuable discussions and for their support with the prior work this paper builds upon.