Kilometer-Scale Convection Allowing Model Emulation using Generative Diffusion Modeling

Storm-scale convection-allowing models (CAMs) are an important tool for predicting the evolution of thunderstorms and mesoscale convective systems that result in damaging extreme weather. By explicitly resolving convective dynamics within the atmosphere, they afford meteorologists the nuance needed to provide outlooks on hazards. Deep learning models have thus far not proven skilful at km-scale atmospheric simulation, despite being competitive with state-of-the-art global, medium-range weather forecasting at coarser resolutions.

GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators

Recent advances in large language models (LLMs) have advanced the development of multilingual speech and machine translation through reduced representation errors and the incorporation of external knowledge. However, both translation tasks typically rely on beam search decoding and top-1 hypothesis selection at inference time. These techniques fail to fully exploit the rich information in the diverse N-best hypotheses, making them suboptimal for translation tasks that require a single, high-quality output sequence.
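The gap the abstract describes can be illustrated with a toy example: beam search returns an N-best list, but conventional inference keeps only the top-1 hypothesis and discards the information spread across the rest. The hypotheses and scores below are invented for illustration, not taken from the paper.

```python
# Made-up N-best list of (hypothesis, log-probability) pairs from beam search.
n_best = [
    ("the cat sat on the mat", -1.2),
    ("a cat sat on the mat", -1.3),
    ("the cat sits on the mat", -1.5),
]

# Conventional decoding: keep only the highest-scoring hypothesis.
top1 = max(n_best, key=lambda h: h[1])[0]

# The N-best hypotheses often agree on most tokens; a generative re-ranker
# can exploit this diversity. Token-level agreement is a crude stand-in here.
token_sets = [set(hyp.split()) for hyp, _ in n_best]
shared = set.intersection(*token_sets)

print(top1)            # "the cat sat on the mat"
print(sorted(shared))  # tokens every hypothesis agrees on
```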

Ido Greenberg

Ido Greenberg is a Senior Research Scientist at NVIDIA's AI Research Lab in Tel Aviv.
His research focuses on making the extraordinary achievements of the RL literature more applicable to real-world problems.

Ido completed his PhD in EE at the Technion, his MSc in Applied Math at Tel Aviv University, and his BSc in Math and Physics at the Hebrew University of Jerusalem, as part of the Talpiot program.

Hasan Nazim Genc

Hasan Genc's research focuses on DNN accelerators and agile hardware design methodologies. He has built open-source hardware implementations of DNN accelerators, helped create programming languages and sparsity formats for such accelerators, and built automated tools that help others design, evaluate, and generate accelerators. He has a PhD from the University of California, Berkeley, and a Bachelor’s degree from the University of Texas at Austin.

Marina Neseem

Marina is a Research Scientist working with the Accelerators and VLSI Research Group. Her research focuses on hardware-software co-design and efficient deep learning. This includes designing efficient model architectures, implementing dynamic pruning and adaptive inference techniques, and creating memory and parameter-efficient training methods.

4D-Rotor Gaussian Splatting: Towards Efficient Novel View Synthesis for Dynamic Scenes

We consider the problem of novel-view synthesis (NVS) for dynamic scenes. Recent neural approaches have accomplished exceptional NVS results for static 3D scenes, but extensions to 4D time-varying scenes remain non-trivial. Prior efforts often encode dynamics by learning a canonical space plus implicit or explicit deformation fields, which struggle with challenging scenarios such as sudden movements and with producing high-fidelity renderings.

Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling

We introduce Motion-I2V, a novel framework for consistent and controllable image-to-video generation (I2V). In contrast to previous methods that directly learn the complicated image-to-video mapping, Motion-I2V factorizes I2V into two stages with explicit motion modeling. For the first stage, we propose a diffusion-based motion field predictor, which focuses on deducing the trajectories of the reference image's pixels. For the second stage, we propose motion-augmented temporal attention to enhance the limited 1-D temporal attention in video latent diffusion models.
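The two-stage factorization described above can be sketched in miniature: stage 1 predicts a per-pixel motion (displacement) field from the reference image, and stage 2 animates the reference image along that field. Every name and shape below is invented for illustration; the actual method uses diffusion models for both stages, whereas this stand-in uses a constant shift.

```python
import numpy as np

def predict_motion_field(image, t):
    # Stand-in for the diffusion-based motion field predictor: a uniform
    # rightward displacement that grows with frame index t.
    h, w = image.shape
    flow = np.zeros((h, w, 2), dtype=int)
    flow[..., 1] = t  # shift every pixel t columns to the right
    return flow

def warp(image, flow):
    # Backward warp with nearest-neighbor sampling and border clamping,
    # a crude stand-in for stage 2's motion-guided frame synthesis.
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(ys - flow[..., 0], 0, h - 1)
    src_x = np.clip(xs - flow[..., 1], 0, w - 1)
    return image[src_y, src_x]

ref = np.arange(16, dtype=float).reshape(4, 4)
frame = warp(ref, predict_motion_field(ref, t=1))
print(frame[0])  # first row of the reference, shifted right by one column
```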

Path-space Differentiable Rendering of Implicit Surfaces

Physics-based differentiable rendering is a key ingredient for integrating forward rendering into probabilistic inference and machine learning pipelines. As a state-of-the-art formulation for differentiable rendering, differential path integrals have enabled the development of efficient Monte Carlo estimators for both interior and boundary integrals. Unfortunately, this formulation has been designed mostly for explicit geometries like polygonal meshes.
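Schematically, the differential path integral formulation mentioned above splits the scene-parameter derivative of the rendering path integral into an interior term and a boundary term; the notation below is a generic sketch of that structure, not the paper's exact derivation:

\[
\frac{\partial}{\partial \theta} \int_{\Omega} f(\bar{x}) \,\mathrm{d}\mu(\bar{x})
= \underbrace{\int_{\Omega} \frac{\partial f(\bar{x})}{\partial \theta} \,\mathrm{d}\mu(\bar{x})}_{\text{interior integral}}
+ \underbrace{\int_{\partial \Omega} \Delta f(\bar{x})\, v(\bar{x}) \,\mathrm{d}\dot{\mu}(\bar{x})}_{\text{boundary integral}},
\]

where \(f\) is the measurement contribution over light paths \(\bar{x}\) in path space \(\Omega\), \(\Delta f\) is the discontinuity in \(f\) across the boundary \(\partial\Omega\), and \(v\) captures the scalar change rate of the discontinuity with respect to the scene parameter \(\theta\). Monte Carlo estimators sample each term separately, which is why extending the boundary term to implicit surfaces (which lack an explicit mesh boundary) is the challenge this work addresses.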

Haisor: Human-aware Indoor Scene Optimization via Deep Reinforcement Learning

3D scene synthesis facilitates and benefits many real-world applications. Most scene generators focus on making indoor scenes plausible by learning from training data and leveraging extra constraints such as adjacency and symmetry. Although the generated 3D scenes are mostly plausible, with visually realistic layouts, they can be functionally unsuitable for human users to navigate or to interact with the furniture in. Our key observation is that human activity plays a critical role, and sufficient free space is essential for human-scene interactions.