Publications
Our publications provide insight into some of our leading-edge research.
59 results found
Generative AI
2025
VoiceNoNG: Robust High-Quality Speech Editing Model without Hallucinations
Sung-Feng Huang, Heng-Cheng Kuo, Zhehuai Chen, Xuesong Yang, Pin-Jui Ku, Ante Jukić, Huck Yang, Yu Tsao, Frank Wang, Hung-yi Lee, Szu-Wei Fu
Assessing Learned Models for Phase-only Hologram Compression
Zicong Peng, Yicheng Zhan, Josef Spjut, Kaan Akşit
SIGGRAPH
FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale
Boris Bonev, Thorsten Kurth, Ankur Mahesh, Mauro Bisson, Jean Kossaifi, Karthik Kashinath, Anima Anandkumar, William D. Collins, Mike Pritchard, Alex Keller
Beyond the Buzz: A Pragmatic Take on Inference Disaggregation
Tiyasa Mitra, Ritika Borkar, Nidhi Bhatia, Ramon Matas, Shivam Raj, Dheevatsa Mudigere, Ritchie Zhao, Maximilian Golub, Arpan Dutta, Sailaja Madduri, Dharmesh Jani, Brian Pharris, Bita Darvish Rouhani
Inference-Time Policy Steering through Human Interactions
Yanwei Wang, Lirui Wang, Yilun Du, Balakumar Sundaralingam, Xuning Yang, Yu-Wei Chao, Claudia Pérez D’Arpino, Dieter Fox, Julie Shah
ICRA
Score Distillation Sampling for Audio: Source Separation, Synthesis, and Beyond
Jessie Richter-Powell, Antonio Torralba, Jonathan Lorraine
ICML
Fugatto 1 - Foundational Generative Audio Transformer Opus 1
Rafael Valle, Rohan Badlani, Zhifeng Kong, Sang-gil Lee, Arushi Goel, Sungwon Kim, Joao Felipe Santos, Shuqi Dai, Siddharth Gururani, Aya AlJa'fari, Alex Liu, Kevin Shih, Wei Ping, Huck Yang, Bryan Catanzaro
ICLR
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation
Alexander H. Liu, Sang-gil Lee, Huck Yang, Yuan Gong, Frank Wang, James R. Glass, Rafael Valle
ICLR
Cosmos Transfer 1: World-to-World Transfer with Adaptive Multi-Control for Physical AI
Ming-Yu Liu
Cosmos-Reason 1: From Physical AI Common Sense to Embodied Decisions
Tsung-Yi Lin, Ming-Yu Liu
NVIDIA Isaac GR00T N1: An Open Foundation Model for Humanoid Robots
Yuke Zhu, Linxi "Jim" Fan, NVIDIA GEAR Team
LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models
Zhengyi Wang, Jonathan Lorraine, Yikai Wang, Hang Su, Jun Zhu, Sanja Fidler, Xiaohui Zeng
Multi-student Diffusion Distillation for Better One-step Generators
Yanke Song, Jonathan Lorraine, Weili Nie, Karsten Kreis, James Lucas
ICML
CorrFill: Enhancing Faithfulness in Reference-based Inpainting with Correspondence Guidance in Diffusion Models
Kuan-Hung Liu, Cheng-Kun Yang, Min-Hung Chen, Yu-Lun Liu, Yen-Yu Lin
Energy-Based Diffusion Language Models for Text Generation
Minkai Xu, Tomas Geffner, Karsten Kreis, Weili Nie, Yilun Xu, Jure Leskovec, Stefano Ermon, Arash Vahdat
ICLR
Truncated Consistency Models
Sangyun Lee, Yilun Xu, Tomas Geffner, Giulia Fanti, Karsten Kreis, Arash Vahdat, Weili Nie
ICLR
Proteina: Scaling Flow-based Protein Structure Generative Models
Tomas Geffner, Kieran Didi, Zuobai Zhang, Danny Reidenbach, Zhonglin Cao, Jason Yim, Mario Geiger, Christian Dallago, Emine Kucukbenli, Arash Vahdat, Karsten Kreis
ICLR
ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids
Hannes Stark, Bowen Jing, Tomas Geffner, Jason Yim, Tommi Jaakkola, Arash Vahdat, Karsten Kreis
ICLR
Directed Graph Generation with Heat Kernels
Marc T. Law, Karsten Kreis, Haggai Maron
Cosmos World Foundation Model Platform for Physical AI
Ming-Yu Liu, and many other contributors (listed at https://d1qx31qr3h6wln.cloudfront.net/publications/NVIDIA%20Cosmos_4.pdf)
2023
Point-Cloud Completion with Pretrained Text-to-image Diffusion Models
Yoni Kasten, Ohad Rahamim, Gal Chechik
NeurIPS
SceneScape: Text-Driven Consistent Scene Generation
Rafail Fridman, Amit Abecasis, Yoni Kasten, Tali Dekel
NeurIPS
HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models
Chen Chen, YuChen Hu, Huck Yang, Sabato Marco Siniscalchi, Pin-Yu Chen, Ensiong Chng
NeurIPS
Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models
Moab Arar, Rinon Gal, Yuval Atzmon, Gal Chechik, Daniel Cohen-Or, Ariel Shamir, Amit Bermano
SIGGRAPH
XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies
Xuanchi Ren, Jiahui Huang, Xiaohui Zeng, Ken Museth, Sanja Fidler, Francis Williams
CVPR
Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition
Srijith Radhakrishnan, Huck Yang, Sumeer Khan, Rohit Kumar, Narsis Kiani, David Gomez-Cabrero, Jesper Tegnér
ChipNeMo: Domain-Adapted LLMs for Chip Design
Mingjie Liu, Teo Ene, Robert Kirby, Chris Cheng, Nathaniel Pinckney, Rongjian Liang, Jonah Alben, Himyanshu Anand, Sanmitra Banerjee, Ismet Bayraktaroglu, Bonita Bhaskaran, Bryan Catanzaro, Arjun Chaudhuri, Sharon Clay, Bill Dally, Laura Dang, Parikshit Deshpande, Siddhanth Dhodhi, Sameer Halepete, Eric Hill, Jiashang Hu, Sumit Jain, Brucek Khailany, George Kokai, Kishor Kunal, Xiaowei Li, Charley Lind, Hao Liu, Stuart Oberman, Sujeet Omar, Sreedhar Pratty, Jonathan Raman, Ambar Sarkar, Zhengjiang Shao, Hanfei Sun, Pratik P Suthar, Varun Tej, Walker Turner, Kaizhe Xu, Haoxing (Mark) Ren
TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models
Tianshi Cao, Karsten Kreis, Sanja Fidler, Nicholas Sharp, Kangxue Yin
ICCV
Generative Novel View Synthesis with 3D-Aware Diffusion Models
Eric R. Chan, Koki Nagano, Matthew Chan, Alexander W. Bergman, Jeong Joon Park, Axel Levy, Miika Aittala, Shalini De Mello, Tero Karras, Gordon Wetzstein
ICCV
Oral
DreamTeacher: Pretraining Image Backbones with Deep Generative Models
Daiqing Li, Huan Ling, Amlan Kar, David Acuna, Seung Wook Kim, Karsten Kreis, Antonio Torralba, Sanja Fidler
ICCV
ATT3D: Amortized Text-To-3D Object Synthesis
Jonathan Lorraine, Kevin Xie, Xiaohui Zeng, Chen-Hsuan Lin, Towaki Takikawa, Nicholas Sharp, Tsung-Yi Lin, Ming-Yu Liu, Sanja Fidler, James Lucas
ICCV
Syntactic Binding in Diffusion Models: Enhancing Attribute Correspondence through Attention Map Alignment
Royi Rassin, Eran Hirsch, Daniel Glickman, Shauli Ravfogel, Yoav Goldberg, Gal Chechik
NeurIPS
Oral presentation