Artificial Intelligence Computing Leadership from NVIDIA
Publications
Our publications provide insight into some of our leading-edge research.
321 results found for Computer Vision
2025
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Ali Hatamizadeh, Jan Kautz
CVPR
Spatio-Temporal Context Prompting for Zero-Shot Action Detection
Wei-Jhe Huang, Min-Hung Chen, Shang-Hong Lai
Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation
Ci-Siang Lin, Chien-Yi Wang, Frank Wang, Min-Hung Chen
CorrFill: Enhancing Faithfulness in Reference-based Inpainting with Correspondence Guidance in Diffusion Models
Kuan-Hung Liu, Cheng-Kun Yang, Min-Hung Chen, Yu-Lun Liu, Yen-Yu Lin
2024
Fast Encoder-Based 3D from Casual Videos via Point Track Processing
Yoni Kasten, Wuyue Lu, Haggai Maron
NeurIPS
Bayesian Example Selection Improves In-Context Learning for Speech, Text, and Visual Modalities
Siyin Wang, Huck Yang, Ji Wu, Chao Zhang
From Descriptive Richness to Bias: Unveiling the Dark Side of Generative Image Caption Enrichment
Yusuke Hirota, Ryo Hachiuma, Huck Yang, Yuta Nakashima
Proto-CLIP: Vision-Language Prototypical Network for Few-Shot Learning
Jishnu Jaykumar P, Kamalesh Palanisamy, Yu-Wei Chao, Xinya Du, Yu Xiang
IROS
Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models
Alexander Popov, Alperen Degirmenci, David Wehr, Shashank Hegde, Ryan Oldja, Alexey Kamenev, Bertrand Douillard, David Nistér, Urs Muller, Ruchi Bhargava, Stan Birchfield, Nikolai Smolyanskiy
TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models
Gilad Deutch, Rinon Gal, Daniel Garibi, Or Patashnik, Daniel Cohen-Or
SIGGRAPH
DoRA: Weight-Decomposed Low-Rank Adaptation
Shih-Yang Liu, Chien-Yi Wang, Hongxu Danny Yin, Pavlo Molchanov, Frank Wang, Kwang-Ting Cheng, Min-Hung Chen
ICML
RVT-2: Learning Precise Manipulation from Few Examples
Ankit Goyal, Valts Blukis, Jie Xu, Yijie Guo, Yu-Wei Chao, Dieter Fox
RSS
Breathing Life Into Sketches Using Text-to-Video Priors
Rinon Gal, Yael Vinker, Yuval Alaluf, Amit Bermano, Daniel Cohen-Or, Ariel Shamir, Gal Chechik
CVPR
Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models
Huan Ling, Seung Wook Kim, Antonio Torralba, Sanja Fidler, Karsten Kreis
CVPR
FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
Bowen Wen, Wei Yang, Jan Kautz, Stan Birchfield
CVPR
NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows
Zhenggang Tang, Zhongzheng Ren, Xiaoming Zhao, Bowen Wen, Jonathan Tremblay, Stan Birchfield, Alexander Schwing
CVPR
Neural Implicit Representation for Building Digital Twins of Unknown Articulated Objects
Yijia Weng, Bowen Wen, Jonathan Tremblay, Valts Blukis, Dieter Fox, Leo Guibas, Stan Birchfield
CVPR
SynH2R: Synthesizing Hand-Object Motions for Learning Human-to-Robot Handovers
Sammy Christen, Lan Feng, Wei Yang, Yu-Wei Chao, Otmar Hilliges, Jie Song
ICRA
FasterViT: Fast Vision Transformers with Hierarchical Attention
Ali Hatamizadeh, Greg Heinrich, Hongxu Danny Yin, Andrew Tao, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov
ICLR
WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space
Katja Schwarz, Seung Wook Kim, Jun Gao, Sanja Fidler, Andreas Geiger, Karsten Kreis
ICLR
LCM-Lookahead for Encoder-based Text-to-Image Personalization
Rinon Gal, Or Lichter, Elad Richardson, Or Patashnik, Amit H Bermano, Gal Chechik, Daniel Cohen-Or
ECCV
Consolidating Attention Features for Multi-view Image Editing
Or Patashnik, Rinon Gal, Daniel Cohen-Or, Jun-Yan Zhu, Fernando De la Torre
SIGGRAPH
2023
Point-Cloud Completion with Pretrained Text-to-image Diffusion Models
Yoni Kasten, Ohad Rahamim, Gal Chechik
NeurIPS
Generalizable One-shot 3D Neural Head Avatar
Xueting Li, Shalini De Mello, Sifei Liu, Koki Nagano, Umar Iqbal, Jan Kautz
NeurIPS
SceneScape: Text-Driven Consistent Scene Generation
Rafail Fridman, Amit Abecasis, Yoni Kasten, Tali Dekel
NeurIPS
XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies
Xuanchi Ren, Jiahui Huang, Xiaohui Zeng, Ken Museth, Sanja Fidler, Francis Williams
CVPR
Neural LiDAR Fields for Novel View Synthesis
Shengyu Huang, Zan Gojcic, Zian Wang, Francis Williams, Yoni Kasten, Sanja Fidler, Konrad Schindler, Or Litany
ICCV
DreamTeacher: Pretraining Image Backbones with Deep Generative Models
Daiqing Li, Huan Ling, Amlan Kar, David Acuna, Seung Wook Kim, Karsten Kreis, Antonio Torralba, Sanja Fidler
ICCV
2D-3D Interlaced Transformer for Point Cloud Segmentation with Scene-Level Supervision
Cheng-Kun Yang, Min-Hung Chen, Yung-Yu Chuang, Yen-Yu Lin
ICCV
ATT3D: Amortized Text-To-3D Object Synthesis
Jonathan Lorraine, Kevin Xie, Xiaohui Zeng, Chen-Hsuan Lin, Towaki Takikawa, Nicholas Sharp, Tsung-Yi Lin, Ming-Yu Liu, Sanja Fidler, James Lucas
ICCV
HANDAL: A Dataset of Real-World Manipulable Object Categories with Pose Annotations, Affordances, and Reconstructions
Andrew Guo, Bowen Wen, Jianhe Yuan, Jonathan Tremblay, Stephen Tyree, Jeff Smith, Stan Birchfield
IROS
Syntactic Binding in Diffusion Models: Enhancing Attribute Correspondence through Attention Map Alignment
Royi Rassin, Eran Hirsch, Daniel Glickman, Shauli Ravfogel, Yoav Goldberg, Gal Chechik
NeurIPS
Oral presentation