UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation

Pre-training and representation learning play an increasingly important role in modern speech processing. Nevertheless, different applications rely on different foundation models, since predominant pre-training techniques are designed for either discriminative or generative tasks. In this work, we make the first attempt at building a unified pre-training framework for both types of tasks in speech.

Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models

We propose the use of latent space generative world models to address the covariate shift problem in autonomous driving. A world model is a neural network capable of predicting an agent's next state given past states and actions. By leveraging a world model during training, the driving policy effectively mitigates covariate shift without requiring an excessive amount of training data.
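As a minimal sketch of the interface described above, the following shows a world model as a function from a state and an action to a predicted next state, rolled out under a policy. Everything here is an illustrative assumption: the 1-D state layout, the names `toy_world_model` and `rollout`, and the hand-written kinematic update that stands in for the learned latent network. For brevity it conditions only on the most recent state rather than a history of past states.

```python
from typing import Callable, List, Tuple

State = Tuple[float, float]  # hypothetical 1-D state: (position, velocity)
Action = float               # hypothetical action: acceleration command


def toy_world_model(state: State, action: Action, dt: float = 0.1) -> State:
    """Stand-in for a learned network: predict the next state from state and action."""
    position, velocity = state
    return (position + velocity * dt, velocity + action * dt)


def rollout(
    model: Callable[[State, Action], State],
    policy: Callable[[State], Action],
    start: State,
    steps: int,
) -> List[State]:
    """Roll the policy forward inside the model, feeding predictions back in."""
    states = [start]
    for _ in range(steps):
        action = policy(states[-1])               # policy acts on the predicted state
        states.append(model(states[-1], action))  # model predicts what happens next
    return states


if __name__ == "__main__":
    # A constant-acceleration policy, purely for demonstration.
    for s in rollout(toy_world_model, lambda s: 1.0, (0.0, 0.0), steps=3):
        print(s)
```

Because the policy acts on states the model predicts rather than only on recorded expert states, a rollout like this exposes it to the consequences of its own actions, which is the setting in which covariate shift arises.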

MambaVision: A Hybrid Mamba-Transformer Vision Backbone

We propose a novel hybrid Mamba-Transformer backbone, MambaVision, specifically tailored for vision applications. Our core contribution includes redesigning the Mamba formulation to enhance its capability for efficient modeling of visual features. Through a comprehensive ablation study, we demonstrate the feasibility of integrating Vision Transformers (ViT) with Mamba. Our results show that equipping the Mamba architecture with self-attention blocks in the final layers greatly improves its capacity to capture long-range spatial dependencies.
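The layout idea in the last sentence, attention blocks placed at the end of a stack of Mamba blocks, can be sketched as a simple ordering of block types. This is only an illustration: `hybrid_stage_layout`, its parameters, and the string labels are hypothetical, and the actual block counts and stage structure are not given in the abstract.

```python
from typing import List


def hybrid_stage_layout(depth: int, num_attention: int) -> List[str]:
    """Return block-type labels for one stage: Mamba blocks first, attention last.

    Hypothetical helper; the labels are placeholders, not real modules.
    """
    if not 0 <= num_attention <= depth:
        raise ValueError("num_attention must be between 0 and depth")
    return ["mamba"] * (depth - num_attention) + ["attention"] * num_attention


if __name__ == "__main__":
    print(hybrid_stage_layout(depth=8, num_attention=4))
```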

Marco: Configurable Graph-Based Task Solving and Multi-AI Agents Framework for Hardware Design

Hardware design presents numerous challenges stemming from its complexity and advancing technologies. These challenges result in longer turnaround time (TAT) for optimizing performance, power, area, and cost (PPAC) during synthesis, verification, physical design, and reliability loops. Large Language Models (LLMs) have shown remarkable capacity to comprehend and generate natural language at a massive scale, leading to many potential applications and benefits across various domains.

Alán Aspuru-Guzik

Senior Director of Quantum Chemistry.

aspuru@nvidia.com, Toronto, Canada

My research at NVIDIA focuses on the intersection of quantum computing, artificial intelligence, and chemical applications.

-- Near-term quantum algorithm development

-- Algorithms for fast quantum chemistry simulation for quantum computers and classical computers

-- Self-driving laboratories and robotics for chemical automation

-- Chemical generative models for materials design and drug discovery

Experimenting with Artificial Intelligence: Programming Pathfinding Algorithms in C++ with Unreal Engine 5

Recent years have seen a rise in machine learning applications deployed across different technological contexts. This rise has shaped how we experience technology, opened new avenues for innovation, and sparked mass-market interest in artificial intelligence (AI). While academic research pushes the boundaries of AI through novel approaches, games serve as important testbeds for virtual agents. Game AI draws on human psychology, seeking solutions that create fun illusions of intelligence.

Studying Esports Competition: Piloting Methodology for User Studies During Tournaments

Designing experiments is a well-studied, complex task with many conflicting constraints. In experimental user studies, the overarching goal is to gain understanding of how a system affects users by manipulating independent variables (e.g., an interface's configuration) while measuring dependent variables (e.g., the user's performance).

Student-T and Beyond: Practical Tools for Multiple-Scattering BSDFs with General NDFs

We present a practical importance-sampling scheme for the Student-T distribution of visible normals by representing the Student-T NDF as a superposition of Beckmann NDFs. Additionally, we derive a new form of delta tracking to evaluate and sample exact BSDFs with general full-sphere NDFs. These tools permit efficient computation of benchmark BSDF values for the multiple scattering from general (including porous) rough surfaces.
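One way to see why a Student-T NDF can be written as a superposition of Beckmann NDFs is the Gamma-integral identity below. This is a sketch under stated assumptions, not necessarily the construction used in the paper: it assumes the common Student-T parameterisation with roughness \alpha and tail parameter \gamma > 1, writes \theta_m for the angle between the microfacet normal m and the surface normal, and covers only the NDF itself, not the distribution of visible normals.

```latex
% Assumed Student-T NDF (roughness \alpha, tail parameter \gamma > 1)
% and Beckmann NDF with roughness \beta:
\begin{align}
D_{\mathrm{ST}}(m) &=
  \frac{(\gamma-1)^{\gamma}\,\alpha^{2\gamma-2}}
       {\pi\cos^{4}\theta_m\,\bigl((\gamma-1)\alpha^{2}+\tan^{2}\theta_m\bigr)^{\gamma}},
\qquad
D_{\mathrm{B}}(m;\beta) =
  \frac{\exp\!\bigl(-\tan^{2}\theta_m/\beta^{2}\bigr)}{\pi\,\beta^{2}\cos^{4}\theta_m}.
\\[4pt]
% Gamma-function identity, valid for x > 0:
\frac{1}{x^{\gamma}} &= \frac{1}{\Gamma(\gamma)}\int_{0}^{\infty} s^{\gamma-1}e^{-sx}\,\mathrm{d}s .
\\[4pt]
% Setting x = (\gamma-1)\alpha^{2} + \tan^{2}\theta_m and using
% \Gamma(\gamma) = (\gamma-1)\,\Gamma(\gamma-1):
D_{\mathrm{ST}}(m) &= \int_{0}^{\infty} w(s)\,D_{\mathrm{B}}\!\bigl(m;\,s^{-1/2}\bigr)\,\mathrm{d}s,
\qquad
w(s) = \frac{\bigl((\gamma-1)\alpha^{2}\bigr)^{\gamma-1}}{\Gamma(\gamma-1)}\,
       s^{\gamma-2}\,e^{-(\gamma-1)\alpha^{2}s}.
\end{align}
```

Here w(s) is a Gamma density with shape \gamma - 1 and rate (\gamma-1)\alpha^{2}, so it integrates to one. Under these assumptions, drawing s from that density and then sampling a Beckmann NDF with roughness s^{-1/2} yields normals distributed according to the Student-T NDF.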