Optimal Clipping and Magnitude-aware Differentiation for Improved Quantization-aware Training

Data clipping is crucial in reducing noise in quantization operations and improving the achievable accuracy of quantization-aware training (QAT). Current practices rely on heuristics to set clipping threshold scalars and cannot be shown to be optimal. We propose Optimally Clipped Tensors And Vectors (OCTAV), a recursive algorithm to determine MSE-optimal clipping scalars. Derived from the fast Newton-Raphson method, OCTAV finds optimal clipping scalars on the fly, for every tensor, at every iteration of the QAT routine.
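The recursion at the heart of this approach is compact enough to sketch. Below is a minimal NumPy illustration of a Newton-Raphson-style fixed-point update for an MSE-optimal clipping scalar under B-bit uniform quantization; the update rule is reconstructed from the standard quantization-noise-plus-clipping-noise MSE decomposition, and the function name, initialization, and defaults are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def optimal_clip(x, num_bits=4, iters=30, eps=1e-12):
    """Newton-Raphson-style fixed-point estimate of an MSE-optimal clipping
    scalar for B-bit uniform quantization (illustrative sketch).

    The update balances quantization noise inside the clipping range against
    clipping noise outside it; names and defaults are assumptions.
    """
    x = np.abs(np.ravel(np.asarray(x, dtype=np.float64)))
    s = x.mean() + eps                       # illustrative initialization
    coef = 4.0 ** (-num_bits) / 3.0          # quantization-noise weight
    for _ in range(iters):
        outside = x > s
        num = x[outside].sum()                           # sum of |x| beyond the clip
        den = coef * (~outside).sum() + outside.sum()    # weighted element counts
        s_new = num / max(den, eps)
        if abs(s_new - s) <= eps * max(s, 1.0):
            break
        s = s_new
    return s

# Example: a per-tensor clipping scalar for a synthetic weight tensor.
w = np.random.randn(1024, 1024)
print(f"4-bit MSE-optimal clip ~ {optimal_clip(w, num_bits=4):.3f}")
```

In a QAT loop, a routine like this would be run per tensor, per iteration, to refresh the clipping scalar before fake-quantizing weights or activations, matching the "on the fly" usage described in the abstract.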

Jack Snyder

Jack is a research scientist in the networking research group. He finished his Ph.D. in 2022 at Duke University, where his advisor was Alvin R. Lebeck. His dissertation focused on congestion control mechanisms and protocols for lossless networks. At Duke, he won the outstanding teaching award. He received his B.S. in computer science and mathematics from Rhodes College, where he worked with Brian Larkins on parallel programming models. His research interests include HPC networking and hardware/software co-design for distributed systems. At NVIDIA, Jack works on congestion control.

Sana Damani

Sana Damani joined NVIDIA Research in 2022 as a member of the Architecture Research Group. Her primary focus areas include compiler optimizations and hardware-software co-design. Before joining NVIDIA, she earned her PhD from the Georgia Institute of Technology, where her dissertation focused on optimized scheduling and allocation techniques for parallel architectures. She is also a recipient of the 2021 NVIDIA Graduate Fellowship.

Melih Elibol

Melih Elibol is a Senior Research Scientist in Programming Systems and Applications research at NVIDIA. His research aims to make scalable, high-performance programs easier to express with modern programming tools, and to explore how numerical optimization and machine learning can address emerging challenges in this space. He completed his Ph.D. at the University of California, Berkeley and his A.L.B. at Harvard University.

Driving Down Link Energy and Driving Up Link Density in GPU Networks

GPU-accelerated computing systems, which power the AI revolution, rely on increasing amounts of off-chip I/O. To continue scaling, very dense integration of ultra-efficient optical transceivers alongside next-generation processor die will be needed.

Tizian Zeltner

I'm a Senior Research Scientist at NVIDIA interested in appearance modeling, light transport algorithms, and differentiable physically based rendering.

Ravi Ramamoorthi

Ravi Ramamoorthi is the Ronald L. Graham Professor of Computer Science at the University of California, San Diego.

Transform2Act: Learning a Transform-and-Control Policy for Efficient Agent Design

An agent's functionality is largely determined by its design, i.e., its skeletal structure and joint attributes (e.g., length, size, strength). However, finding the optimal agent design for a given function is extremely challenging, since the problem is inherently combinatorial and the design space is prohibitively large. Additionally, evaluating each candidate design is costly, as it requires solving for that design's optimal controller. To tackle these problems, our key idea is to incorporate the design procedure of an agent into its decision-making process.
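To make the key idea concrete, here is a toy sketch of what folding design into decision-making can look like: each episode begins with a design (transform) stage in which the agent's actions edit its own morphology, followed by a control stage in which the same policy drives the resulting body. The environment, the scalar "design" variable, and the random policy are hypothetical simplifications for illustration, not the paper's Transform2Act implementation.

```python
import random

class TwoStageEnv:
    """Toy episode with a design (transform) stage followed by a control stage.

    Hypothetical stand-in: the real environments modify skeletal structure and
    joint attributes; here the 'design' is a single scalar limb length.
    """

    def __init__(self, design_steps=3, control_steps=10):
        self.design_steps = design_steps
        self.control_steps = control_steps

    def reset(self):
        self.t = 0
        self.design = 1.0
        return {"stage": "design", "design": self.design}

    def step(self, action):
        self.t += 1
        if self.t <= self.design_steps:
            # Design stage: the action edits the agent's own morphology.
            self.design = max(0.1, self.design + action)
            return {"stage": "design", "design": self.design}, 0.0, False
        # Control stage: reward depends jointly on the chosen design and action.
        reward = action * self.design - 0.1 * self.design ** 2
        done = self.t >= self.design_steps + self.control_steps
        return {"stage": "control", "design": self.design}, reward, done

def policy(obs):
    """A single policy acting in both stages, conditioned on the stage flag."""
    if obs["stage"] == "design":
        return random.uniform(-0.2, 0.2)   # propose a morphology edit
    return random.uniform(0.0, 1.0)        # propose a control action

env = TwoStageEnv()
obs, done, ret = env.reset(), False, 0.0
while not done:
    obs, reward, done = env.step(policy(obs))
    ret += reward
print(f"final design = {obs['design']:.2f}, episode return = {ret:.2f}")
```

Because design edits and control actions flow through one policy, a single reinforcement-learning run can optimize both jointly instead of evaluating each candidate design with a separately trained controller.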