Research

Michael Isaev

Evaluating and Improving Rendered Visual Experiences: Metrics, Compression, Higher Frame Rates & Recoloring

Rendered imagery is presented to us daily. Special effects in movies, video games, scientific visualizations, and marketing catalogs often rely on images generated through computer graphics. However, the possibilities that rendering offers come with a plethora of challenges. This thesis proposes novel ways of evaluating the visual errors caused when some of those challenges are not completely overcome. The thesis also suggests ways to improve the visual experience observers have when viewing rendered content.

Anqi Li

Anqi Li completed her Ph.D. in Computer Science & Engineering from the University of Washington in 2023. Before that, she received her Master's degree from Carnegie Mellon University and her Bachelor's degree from Zhejiang University. Her research focuses on bringing sample efficiency and formal performance guarantees to robot learning. Specific research topics include offline reinforcement learning, safe reinforcement learning, learning from human demonstrations, and planning & control with guarantees.

Reena Elangovan

It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition

Recent studies have shown that large language models (LLMs) can be successfully used for generative error correction (GER) on top of the automatic speech recognition (ASR) output. Specifically, an LLM is utilized to carry out a direct mapping from the N-best hypotheses list generated by an ASR system to the predicted output transcription. However, despite its effectiveness, GER introduces extra data uncertainty since the LLM is trained without taking into account acoustic information available in the speech signal.

Unity ECC: Unified Memory Protection Against Bit and Chip Errors

DRAM vendors utilize On-Die Error Correction Codes (OD-ECC) to correct random bit errors internally. Meanwhile, system companies utilize Rank-Level ECC (RL-ECC) to protect data against chip errors. Separate protection increases the redundancy ratio to 32.8% in DDR5 and incurs significant performance penalties. This paper proposes a novel RL-ECC, Unity ECC, that can correct both single-chip and double-bit error patterns. Unity ECC corrects double-bit errors using unused syndromes of single-chip correction.
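
One way to arrive at the 32.8% figure is to multiply the storage expansion of the two ECC layers. The sketch below is illustrative only: the abstract does not state the code parameters, so the (136, 128) on-die code and the 32 data bits plus 8 check bits per rank-level codeword are assumptions based on commonly described DDR5 configurations, and the function name `combined_redundancy` is hypothetical.

```python
from fractions import Fraction


def combined_redundancy(od_data_bits, od_check_bits, rl_data_bits, rl_check_bits):
    """Return the total redundancy ratio (extra bits per data bit)
    when on-die ECC and rank-level ECC are stacked independently."""
    # Each layer expands the stored bits by (data + check) / data.
    od_expansion = Fraction(od_data_bits + od_check_bits, od_data_bits)
    rl_expansion = Fraction(rl_data_bits + rl_check_bits, rl_data_bits)
    # Stacked layers multiply; subtract 1 to keep only the overhead.
    return od_expansion * rl_expansion - 1


# Assumed parameters (not given in the abstract):
#   on-die ECC: 128 data bits + 8 check bits
#   rank-level ECC: 32 data bits + 8 check bits
ratio = combined_redundancy(128, 8, 32, 8)
print(f"{float(ratio):.1%}")  # 32.8%
```

Under these assumptions the two layers cost 6.25% and 25% on their own, and stacking them gives 21/64, or about 32.8%, which is consistent with the figure quoted above.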

Estimates of Temporal Edge Detection Filters in Human Vision

Edge detection is an important process in human visual processing. However, as far as we know, few attempts have been made to map the temporal edge detection filters in human vision. To that end, we devised a user study and collected data from which we derived estimates of human temporal edge detection filters based on three different models, including the derivative of the infinite symmetric exponential function and temporal contrast sensitivity function.

Qianli Ma

Qianli Ma is a research scientist at NVIDIA Research. He received his PhD from ETH Zürich and the Max Planck Institute for Intelligent Systems (Tübingen), advised by Professor Michael Black and Professor Siyu Tang. He has also interned at Meta Reality Labs in Pittsburgh. He has been developing new representations and methods for reconstructing, generating, and modeling digital humans. His research interests span generative models, 3D computer vision, and graphics, with a current focus on dynamic 3D content generation.