Modeling and Analysis of Power Supply Noise Tolerance with Fine-grained GALS Adaptive Clocks

Power supply noise can significantly degrade circuit performance in modern high-performance SoCs. Adaptive clocking schemes have been proposed recently that can tolerate power supply noise by adjusting the clock frequency in response to fast-changing voltage variations. In this paper, we model and quantify power supply noise tolerance with a fine-grained globally asynchronous locally synchronous (GALS) design style together with an adaptive clocking scheme.

A Pausible Bisynchronous FIFO for GALS Systems

Many of the challenges of modern SoC design can be mitigated or eliminated with globally asynchronous, locally synchronous (GALS) design techniques. Partitioning a design into many synchronous islands introduces myriad asynchronous boundary crossings which typically incur high latency. We have designed a pausible bisynchronous FIFO that achieves low interface latency with a pausible clocking scheme.

Towards Selecting Robust Hand Gestures for Automotive Interfaces

Driver distraction is a serious threat to automotive safety. The visual-manual interfaces in cars are a source of distraction for drivers. Automotive touch-less hand gesture-based user interfaces can help to reduce driver distraction and enhance safety and comfort. The choice of hand gestures in automotive interfaces is central to their success and widespread adoption. In this work we evaluate the recognition accuracy of 25 different gestures for state-of-the-art computer vision-based gesture recognition algorithms and for human observers.

Filtering Distributions of Normals for Shading Antialiasing

High-frequency illumination effects, such as highly glossy highlights on curved surfaces, are challenging to render in a stable manner. Such features can be much smaller than the area of a pixel and carry a high amount of energy due to high reflectance.

Estimating Local Beckmann Roughness for Complex BSDFs

Many light transport related techniques require an analysis of the blur width of light scattering at a path vertex, for instance a Beckmann roughness.

Single-pass Parallel Prefix Scan with Decoupled Look-back

We describe a work-efficient, communication-avoiding, single-pass method for the parallel computation of prefix scan. When consuming input from memory, our algorithm requires only ~2n data movement: n inputs are read, n outputs are written. Our method embodies a decoupled look-back strategy that performs redundant work to dissociate local computation from the latencies of global prefix propagation. Implemented by the CUB library of parallel primitives for GPU architectures, the performance throughput of our parallel prefix scan approaches that of copy operations.
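The look-back idea can be illustrated with a small sequential sketch in Python. This is not the CUB implementation: the function name `decoupled_lookback_scan`, the `tile_size` parameter, and the status flags `'A'` (tile aggregate available) and `'P'` (inclusive prefix available) are assumptions made for illustration. Each tile first publishes its local aggregate; a tile then walks backwards over its predecessors, summing aggregates until it reaches one whose inclusive prefix is already known. Tiles are resolved in reverse order here only to force long look-backs; on a GPU the tiles would run concurrently.

```python
from itertools import accumulate


def decoupled_lookback_scan(data, tile_size=4):
    """Inclusive prefix sum using a sequential simulation of tile look-back."""
    tiles = [data[i:i + tile_size] for i in range(0, len(data), tile_size)]
    n = len(tiles)

    # Phase 1: every tile publishes its local aggregate (status 'A').
    aggregate = [sum(t) for t in tiles]
    inclusive_prefix = [None] * n
    status = ['A'] * n
    exclusive = [0] * n

    # Tile 0 has no predecessors, so its inclusive prefix is its aggregate.
    if n:
        inclusive_prefix[0] = aggregate[0]
        status[0] = 'P'

    # Phase 2: resolve tiles in reverse order (an assumption, to exercise
    # the look-back path). Each tile inspects predecessors right to left.
    for i in range(n - 1, 0, -1):
        running = 0
        j = i - 1
        while True:
            if status[j] == 'P':
                # Predecessor already knows its full prefix: stop here.
                running += inclusive_prefix[j]
                break
            # Only the aggregate is known: add it and keep looking back.
            running += aggregate[j]
            j -= 1
        exclusive[i] = running
        inclusive_prefix[i] = running + aggregate[i]
        status[i] = 'P'

    # Phase 3: each tile scans locally and adds its exclusive prefix.
    out = []
    for i, tile in enumerate(tiles):
        out.extend(exclusive[i] + v for v in accumulate(tile))
    return out


if __name__ == "__main__":
    print(decoupled_lookback_scan(list(range(1, 11)), tile_size=4))
```

In this sketch each input element is read once for the aggregate and once for the local scan; an actual single-pass kernel would keep the tile in registers or shared memory so that global memory traffic stays near the ~2n figure quoted above.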

Stack-Based Algorithms for HDR Capture and Reconstruction

High-dynamic-range (HDR) images can be created with standard camera hardware by capturing and combining multiple pictures, each sampling a different segment of the irradiance distribution of a scene. This seemingly straightforward process involves several important steps, which will be the focus of this chapter. We start by examining the problem of selecting the set of exposures that properly measures the full dynamic range of a particular scene, a process known as metering for HDR.
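As a rough illustration of the combining step, the following Python sketch merges a stack of exposures into per-pixel irradiance estimates. It is not the chapter's method: it assumes a linear sensor response, pixel values normalized to [0, 1], a simple hat-shaped weighting function, and a fallback to the shortest exposure when every sample is clipped. The names `hat_weight` and `merge_hdr` are hypothetical.

```python
def hat_weight(z):
    """Weight that favors mid-range values and is zero at 0 and 1 (assumed)."""
    return 1.0 - abs(2.0 * z - 1.0)


def merge_hdr(exposures, times):
    """Merge same-sized lists of pixel values in [0, 1] into irradiance.

    exposures: list of images, each a flat list of linear pixel values.
    times: exposure time for each image, in the same order.
    """
    num_pixels = len(exposures[0])
    # Index of the shortest exposure, used when all weights are zero.
    shortest = min(range(len(times)), key=lambda k: times[k])

    result = []
    for p in range(num_pixels):
        weighted_sum = 0.0
        weight_total = 0.0
        for image, t in zip(exposures, times):
            z = image[p]
            w = hat_weight(z)
            # Under the linear-response assumption, z / t estimates irradiance.
            weighted_sum += w * (z / t)
            weight_total += w
        if weight_total > 0.0:
            result.append(weighted_sum / weight_total)
        else:
            # Every sample is fully dark or saturated: fall back (assumption).
            result.append(exposures[shortest][p] / times[shortest])
    return result


if __name__ == "__main__":
    times = [1.0, 0.25]
    long_exposure = [0.1, 1.0, 1.0]      # last two pixels are saturated
    short_exposure = [0.025, 0.25, 0.5]
    print(merge_hdr([long_exposure, short_exposure], times))
```

The weighting is what makes exposure selection matter: a pixel is recovered reliably only if at least one exposure in the stack places it away from the clipped ends of the range.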

An Analytical Model for Hardened Latch Selection and Exploration

Hardened flip-flops and latches are designed to be resilient to soft errors, maintaining high system reliability in the presence of energetic radiation. The wealth of different hardened designs (with varying protection levels) and the probabilistic nature of reliability complicate the choice of which hardened storage element to substitute where. This paper develops an analytical model for hardened latch and flip-flop design space exploration. It is shown that the best hardened design depends strongly on the target protection level and the chip that is being protected.

All-Inclusive ECC: Thorough End-to-End Protection for Reliable Computer Memory

Increasing transfer rates and decreasing I/O voltage levels make signals more vulnerable to transmission errors. While the data in computer memory are well-protected by modern error checking and correcting (ECC) codes, the clock, control, command, and address (CCCA) signals are weakly protected or even unprotected such that transmission errors leave serious gaps in data-only protection. This paper presents All-Inclusive ECC (AIECC), a memory protection scheme that leverages and augments data ECC to also thoroughly protect CCCA signals.

S-Step and Communication-Avoiding Iterative Methods

In this paper we give an overview of the s-step Conjugate Gradient (CG) method and develop a novel formulation of the s-step BiConjugate Gradient Stabilized (BiCGStab) iterative method. We also show how to add preconditioning to both of these s-step schemes, and we explain their relationship to the standard, block and communication-avoiding counterparts.