A Fine-Grained GALS SoC with Pausible Adaptive Clocking in 16 nm FinFET

Modern SoCs suffer from power supply noise that can require significant additional timing margin, reducing performance and energy efficiency. Globally asynchronous, locally synchronous (GALS) systems can mitigate the impact of power supply noise, as well as simplify system design by removing the need for global timing closure. This work presents a 4 mm² distributed accelerator engine with 19 independent clock domains implemented in a 16 nm process. Local adaptive clock generators dynamically tolerate and mitigate power supply noise, resulting in a 10% improvement in performance at the same voltage compared to a globally clocked baseline. Pausible bisynchronous FIFOs enable low-latency global communication across an on-chip network via error-free clock domain crossings. The SoC functions robustly across a wide range of voltages, frequencies, and workloads, demonstrating the practical applicability of fine-grained GALS techniques for modern SoC design.
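
As a rough illustration of why adaptive clocking can recover timing margin, the toy model below compares a fixed clock, which has to be slow enough for the worst supply droop, with a clock whose period follows the instantaneous gate delay. It is a sketch under stated assumptions, not the paper's method: the alpha-power delay model, the threshold voltage, the droop waveform, and every function name are hypothetical choices made for this example, and the resulting number is not meant to reproduce the 10% figure reported above.

```python
import math

def gate_delay(v, vt=0.3, alpha=1.3):
    """Alpha-power-law gate delay in arbitrary units (illustrative parameters)."""
    return v / (v - vt) ** alpha

def fixed_clock_period(voltages):
    # A fixed clock must be margined for the slowest (lowest-voltage) moment.
    return max(gate_delay(v) for v in voltages)

def adaptive_mean_period(voltages):
    # An idealized adaptive clock stretches or shrinks each cycle to match
    # the instantaneous delay, so its average period is the mean delay.
    delays = [gate_delay(v) for v in voltages]
    return sum(delays) / len(delays)

# Hypothetical supply trace: 0.8 V nominal with periodic droops of up to 50 mV.
NOMINAL_V, DROOP_V = 0.8, 0.05
trace = [NOMINAL_V - DROOP_V * abs(math.sin(2 * math.pi * i / 50))
         for i in range(1000)]

# Ratio of average cycle times: values above 1 mean the adaptive clock is faster.
speedup = fixed_clock_period(trace) / adaptive_mean_period(trace)
print(f"idealized adaptive-clock speedup: {speedup:.3f}x")
```

The model ignores tracking latency and clock-generator overhead, so it gives only an upper bound for this particular synthetic trace.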

Authors

Alicia Klinefelter (NVIDIA)
Tezaswi Raja (NVIDIA)
Kevin Zhou (NVIDIA)

Publication Date

Research Area

Award

ASYNC 2019 Best Paper Award