 
 # HMG: Extending Cache Coherence Protocols Across Modern Hierarchical Multi-GPU Systems


 Prior work on GPU cache coherence has shown that simple hardware- or software-based protocols can be more than sufficient. However, in recent years, features such as multi-chip modules have added deeper hierarchy and non-uniformity into GPU memory systems. GPU programming models have chosen to expose this non-uniformity directly to the end user through scoped memory consistency models. As a result, there is room to improve upon earlier coherence protocols that were designed only for flat single-GPU hierarchies and/or simpler memory consistency models.

In this paper, we propose HMG, a cache coherence protocol designed for forward-looking multi-GPU systems. HMG strikes a balance between simplicity and performance: it uses a readily implementable VI-like protocol to track coherence states, but it tracks sharers using a hierarchical scheme optimized for mitigating the bandwidth limitations of inter-GPU links. HMG leverages the novel scoped, non-multi-copy-atomic properties of modern GPU memory models, and it avoids the overheads of invalidation acknowledgments and transient states that were needed to support prior GPU memory models. On a 4-GPU system, HMG improves performance over a software-controlled, bulk-invalidation-based coherence mechanism by 26% and over a non-hierarchical hardware cache coherence protocol by 18%, thereby achieving 97% of the performance of an idealized caching system.
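To make the idea of hierarchical sharer tracking more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a two-level directory in which the top level records only which GPUs hold a copy of a line and a second level records which modules inside each GPU hold it. The class name `ToyHierarchicalDirectory`, the term "module" for a cache inside a GPU, the message counters, and the simplification that the writer sits in the line's home GPU are all assumptions made for illustration. Invalidations are modeled as fire-and-forget (no acknowledgments), and each cached copy is either valid or invalid, in the spirit of a VI-like protocol.

```python
from collections import defaultdict


class ToyHierarchicalDirectory:
    """Toy two-level sharer tracker (hypothetical; for illustration only)."""

    def __init__(self):
        # Top level: address -> set of GPUs that hold a valid copy.
        self.gpu_sharers = defaultdict(set)
        # Second level: (address, gpu) -> set of modules in that GPU with a valid copy.
        self.module_sharers = defaultdict(set)
        # Inter-GPU invalidation messages under the hierarchical scheme.
        self.hier_msgs = 0
        # Messages a flat scheme (one per remote sharer module) would have sent.
        self.flat_msgs = 0

    def read(self, addr, gpu, module):
        """Record that (gpu, module) now caches addr in the Valid state."""
        self.gpu_sharers[addr].add(gpu)
        self.module_sharers[(addr, gpu)].add(module)

    def write(self, addr, gpu, module):
        """Invalidate every other copy of addr; return the invalidated (gpu, module) pairs.

        Assumption: the writer is located in the home GPU of addr, so every
        other sharer GPU is reached over an inter-GPU link.
        """
        invalidated = []
        for g in sorted(self.gpu_sharers[addr]):
            if g != gpu:
                # One message crosses the inter-GPU link per sharer GPU;
                # that GPU then fans the invalidation out locally.
                self.hier_msgs += 1
            for m in sorted(self.module_sharers[(addr, g)]):
                if (g, m) != (gpu, module):
                    invalidated.append((g, m))
                    if g != gpu:
                        self.flat_msgs += 1
            del self.module_sharers[(addr, g)]
        # No acknowledgments are awaited: the directory moves directly to
        # its new stable state, with the writer as the only sharer.
        self.gpu_sharers[addr] = {gpu}
        self.module_sharers[(addr, gpu)] = {module}
        return invalidated


# Usage: three modules in GPU 1 and one module in GPU 2 read a line,
# then module 0 of GPU 0 writes it.
directory = ToyHierarchicalDirectory()
for gpu, module in [(1, 0), (1, 1), (1, 2), (2, 0)]:
    directory.read(0x100, gpu, module)

victims = directory.write(0x100, gpu=0, module=0)
print(victims)              # [(1, 0), (1, 1), (1, 2), (2, 0)]
print(directory.hier_msgs)  # 2
print(directory.flat_msgs)  # 4
```

In this toy scenario the hierarchical scheme sends two invalidations over inter-GPU links where per-module tracking would send four, which is the kind of bandwidth saving the abstract refers to. The sketch says nothing about the performance figures reported in the paper.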



 ## Authors



Xiaowei Ren (University of British Columbia)

[Daniel Lustig](/person/daniel-lustig)

Evgeny Bolotin (NVIDIA)

[Aamer Jaleel](/person/aamer-jaleel)

Oreste Villa (NVIDIA)

[David Nellans](/person/david-nellans)

 

 

 ## Publication Date



Saturday, February 22, 2020

 

 ## Published in



[International Symposium on High Performance Computer Architecture (HPCA)](https://ieeexplore.ieee.org/document/9065597)

 

 ## Research Area



[Computer Architecture](/research-area/computer-architecture)

 

 

 ## External Links



[IEEE Digital Library](https://ieeexplore.ieee.org/document/9065597)

 

 

 ## Uploaded Files



[Published Manuscript](https://d1qx31qr3h6wln.cloudfront.net/publications/HierarchicalMultiGPUCoherence.pdf "Open file in new window") (762.87 KB)

 

 

 ## Copyright



This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to <pubs-permissions@ieee.org>.