 # Selective GPU Caches to Eliminate CPU-GPU HW Cache Coherence


Cache coherence is ubiquitous in shared-memory multiprocessors because it provides a simple, high-performance memory abstraction to programmers. Recent work suggests extending hardware cache coherence between CPUs and GPUs to help support programming models with tightly coordinated sharing between CPU and GPU threads. However, implementing hardware cache coherence is particularly challenging in systems with discrete CPUs and GPUs that may not be produced by a single vendor. Instead, we propose selective caching, wherein we disallow GPU caching of any memory that would require coherence updates to propagate between the CPU and GPU, thereby decoupling the GPU from vendor-specific CPU coherence protocols. We propose several architectural improvements to offset the performance penalty of selective caching: aggressive request coalescing, CPU-side coherent caching for GPU-uncacheable requests, and a CPU-GPU interconnect optimization to support variable-size transfers. Moreover, current GPU workloads access many read-only memory pages; we exploit this property to allow promiscuous GPU caching of these pages, relying on page-level protection, rather than hardware cache coherence, to ensure correctness. These optimizations bring a selective caching GPU implementation to within 93% of a hardware cache-coherent implementation without the need to integrate CPUs and GPUs under a single hardware coherence protocol.
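The policy described in the abstract can be illustrated with a small software model. The following is a minimal sketch, not the paper's hardware design: the class and function names (`SelectiveCacheModel`, `coalesce`), the 4 KB page and 128-byte line sizes, the choice to start every page as write-protected, and the fault handler that flushes GPU-cached lines on a CPU write are all assumptions made for illustration. It shows only two ideas from the text: the GPU caches a page while it is read-only and stops caching it once the CPU writes to it, and uncached requests that fall in the same line are merged.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

PAGE_SIZE = 4096  # assumed page size in bytes
LINE_SIZE = 128   # assumed GPU cache line size in bytes


class PageState(Enum):
    READ_ONLY = auto()  # write-protected page: the GPU may cache it
    WRITABLE = auto()   # CPU may write: the GPU must not cache it


@dataclass
class SelectiveCacheModel:
    """Toy model of selective GPU caching (illustrative, hypothetical)."""
    pages: dict = field(default_factory=dict)    # page number -> PageState
    gpu_cache: set = field(default_factory=set)  # cached line base addresses

    def gpu_may_cache(self, addr: int) -> bool:
        # Assumption: pages start out write-protected (read-only).
        state = self.pages.get(addr // PAGE_SIZE, PageState.READ_ONLY)
        return state is PageState.READ_ONLY

    def gpu_read(self, addr: int) -> str:
        line = addr - addr % LINE_SIZE
        if not self.gpu_may_cache(addr):
            return "uncached"  # request bypasses the GPU cache
        if line in self.gpu_cache:
            return "hit"
        self.gpu_cache.add(line)
        return "miss"

    def cpu_write(self, addr: int) -> None:
        # Hypothetical protection-fault handler: drop every GPU-cached
        # line of the page, then mark the page as GPU-uncacheable.
        page = addr // PAGE_SIZE
        base = page * PAGE_SIZE
        self.gpu_cache = {
            line for line in self.gpu_cache
            if not (base <= line < base + PAGE_SIZE)
        }
        self.pages[page] = PageState.WRITABLE


def coalesce(addresses):
    """Merge requests that fall in the same line into one request per line."""
    return sorted({addr - addr % LINE_SIZE for addr in addresses})
```

In this model, a GPU read of a write-protected page misses once and then hits, a CPU write to that page flushes its lines, and later GPU reads of the page are reported as `uncached` while other pages remain cacheable.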



 ## Authors



Neha Agarwal (University of Michigan)

[David Nellans](/person/david-nellans)

Eiman Ebrahimi (NVIDIA)

Thomas F. Wenisch (University of Michigan)

John Danskin (NVIDIA)

[Steve Keckler](/person/stephen-keckler)

 

 

 ## Publication Date



Saturday, March 12, 2016

 

 ## Published in



[International Symposium on High Performance Computer Architecture (HPCA)](https://ieeexplore.ieee.org/document/7446089)

 

 ## Research Area



[Computer Architecture](/research-area/computer-architecture)

[High Performance Computing](/research-area/high-performance-computing)

 

 

 ## External Links



[IEEE Digital Library](https://ieeexplore.ieee.org/document/7446089)

 

 

 ## Uploaded Files



[Published Manuscript](https://d1qx31qr3h6wln.cloudfront.net/publications/HPCA_2016_Coherence.pdf) (1.4 MB)

 

 

 ## Copyright



This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to <pubs-permissions@ieee.org>.