A Comparative Analysis of Microarchitecture Effects on CPU and GPU Memory System Behavior
While heterogeneous CPU-GPU systems have traditionally been implemented on separate chips, each with its own private DRAM, heterogeneous processors are integrating these different core types on the same die with access to a common physical memory. Further, emerging heterogeneous CPU-GPU processors promise to offer tighter coupling between core types via a unified virtual address space and cache coherence. To adequately address the potential opportunities and pitfalls that may arise from this tighter coupling, it is important to have a deep understanding of application- and memory-level demands from both CPU and GPU cores. This paper presents a detailed comparison of memory access behavior for parallel applications executing on each core type in a tightly controlled heterogeneous CPU-GPU processor simulation environment. This characterization indicates that applications are typically designed with similar algorithmic structures for CPU and GPU cores, and each core type's memory access path plays a similar locality filtering role. However, the different core and cache microarchitectures expose substantially different memory-level parallelism (MLP), which results in different instantaneous memory access rates and different sensitivity to memory hierarchy architecture.
Copyright
This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org.