A Comparative Analysis of Microarchitecture Effects on CPU and GPU Memory System Behavior

While heterogeneous CPU-GPU systems have traditionally been implemented on separate chips, each with its own private DRAM, heterogeneous processors now integrate these different core types on the same die with access to a common physical memory. Further, emerging heterogeneous CPU-GPU processors promise tighter coupling between core types via a unified virtual address space and cache coherence. To address the opportunities and pitfalls that may arise from this tighter coupling, it is important to have a deep understanding of application- and memory-level demands from both CPU and GPU cores. This paper presents a detailed comparison of memory access behavior for parallel applications executing on each core type in a tightly controlled heterogeneous CPU-GPU processor simulation. This characterization indicates that applications are typically designed with similar algorithmic structures for CPU and GPU cores, and that each core type’s memory access path plays a similar locality-filtering role. However, the different core and cache microarchitectures expose substantially different memory-level parallelism (MLP), which results in different instantaneous memory access rates and different sensitivity to memory hierarchy architecture.

Authors

Joel Hestness (University of Wisconsin)
David A. Wood (University of Wisconsin)
