ACCORD: Enabling Associativity for Gigascale DRAM Caches by Coordinating Way-Install and Way-Prediction
Stacked-DRAM technology has enabled high-bandwidth gigascale DRAM caches. Since DRAM caches require a tag store of several tens of megabytes, commercial DRAM cache designs typically co-locate tag and data within the DRAM array. These DRAM caches are organized as a direct-mapped structure so that the tag and data can be streamed out in a single access. While direct-mapped DRAM caches provide low hit latency, they suffer from a low hit rate due to conflict misses. Ideally, we want the hit rate of a set-associative DRAM cache without incurring the additional latency and bandwidth costs of increased associativity. To address this problem, way prediction can be applied to a set-associative DRAM cache to achieve the latency and bandwidth of a direct-mapped DRAM cache. Unfortunately, conventional way-prediction policies typically require per-set storage, causing multi-megabyte storage overheads for gigascale DRAM caches. If we can obtain accurate way prediction without incurring significant storage overheads, we can efficiently enable set-associativity for DRAM caches.
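To make the "multi-megabyte" per-set overhead concrete, the following is a rough back-of-the-envelope sketch rather than a figure from the paper. It uses the 4GB cache size from the evaluation and assumes 64-byte lines and a predictor that keeps one bit per set in a 2-way cache; the line size, the one-bit-per-set cost, and the function name are all illustrative assumptions.

```python
def per_set_predictor_bytes(cache_bytes: int, line_bytes: int,
                            ways: int, bits_per_set: int) -> int:
    """Storage needed by a way predictor that keeps `bits_per_set` bits per set."""
    num_lines = cache_bytes // line_bytes   # total cache lines
    num_sets = num_lines // ways            # sets in a `ways`-way cache
    return num_sets * bits_per_set // 8     # bits -> bytes


CACHE_BYTES = 4 * 2**30   # 4GB DRAM cache, as in the evaluation
LINE_BYTES = 64           # assumption: 64-byte cache lines
WAYS = 2                  # 2-way set-associative
BITS_PER_SET = 1          # assumption: one prediction bit per set

overhead = per_set_predictor_bytes(CACHE_BYTES, LINE_BYTES, WAYS, BITS_PER_SET)
print(f"per-set predictor storage: {overhead / 2**20:.0f} MiB")
```

Under these assumptions the cache holds 2^26 lines in 2^25 sets, so even a single bit per set costs 4 MiB of predictor storage.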
This paper proposes Associativity via Coordinated Way-Install and Way-Prediction (ACCORD), a design that steers an incoming line to a “preferred way” based on the line address and uses the preferred way as the default way prediction. We propose two way-steering policies that are effective for 2-way caches. The first, Probabilistic Way-Steering (PWS), steers lines to a preferred way with high probability, while still allowing lines to be installed in an alternate way in case of conflicts. The second, Ganged Way-Steering (GWS), steers lines of a spatially contiguous region to the way where an earlier line from that region was installed. On a 2-way cache, ACCORD (PWS+GWS) obtains a way-prediction accuracy of 90% and retains a hit rate similar to a baseline 2-way cache while incurring 320 bytes of storage overhead. We extend ACCORD to support highly associative caches using a Skewed Way-Steering (SWS) design that steers a line to at most two ways in the highly associative cache. This design retains the low latency of the 2-way ACCORD while obtaining most of the hit-rate benefits of a highly associative design. Our studies with a 4GB DRAM cache backed by non-volatile memory show that ACCORD provides an average speedup of 11% (up to 54%) across a wide range of workloads.
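The interplay of way-install and way-prediction in a 2-way cache can be illustrated with a minimal sketch. This is not the paper's implementation: the class name, the hash that picks the preferred way, the 85% steering probability, the 64-line region size, and the small table of recently installed regions are all assumptions chosen for illustration. Modelling PWS as a biased coin flip is likewise a simplification of "installing in an alternate way in case of conflicts".

```python
import random
from collections import OrderedDict


class WaySteering2Way:
    """Toy model of coordinated way-install and way-prediction (2 ways)."""

    def __init__(self, num_sets, preferred_prob=0.85,
                 lines_per_region=64, table_entries=64, seed=0):
        self.num_sets = num_sets
        self.preferred_prob = preferred_prob      # assumed PWS bias
        self.lines_per_region = lines_per_region  # assumed GWS region size
        self.table_entries = table_entries        # assumed table capacity
        self.rng = random.Random(seed)
        self.recent = OrderedDict()               # region -> way of last install

    def preferred_way(self, line_addr):
        # Hypothetical hash: lowest tag bit selects the preferred way.
        return (line_addr // self.num_sets) & 1

    def predict_way(self, line_addr):
        # GWS-style: follow the way used by an earlier line of the same region.
        region = line_addr // self.lines_per_region
        if region in self.recent:
            return self.recent[region]
        # Default prediction: the address-derived preferred way.
        return self.preferred_way(line_addr)

    def install_way(self, line_addr):
        region = line_addr // self.lines_per_region
        if region in self.recent:
            # GWS-style: gang this line with its region's earlier install.
            self.recent.move_to_end(region)
            return self.recent[region]
        # PWS-style: pick the preferred way with high probability.
        pref = self.preferred_way(line_addr)
        way = pref if self.rng.random() < self.preferred_prob else 1 - pref
        self.recent[region] = way
        if len(self.recent) > self.table_entries:
            self.recent.popitem(last=False)       # evict the oldest region
        return way


steer = WaySteering2Way(num_sets=1024)
first = steer.install_way(128)   # first line of region 2 chooses a way
print(steer.install_way(129) == first, steer.predict_way(130) == first)
```

Because installs and predictions consult the same address-derived rule and the same small table, the predictor needs no per-set state, which is the property that keeps the storage overhead small.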
Copyright
This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org.