
This paper presents Sparseloop, the first infrastructure that implements an analytical design space exploration methodology for sparse tensor accelerators. Sparseloop comprehends a wide set of architecture specifications, including sparse optimization features such as compressed tensor storage. Using these specifications together with stochastic tensor density models, Sparseloop calculates a design's energy efficiency while accounting for both optimization savings and metadata overhead at each storage and compute level of the architecture. We validate Sparseloop on a well-known accelerator design and achieve ~99% accuracy in terms of runtime activities (e.g., compressed memory accesses). We also present a case study that highlights the key factors (e.g., uncompressed traffic, data density) that determine how sparse optimization features affect energy efficiency. Tool available at: https://github.com/NVlabs/timeloop.
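The trade-off between optimization savings and metadata overhead can be illustrated with a toy calculation. The sketch below is not Sparseloop's model or API. It assumes a single storage level, a bitmask compression format (one metadata bit per element), and a tensor in which each element is nonzero independently with probability `density`. All function names, parameters, and energy values are hypothetical and chosen only for illustration.

```python
def energy_uncompressed(num_elements, energy_per_word):
    """Energy to read every element of an uncompressed tile."""
    return num_elements * energy_per_word


def energy_bitmask(num_elements, density, energy_per_word, energy_per_metadata_bit):
    """Expected energy to read a bitmask-compressed tile.

    Assumption: each element is nonzero independently with probability
    `density`, so the expected number of stored data words is
    density * num_elements. The bitmask costs one metadata bit per
    element regardless of density.
    """
    expected_nonzeros = density * num_elements
    data_energy = expected_nonzeros * energy_per_word
    metadata_energy = num_elements * energy_per_metadata_bit
    return data_energy + metadata_energy


def break_even_density(energy_per_word, energy_per_metadata_bit):
    """Density above which the bitmask format costs more than uncompressed."""
    return 1.0 - energy_per_metadata_bit / energy_per_word


if __name__ == "__main__":
    # Hypothetical numbers: a 1024-element tile, unit energy per data word,
    # and 1/16 of that per metadata bit.
    n, e_word, e_meta = 1024, 1.0, 0.0625
    for d in (0.25, 0.5, 0.9375, 1.0):
        print(d, energy_uncompressed(n, e_word), energy_bitmask(n, d, e_word, e_meta))
    print("break-even density:", break_even_density(e_word, e_meta))
```

Under these assumptions, compression saves energy only while the data density stays below the break-even point. Above it, the metadata overhead outweighs the savings from skipping zeros, which is the kind of interaction the case study examines.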
This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org.