DiGamma: Domain-aware Genetic Algorithm for HW-Mapping Co-optimization for DNN Accelerators

The design of DNN accelerators includes two key parts: HW resource configuration and mapping strategy. Intensive research has been conducted to optimize each of them independently. Unfortunately, optimizing for both together is extremely challenging due to the extremely large cross-coupled search space. To address this, in this paper, we propose a HW-Mapping cooptimization framework, an efficient encoding of the immense design space constructed by HW and Mapping, and a domainaware genetic algorithm, named DiGamma, with specialized operators for improving search efficiency.

Self Adaptive Reconfigurable Arrays (SARA): Learning Flexible GEMM Accelerator Configuration and Mapping-space using ML

This work demonstrates a scalable reconfigurable accelerator (RA) architecture designed to extract maximum performance and energy efficiency for GEMM workloads. We also present a self-adaptive (SA) unit, which runs a learnt model for one-shot configuration optimization in hardware offloading the software stack thus easing the deployment of the proposed design.

Suraksha: A Quantitative AV Safety Evaluation Framework to Analyze Safety Implications of Perception Design Choices

This paper proposes an automated AV safety evaluation framework, Suraksha that quantifies and analyzes the sensitivities of different design parameters on AV safety. It employs a set of driving scenarios generated based on a user-specified difficulty level. It enables the exploration of tradeoffs in requirements either in existing AV implementations to find opportunities for improvement or during the development process to explore the component-level requirements for an optimal and safe AV architecture.

Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles

Extracting interesting scenarios from real world data as well as generating failure cases is important for the development and testing of autonomous systems. We propose efficient mechanisms to both characterize and generate testing scenarios using a state-of-the-art driving simulator. For any scenario, our method generates a set of possible driving paths and identifies all the possible safe driving trajectories that can be taken starting at different times, to compute metrics that quantify the complexity of the scenario.

Flexion: A Quantitative Metric for Flexibility in DNN Accelerators

Dataflow and tile size choices, which we collectively refer to as mappings, dictate the efficiency (i.e., latency and energy) of DNN accelerators. Rapidly evolving DNN models is one of the major challenges for DNN accelerators since the optimal mapping heavily depends on the layer shape and size. To maintain high efficiency across multiple DNN models, flexible accelerators that can support multiple mappings have emerged. However, we currently lack a metric to evaluate accelerator flexibility and quantitatively compare their capability to run different mappings.

Safecracker: Leaking Secrets through Compressed Caches

The hardware security crisis brought on by recent speculative execution attacks has shown that it is crucial to adopt a security-conscious approach to architecture research, analyzing the security of promising architectural techniques before they are deployed in hardware. This paper offers the first security analysis of cache compression, one such promising technique that is likely to appear in future processors. We find that cache compression is insecure because the compressibility of a cache line reveals information about its contents.

Heterogeneous Dataflow Accelerators for Multi-DNN Workloads

AI-enabled applications such as augmented and virtual reality (AR/VR) leverage multiple deep neural network (DNN) models for various sub-tasks such as object detection, image segmentation, eye-tracking, speech recognition, and so on. Because of the diversity of the sub-tasks, the layers within and across the DNN models are highly heterogeneous in operation and shape.

Sparseloop: An Analytical, Energy-Focused Design Space Exploration Methodology for Sparse Tensor Accelerators

This paper presents Sparseloop, the first infrastructure that implements an analytical design space exploration methodology for sparse tensor accelerators. Sparseloop comprehends a wide set of architecture specifications including various sparse optimization features such as compressed tensor storage. Using these specifications, Sparseloop can calculate a design's energy efficiency while accounting for both optimization savings and metadata overhead at each storage and compute level of the architecture using stochastic tensor density models.

MAESTRO: A Data-Centric Approach to Understand Reuse, Performance, and Hardware Cost of DNN Mappings

The efficiency of an accelerator depends on three factors -- mapping, deep neural network (DNN) layers, and hardware -- constructing extremely complicated design space of DNN accelerators. To demystify such complicated design space and guide the DNN accelerator design for better efficiency, we propose an analytical cost model, MAESTRO. MAESTRO receives DNN model description and hardware resources information as a list, and mapping described in a data-centric representation we propose as inputs.

Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space Search

Modern day computing increasingly relies on specialization to satiate growing performance and efficiency requirements. A core challenge in designing such specialized hardware architectures is how to perform mapping space search, i.e., search for an optimal mapping from algorithm to hardware. Prior work shows that choosing an inefficient mapping can lead to multiplicative-factor efficiency overheads. Additionally, the search space is not only large but also non-convex and non-smooth, precluding advanced search techniques.