  Steve Keckler  

 



  ![](/sites/default/files/person/SteveKeckler_WebRes_thumbnail_1.jpg)

  

 Steve Keckler joined NVIDIA in 2009 and leads the Architecture Research Group. He is also an Adjunct Professor of Computer Science at the University of Texas at Austin, where he served on the faculty from 1998 to 2012. His research interests include parallel computer architectures, high-performance computing, energy-efficient architectures, and embedded computing. Dr. Keckler was previously at the Massachusetts Institute of Technology from 1990 to 1998, where he led the development of the M-Machine experimental parallel computer system. He is a Fellow of the ACM, a Fellow of the IEEE, an Alfred P. Sloan Research Fellow, and a recipient of the NSF CAREER award, the ACM Grace Murray Hopper award, the President's Associates Teaching Excellence Award at UT-Austin, and the Edith and Peter O’Donnell award for Engineering. He earned a B.S. in Electrical Engineering from Stanford University and an M.S. and a Ph.D. in Computer Science from the Massachusetts Institute of Technology. [Full list of publications](http://www.cs.utexas.edu/users/skeckler)



   Research Area(s)

[Computer Architecture](/index.php/research-area/computer-architecture)

[Computer Vision](/index.php/research-area/computer-vision)

[High Performance Computing](/index.php/research-area/high-performance-computing)

[Artificial Intelligence and Machine Learning](/index.php/research-area/machine-learning-artificial-intelligence)

[Resilience and Safety](/index.php/research-area/resilience)

 

 

  

 Main Field of Interest

[Computer Architecture](/index.php/research-area/computer-architecture)

 

  

 Google Scholar

[https://scholar.google.com/citations?user=PpjjRvoAAAAJ&hl=en](https://scholar.google.com/citations?user=PpjjRvoAAAAJ&hl=en)

 

  

 

 

 



 ### Publications

 

### 2023 

[VaPr: Variable-Precision Tensors to Accelerate Robot Motion Planning](/publication/2023-10_vapr-variable-precision-tensors-accelerate-robot-motion-planning)

Yu-Shun Hsiao, [Siva Hari](/person/siva-hari), [Balakumar Sundaralingam](/person/balakumar-sundaralingam), Jason Yik, Thierry Tambe, [Charbel Sakr](/person/charbel-sakr), [Steve Keckler](/person/stephen-keckler), Vijay Janapa Reddi



[IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023)](https://ieee-iros.org/)









[cuCatch: A Debugging Tool for Efficiently Catching Memory Safety Violations in CUDA Applications](/publication/2023-06_cucatch-debugging-tool-efficiently-catching-memory-safety-violations-cuda)

[Mohamed Tarek Ibn Ziad](/person/mohamed-tarek-ibn-ziad), [Sana Damani](/person/sana-damani), [Aamer Jaleel](/person/aamer-jaleel), [Stephen W. Keckler](/person/stephen-keckler), [Mark Stephenson](/person/mark-stephenson)



[ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)](https://dl.acm.org/doi/10.1145/3591225)









[Implicit Memory Tagging: No-Overhead Memory Safety Using Alias-Free Tagged ECC](/index.php/publication/2023-06_implicit-memory-tagging-no-overhead-memory-safety-using-alias-free-tagged-ecc)

[Michael B. Sullivan](/index.php/person/mike-sullivan), [Mohamed Tarek Ibn Ziad](/index.php/person/mohamed-tarek-ibn-ziad), [Aamer Jaleel](/index.php/person/aamer-jaleel), [Stephen W. Keckler](/index.php/person/stephen-keckler)



[International Symposium on Computer Architecture (ISCA)](https://dl.acm.org/doi/abs/10.1145/3579371.3589102)









### 2022 

[Augmenting Legacy Networks for Flexible Inference](/publication/2022-10_augmenting-legacy-networks-flexible-inference)

[Jason Clemons](/person/jason-clemons), [Iuri Frosio](/person/iuri-frosio), Maying Shen, Jose M. Alvarez, [Steve Keckler](/person/stephen-keckler)



[Workshop on Computational Aspects of Deep Learning (CADL)](https://ailb-web.ing.unimore.it/cadl2022/)









[Zhuyi: Perception Processing Rate Estimation for Safety in Autonomous Vehicles](/publication/2022-07_zhuyi-perception-processing-rate-estimation-safety-autonomous-vehicles)

Yu-Shun Hsiao, [Siva Hari](/person/siva-hari), Michał Filipiuk, Timothy Tsai, [Michael B. Sullivan](/person/mike-sullivan), Vijay Janapa Reddi, Vasu Singh, [Steve Keckler](/person/stephen-keckler)



[Design Automation Conference (DAC)](https://dl.acm.org/doi/10.1145/3489517.3530445)









[Exploiting Temporal Data Diversity for Detecting Safety-critical Faults in AV Compute Systems](/publication/2022-06_exploiting-temporal-data-diversity-detecting-safety-critical-faults-av-compute)

Saurabh Jha, Shengkun Cui, Timothy Tsai, [Siva Hari](/person/siva-hari), [Michael B. Sullivan](/person/mike-sullivan), Zbigniew T. Kalbarczyk, [Steve Keckler](/person/stephen-keckler), Ravishankar K. Iyer



[International Conference on Dependable Systems and Networks (DSN)](https://ieeexplore.ieee.org/document/9833576)









[Zhuyi: Perception Processing Rate Estimation for Safety in Autonomous Vehicles](/publication/2022-05_zhuyi-perception-processing-rate-estimation-safety-autonomous-vehicles)

Yu-Shun Hsiao, [Siva Hari](/person/siva-hari), Michał Filipiuk, Timothy Tsai, [Michael B. Sullivan](/person/mike-sullivan), Vijay Janapa Reddi, Vasu Singh, [Steve Keckler](/person/stephen-keckler)



[arXiv](https://arxiv.org/abs/2205.03347)









[Saving PAM4 Bus Energy with SMOREs: Sparse Multi-level Opportunistic Restricted Encodings](/publication/2022-04_saving-pam4-bus-energy-smores-sparse-multi-level-opportunistic-restricted)

[Mike O'Connor](/person/mike-o-connor), [Donghyuk Lee](/person/donghyuk-lee), [Niladrish Chatterjee](/person/niladrish-chatterjee), [Michael B. Sullivan](/person/mike-sullivan), [Steve Keckler](/person/stephen-keckler)



[International Symposium on High-Performance Computer Architecture (HPCA)](https://ieeexplore.ieee.org/document/9773229)









[Characterizing and Mitigating Soft Errors in GPU DRAM](/publication/2022-03_characterizing-and-mitigating-soft-errors-gpu-dram)

[Michael B. Sullivan](/person/mike-sullivan), Nirmal R. Saxena, [Mike O'Connor](/person/mike-o-connor), [Donghyuk Lee](/person/donghyuk-lee), Paul Racunas, Saurabh Hukerikar, Timothy Tsai, [Siva Kumar Sastry Hari](/person/siva-hari), [Stephen W. Keckler](/person/stephen-keckler)



[IEEE Micro (Issue: Top Picks of the 2021 Computer Architecture Conferences)](https://ieeexplore.ieee.org/document/9744333)









[GPU Subwarp Interleaving](/publication/2022-01_gpu-subwarp-interleaving)

Sana Damani, [Mark Stephenson](/person/mark-stephenson), Ram Rangan, Daniel Johnson, Rishkul Kulkarni, [Steve Keckler](/person/stephen-keckler)



[International Symposium on High-Performance Computer Architecture (HPCA)](https://ieeexplore.ieee.org/document/9773183)









[Accelerators](/publication/2022-01_accelerators)

[Steve Keckler](/person/stephen-keckler), Dejan Milojicic



[IEEE Computer](https://ieeexplore.ieee.org/document/9681667)









### 2021 

[GPU Domain Specialization via Composable On-Package Architecture](/index.php/publication/2021-12_gpu-domain-specialization-composable-package-architecture)

[Yaosheng Fu](/index.php/person/yaosheng-fu), Evgeny Bolotin, [Niladrish Chatterjee](/index.php/person/niladrish-chatterjee), [David Nellans](/index.php/person/david-nellans), [Steve Keckler](/index.php/person/stephen-keckler)



[ACM Transactions on Architecture and Code Optimization (TACO)](https://dl.acm.org/doi/full/10.1145/3484505)









[Evolution of the Graphics Processing Unit (GPU)](/publication/2021-12_evolution-graphics-processing-unit-gpu)

[William Dally](/person/william-dally), [Steve Keckler](/person/stephen-keckler), David B. Kirk



[IEEE Micro Special Issue of the 50th Anniversary of the Microprocessor](https://ieeexplore.ieee.org/document/9623445)









[Optimizing Selective Protection for CNN Resilience](/publication/2021-10_optimizing-selective-protection-cnn-resilience)

Abdulrahman Mahmoud, [Siva Hari](/person/siva-hari), Christopher W. Fletcher, Sarita V. Adve, [Charbel Sakr](/person/charbel-sakr), Naresh Shanbhag, [Pavlo Molchanov](/person/pavlo-molchanov), [Michael B. Sullivan](/person/mike-sullivan), Timothy Tsai, [Steve Keckler](/person/stephen-keckler)



[International Symposium on Software Reliability Engineering (ISSRE)](https://ieeexplore.ieee.org/document/9700317)









[Suraksha: A Framework to Analyze the Safety Implications of Perception Design Choices in AVs](/publication/2021-10_suraksha-framework-analyze-safety-implications-perception-design-choices-avs)

Hengyu Zhao, [Siva Hari](/person/siva-hari), Timothy Tsai, [Michael B. Sullivan](/person/mike-sullivan), [Steve Keckler](/person/stephen-keckler), Jishen Zhao



[International Symposium on Software Reliability Engineering (ISSRE)](https://ieeexplore.ieee.org/abstract/document/9700341)









[Characterizing and Mitigating Soft Errors in GPU DRAM](/publication/2021-10_characterizing-and-mitigating-soft-errors-gpu-dram-0)

[Michael B. Sullivan](/person/mike-sullivan), Nirmal Saxena, [Mike O'Connor](/person/mike-o-connor), [Donghyuk Lee](/person/donghyuk-lee), Paul Racunas, Saurabh Hukerikar, Timothy Tsai, [Siva Hari](/person/siva-hari), [Steve Keckler](/person/stephen-keckler)



[International Symposium on Microarchitecture (MICRO)](https://dl.acm.org/doi/10.1145/3466752.3480111)



IEEE Micro Top Picks in Computer Architecture





[Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles](/publication/2021-07_generating-and-characterizing-scenarios-safety-testing-autonomous-vehicles)

Zahra Ghodsi, [Siva Hari](/person/siva-hari), [Iuri Frosio](/person/iuri-frosio), Timothy Tsai, Alejandro Troccoli, [Steve Keckler](/person/stephen-keckler), Siddharth Garg, Anima Anandkumar



[IEEE Intelligent Vehicles Symposium (IV)](https://ieeexplore.ieee.org/document/9576023)









[Cooperative Profile Guided Optimization](/publication/2021-07_cooperative-profile-guided-optimization)

[Mark Stephenson](/person/mark-stephenson), Ram Rangan, [Steve Keckler](/person/stephen-keckler)



[Computer Graphics Forum (Proceedings of High Performance Graphics)](https://www.highperformancegraphics.org/2021/)









[NVBitFI: Dynamic Fault Injection for GPUs](/publication/2021-06_nvbitfi-dynamic-fault-injection-gpus)

Timothy Tsai, [Siva Hari](/person/siva-hari), [Michael B. Sullivan](/person/mike-sullivan), Oreste Villa, [Steve Keckler](/person/stephen-keckler)



[International Conference on Dependable Systems and Networks (DSN)](https://ieeexplore.ieee.org/abstract/document/9505068)









[Suraksha: A Quantitative AV Safety Evaluation Framework to Analyze Safety Implications of Perception Design Choices](/publication/2021-06_suraksha-quantitative-av-safety-evaluation-framework-analyze-safety)

Hengyu Zhao, [Siva Hari](/person/siva-hari), Timothy Tsai, [Michael B. Sullivan](/person/mike-sullivan), [Steve Keckler](/person/stephen-keckler), Jishen Zhao



[Workshop on Safety and Security of Intelligent Vehicles (SSIV)](https://ieeexplore.ieee.org/document/9502467)









[Simba: scaling deep-learning inference with chiplet-based architecture](/publication/2021-05_simba-scaling-deep-learning-inference-chiplet-based-architecture)

Yakun Sophia Shao, [Jason Clemons](/person/jason-clemons), [Rangharajan Venkatesan](/person/rangharajan-venkatesan), [Brian Zimmer](/person/brian-zimmer), [Matt Fojtik](/person/matt-fojtik), [Ted Jiang](/person/ted-jiang), [Ben Keller](/person/ben-keller), Alicia Klinefelter, [Nathaniel Pinckney](/person/nathaniel-pinckney), Priyanka Raina, [Stephen Tell](/person/stephen-tell), [Yanqing Zhang](/person/yanqing-zhang), [William Dally](/person/william-dally), [Joel Emer](/person/joel-emer), [Tom Gray](/person/tom-gray), [Brucek Khailany](/person/brucek-khailany), [Steve Keckler](/person/stephen-keckler)



[Communications of the ACM](https://dl.acm.org/doi/10.1145/3460227)



ACM Research Highlight





[GPU Domain Specialization via Composable On-Package Architecture](/publication/2021-04_gpu-domain-specialization-composable-package-architecture)

[Yaosheng Fu](/person/yaosheng-fu), Evgeny Bolotin, [Niladrish Chatterjee](/person/niladrish-chatterjee), [David Nellans](/person/david-nellans), [Steve Keckler](/person/stephen-keckler)



[arXiv](https://arxiv.org/abs/2104.02188)









[Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles](/publication/2021-03_generating-and-characterizing-scenarios-safety-testing-autonomous-vehicles)

Zahra Ghodsi, [Siva Hari](/person/siva-hari), [Iuri Frosio](/person/iuri-frosio), Timothy Tsai, Alejandro Troccoli, [Steve Keckler](/person/stephen-keckler), Siddharth Garg, Anima Anandkumar



[arXiv](https://arxiv.org/abs/2103.07403)









[Making Convolutions Resilient via Algorithm-Based Error Detection Techniques](/publication/2021-03_making-convolutions-resilient-algorithm-based-error-detection-techniques)

[Siva Hari](/person/siva-hari), [Michael B. Sullivan](/person/mike-sullivan), Timothy Tsai, [Steve Keckler](/person/stephen-keckler)



[IEEE Transactions on Dependable and Secure Computing (TDSC)](https://ieeexplore.ieee.org/document/9366780)









[SNAP: An Efficient Sparse Neural Acceleration Processor for Unstructured Sparse Deep Neural Network Inference](/publication/2021-02_snap-efficient-sparse-neural-acceleration-processor-unstructured-sparse-deep)

Jie-Fang Zhang, Ching-En Lee, Chester Liu, Yakun Sophia Shao, [Steve Keckler](/person/stephen-keckler), Zhengya Zhang



[IEEE Journal of Solid-State Circuits (JSSC)](https://ieeexplore.ieee.org/document/9310233)









### 2020 

[Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles](/publication/2020-11_generating-and-characterizing-scenarios-safety-testing-autonomous-vehicles-0)

Zahra Ghodsi, [Siva Hari](/person/siva-hari), [Iuri Frosio](/person/iuri-frosio), Timothy Tsai, Alejandro Troccoli, [Steve Keckler](/person/stephen-keckler), Siddharth Garg, Anima Anandkumar



[IEEE International Workshop on Automotive Reliability & Test (ART)](https://www.ieee-tttc.org/ebshistory/2020/%5BPUB%5D%5BART_2020%5D_Fifth_IEEE_International_Workshop_on_Automotive_Reliability_&_Test_-_CALL_FOR_SUBMISSIONS_-_%22IEEE_TTTC's_EBS%22_%3Cebs@ieee-tttc.org%3E_-_2020-09-08_1209.eml.html)









[Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles](/publication/2020-11_generating-and-characterizing-scenarios-safety-testing-autonomous-vehicles)

Zahra Ghodsi, [Siva Hari](/person/siva-hari), [Iuri Frosio](/person/iuri-frosio), Timothy Tsai, Alejandro Troccoli, [Steve Keckler](/person/stephen-keckler), Siddharth Garg, Anima Anandkumar



[IEEE Automotive Reliability and Test Workshop](http://cas.polito.it/ART2020/)









[HarDNN: Fine-Grained Vulnerability Evaluation and Protection for Convolutional Neural Networks](/publication/2020-09_hardnn-fine-grained-vulnerability-evaluation-and-protection-convolutional)

Abdulrahman Mahmoud, [Siva Hari](/person/siva-hari), Christopher W. Fletcher, Sarita V. Adve, Charbel Sakr, Naresh Shanbhag, [Pavlo Molchanov](/person/pavlo-molchanov), [Michael B. Sullivan](/person/mike-sullivan), Timothy Tsai, [Steve Keckler](/person/stephen-keckler)



[SRC TECHCON](https://src.secure-platform.com/a/page/techcon)









[Making Convolutions Resilient via Algorithm-Based Error Detection Techniques](/publication/2020-06_making-convolutions-resilient-algorithm-based-error-detection-techniques)

[Siva Hari](/person/siva-hari), [Michael B. Sullivan](/person/mike-sullivan), Timothy Tsai, [Steve Keckler](/person/stephen-keckler)



[arXiv](https://arxiv.org/abs/2006.04984)









[Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs](/publication/2020-06_buddy-compression-enabling-larger-memory-deep-learning-and-hpc-workloads-gpus)

Esha Choukse, [Michael B. Sullivan](/person/mike-sullivan), [Mike O'Connor](/person/mike-o-connor), Mattan Erez, Jeff Pool, [David Nellans](/person/david-nellans), [Steve Keckler](/person/stephen-keckler)



[International Symposium on Computer Architecture (ISCA)](https://ieeexplore.ieee.org/document/9138915)









[Estimating Silent Data Corruption Rates Using a Two-Level Model](/publication/2020-04_estimating-silent-data-corruption-rates-using-two-level-model)

[Siva Hari](/person/siva-hari), Paolo Rech, Timothy Tsai, [Mark Stephenson](/person/mark-stephenson), Arslan Zulfiqar, [Michael B. Sullivan](/person/mike-sullivan), Philip Shirvani, Paul Racunas, [Joel Emer](/person/joel-emer), [Steve Keckler](/person/stephen-keckler)



[arXiv](https://arxiv.org/abs/2005.01445)









[Feature Map Vulnerability Evaluation in CNNs](/publication/2020-03_feature-map-vulnerability-evaluation-cnns)

Abdulrahman Mahmoud, [Siva Hari](/person/siva-hari), Christopher W. Fletcher, Sarita V. Adve, Charbel Sakr, Naresh Shanbhag, [Pavlo Molchanov](/person/pavlo-molchanov), [Michael B. Sullivan](/person/mike-sullivan), Timothy Tsai, [Steve Keckler](/person/stephen-keckler)



[Workshop on Secure and Resilient Autonomy](http://sara-workshop.org/)









[HarDNN: Feature Map Vulnerability Evaluation in CNNs](/publication/2020-02_hardnn-feature-map-vulnerability-evaluation-cnns)

Abdulrahman Mahmoud, [Siva Hari](/person/siva-hari), Christopher W. Fletcher, Sarita V. Adve, Charbel Sakr, Naresh Shanbhag, [Pavlo Molchanov](/person/pavlo-molchanov), [Michael B. Sullivan](/person/mike-sullivan), Timothy Tsai, [Steve Keckler](/person/stephen-keckler)



[arXiv](https://arxiv.org/abs/2002.09786)









[Speculative Reconvergence for Improved SIMT Efficiency](/index.php/publication/2020-02_speculative-reconvergence-improved-simt-efficiency)

Sana Damani, Daniel Johnson, [Mark Stephenson](/index.php/person/mark-stephenson), Eddie Yan, Olivier Giroux, Michael McKeown, [Steve Keckler](/index.php/person/stephen-keckler)



[International Symposium on Code Generation and Optimization](https://dl.acm.org/doi/10.1145/3368826.3377911)









[A 0.32–128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Inference Accelerator With Ground-Referenced Signaling in 16 nm](/publication/2020-01_032-128-tops-scalable-multi-chip-module-based-deep-neural-network-inference)

[Brian Zimmer](/person/brian-zimmer), [Rangharajan Venkatesan](/person/rangharajan-venkatesan), Yakun Sophia Shao, [Jason Clemons](/person/jason-clemons), [Matt Fojtik](/person/matt-fojtik), [Ted Jiang](/person/ted-jiang), [Ben Keller](/person/ben-keller), Alicia Klinefelter, [Nathaniel Pinckney](/person/nathaniel-pinckney), Priyanka Raina, [Stephen Tell](/person/stephen-tell), [Yanqing Zhang](/person/yanqing-zhang), [William Dally](/person/william-dally), [Joel Emer](/person/joel-emer), [Tom Gray](/person/tom-gray), [Steve Keckler](/person/stephen-keckler), [Brucek Khailany](/person/brucek-khailany)



[IEEE Journal of Solid-State Circuits (JSSC)](https://ieeexplore.ieee.org/document/8959403)



JSSC 2020 Best Paper award





### 2019 

[MAGNet: A Modular Accelerator Generator for Neural Networks](/publication/2019-11_magnet-modular-accelerator-generator-neural-networks)

[Rangharajan Venkatesan](/person/rangharajan-venkatesan), Sophia Shao, Miaorong Wang, [Jason Clemons](/person/jason-clemons), [Steve Dai](/person/steve-dai), [Matt Fojtik](/person/matt-fojtik), [Ben Keller](/person/ben-keller), Alicia Klinefelter, [Nathaniel Pinckney](/person/nathaniel-pinckney), Priyanka Raina, [Yanqing Zhang](/person/yanqing-zhang), [Brian Zimmer](/person/brian-zimmer), [William Dally](/person/william-dally), [Joel Emer](/person/joel-emer), [Steve Keckler](/person/stephen-keckler), [Brucek Khailany](/person/brucek-khailany)



[International Conference On Computer Aided Design (ICCAD)](https://ieeexplore.ieee.org/document/8942127)









[NVBit: A Dynamic Binary Instrumentation Framework for NVIDIA GPUs](/publication/2019-10_nvbit-dynamic-binary-instrumentation-framework-nvidia-gpus)

Oreste Villa, [Mark Stephenson](/person/mark-stephenson), [David Nellans](/person/david-nellans), [Steve Keckler](/person/stephen-keckler)



[International Symposium on Microarchitecture (MICRO)](https://doi.org/10.1145/3352460.3358307)









[Simba: Scaling Deep-Learning Inference with Multi-Chip-Module-Based Architecture](/publication/2019-10_simba-scaling-deep-learning-inference-multi-chip-module-based-architecture)

Sophia Shao, [Jason Clemons](/person/jason-clemons), [Rangharajan Venkatesan](/person/rangharajan-venkatesan), [Brian Zimmer](/person/brian-zimmer), [Matt Fojtik](/person/matt-fojtik), [Ted Jiang](/person/ted-jiang), [Ben Keller](/person/ben-keller), Alicia Klinefelter, [Nathaniel Pinckney](/person/nathaniel-pinckney), Priyanka Raina, [Stephen Tell](/person/stephen-tell), [Yanqing Zhang](/person/yanqing-zhang), [William Dally](/person/william-dally), [Joel Emer](/person/joel-emer), [Tom Gray](/person/tom-gray), [Brucek Khailany](/person/brucek-khailany), [Steve Keckler](/person/stephen-keckler)



[International Symposium on Microarchitecture (MICRO)](https://dl.acm.org/doi/10.1145/3352460.3358302)



Best Paper award, IEEE Micro Top Picks in Computer Architecture (Honorable Mention)





[A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Accelerator Designed with a High-Productivity VLSI Methodology](/publication/2019-08_011-pjop-032-128-tops-scalable-multi-chip-module-based-deep-neural-network)

[Rangharajan Venkatesan](/person/rangharajan-venkatesan), Sophia Shao, [Brian Zimmer](/person/brian-zimmer), [Jason Clemons](/person/jason-clemons), [Matt Fojtik](/person/matt-fojtik), [Ted Jiang](/person/ted-jiang), [Ben Keller](/person/ben-keller), Alicia Klinefelter, [Nathaniel Pinckney](/person/nathaniel-pinckney), Priyanka Raina, [Stephen Tell](/person/stephen-tell), [Yanqing Zhang](/person/yanqing-zhang), [William Dally](/person/william-dally), [Joel Emer](/person/joel-emer), [Tom Gray](/person/tom-gray), [Steve Keckler](/person/stephen-keckler), [Brucek Khailany](/person/brucek-khailany)



[Hot Chips: A Symposium on High Performance Chips](http://www.hotchips.org/)









[Kayotee: A Fault Injection-based System to Assess the Safety and Reliability of Autonomous Vehicles to Faults and Errors](/publication/2019-07_kayotee-fault-injection-based-system-assess-safety-and-reliability-autonomous)

Saurabh Jha, Timothy Tsai, [Siva Hari](/person/siva-hari), [Michael B. Sullivan](/person/mike-sullivan), Zbigniew Kalbarczyk, [Steve Keckler](/person/stephen-keckler), Ravishankar K. Iyer



[arXiv](https://arxiv.org/abs/1907.01024)









[ML-based Fault Injection for Autonomous Vehicles: A Case for Bayesian Fault Injection](/publication/2019-07_ml-based-fault-injection-autonomous-vehicles-case-bayesian-fault-injection)

Saurabh Jha, Subho S. Banerjee, Timothy Tsai, [Siva Hari](/person/siva-hari), [Michael B. Sullivan](/person/mike-sullivan), Zbigniew T. Kalbarczyk, [Steve Keckler](/person/stephen-keckler), Ravishankar K. Iyer



[arXiv](https://arxiv.org/abs/1907.01051)









[GPU Snapshot: Checkpoint Offloading for GPU-Dense Systems](/publication/2019-06_gpu-snapshot-checkpoint-offloading-gpu-dense-systems)

Kyushick Lee, [Michael B. Sullivan](/person/mike-sullivan), [Siva Hari](/person/siva-hari), Timothy Tsai, [Steve Keckler](/person/stephen-keckler), Mattan Erez



[International Conference on Supercomputing](https://dl.acm.org/doi/10.1145/3330345.3330361)









[ML-based Fault Injection for Autonomous Vehicles: A Case for Bayesian Fault Injection](/publication/2019-06_ml-based-fault-injection-autonomous-vehicles-case-bayesian-fault-injection)

Saurabh Jha, Subho Banerjee, Timothy Tsai, [Siva Hari](/person/siva-hari), [Michael B. Sullivan](/person/mike-sullivan), Zbigniew T. Kalbarczyk, [Steve Keckler](/person/stephen-keckler), Ravishankar K. Iyer



[International Conference on Dependable Systems and Networks (DSN)](https://ieeexplore.ieee.org/abstract/document/8809495)









[On the Trend of Resilience for GPU-Dense Systems](/publication/2019-06_trend-resilience-gpu-dense-systems)

Kyushick Lee, [Michael B. Sullivan](/person/mike-sullivan), [Siva Hari](/person/siva-hari), Timothy Tsai, [Steve Keckler](/person/stephen-keckler), Mattan Erez



[International Conference on Dependable Systems and Networks, Supplemental (DSN-…](https://ieeexplore.ieee.org/document/8805794)



Best of SELSE (Workshop on Silicon Errors in Logic - System Effects)





[A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Accelerator with Ground-Reference Signaling in 16nm](/publication/2019-06_011-pjop-032-128-tops-scalable-multi-chip-module-based-deep-neural-network)

[Brian Zimmer](/person/brian-zimmer), [Rangharajan Venkatesan](/person/rangharajan-venkatesan), Sophia Shao, [Jason Clemons](/person/jason-clemons), [Matt Fojtik](/person/matt-fojtik), [Ted Jiang](/person/ted-jiang), [Ben Keller](/person/ben-keller), Alicia Klinefelter, [Nathaniel Pinckney](/person/nathaniel-pinckney), Priyanka Raina, [Stephen Tell](/person/stephen-tell), [Yanqing Zhang](/person/yanqing-zhang), [William Dally](/person/william-dally), [Joel Emer](/person/joel-emer), [Tom Gray](/person/tom-gray), [Steve Keckler](/person/stephen-keckler), [Brucek Khailany](/person/brucek-khailany)



[Symposium on VLSI Circuits](https://ieeexplore.ieee.org/document/8778056)









[SNAP: A 1.67 – 21.55 TOPS/W Sparse Neural Acceleration Processor for Unstructured Sparse Deep Neural Network Inference in 16nm CMOS](/publication/2019-06_snap-167-2155-topsw-sparse-neural-acceleration-processor-unstructured-sparse)

Jie-Fang Zhang, Ching-En Lee, Chester Liu, Yakun Sophia Shao, [Steve Keckler](/person/stephen-keckler), Zhengya Zhang



[Symposia on VLSI Technology and Circuits (VLSI)](https://ieeexplore.ieee.org/document/8778193)









[Buffets: An Efficient and Composable Storage Idiom for Explicit Decoupled Data Orchestration](/publication/2019-04_buffets-efficient-and-composable-storage-idiom-explicit-decoupled-data)

[Michael Pellauer](/person/michael-pellauer), Yakun Sophia Shao, [Jason Clemons](/person/jason-clemons), [Neal Crago](/person/neal-crago), Kartik Hegde, [Rangharajan Venkatesan](/person/rangharajan-venkatesan), [Steve Keckler](/person/stephen-keckler), Christopher W. Fletcher, [Joel Emer](/person/joel-emer)



[International Conference on Architectural Support for Programming Languages and…](https://dl.acm.org/doi/10.1145/3297858.3304025)



IEEE Micro Top Picks in Computer Architecture (Honorable Mention)





[On the Trend of Resilience for GPU-Dense Systems](/publication/2019-03_trend-resilience-gpu-dense-systems)

Kyushick Lee, [Michael B. Sullivan](/person/mike-sullivan), [Siva Hari](/person/siva-hari), Timothy Tsai, [Steve Keckler](/person/stephen-keckler), Mattan Erez



[IEEE Workshop on Silicon Errors in Logic – System Effects (SELSE)](https://selse.org/2019-archive/)



Award paper





[Timeloop: A Systematic Approach to DNN Accelerator Evaluation](/publication/2019-03_timeloop-systematic-approach-dnn-accelerator-evaluation)

[Angshuman Parashar](/person/angshuman-parashar), Priyanka Raina, Yakun Sophia Shao, Yu-Hsin Chen, Victor A. Ying, Anurag Mukkara, [Rangharajan Venkatesan](/person/rangharajan-venkatesan), [Brucek Khailany](/person/brucek-khailany), [Steve Keckler](/person/stephen-keckler), [Joel Emer](/person/joel-emer)



[International Symposium on Performance Analysis of Systems and Software (ISPASS)](https://ieeexplore.ieee.org/document/8695666)









### 2018 

[Optimizing Software-Directed Instruction Replication for GPU Error Detection](/publication/2018-11_optimizing-software-directed-instruction-replication-gpu-error-detection)

Abdulrahman Mahmoud, [Siva Hari](/person/siva-hari), [Michael B. Sullivan](/person/mike-sullivan), Timothy Tsai, [Steve Keckler](/person/stephen-keckler)



[International Conference for High-Performance Computing, Networking, Storage a…](https://dl.acm.org/doi/10.5555/3291656.3291746)









[Kayotee: A Fault Injection-based System to Assess the Safety and Reliability of Autonomous Vehicles to Faults and Errors](/publication/2018-11_kayotee-fault-injection-based-system-assess-safety-and-reliability-autonomous)

Saurabh Jha, Timothy Tsai, [Siva Hari](/person/siva-hari), [Michael B. Sullivan](/person/mike-sullivan), Zbigniew Kalbarczyk, [Steve Keckler](/person/stephen-keckler), Ravishankar K. Iyer



[Third IEEE International Workshop on Automotive Reliability & Test](http://www.lirmm.fr/art18/)









[SwapCodes: Error Codes for Hardware-Software Cooperative GPU Pipeline Error Detection](/publication/2018-10_swapcodes-error-codes-hardware-software-cooperative-gpu-pipeline-error)

[Michael B. Sullivan](/person/mike-sullivan), [Siva Hari](/person/siva-hari), [Brian Zimmer](/person/brian-zimmer), Timothy Tsai, [Stephen W. Keckler](/person/stephen-keckler)



[The International Symposium on Microarchitecture (MICRO)](https://ieeexplore.ieee.org/document/8574584)









[Exposing Memory Access Patterns to Improve Instruction and Memory Efficiency in GPUs](/publication/2018-10_exposing-memory-access-patterns-improve-instruction-and-memory-efficiency-gpus)

[Neal Crago](/person/neal-crago), [Mark Stephenson](/person/mark-stephenson), [Steve Keckler](/person/stephen-keckler)



[ACM Transactions on Architecture and Code Optimization (TACO)](https://doi.org/10.1145/3280851)









[Software-Directed Techniques for Improved GPU Register File Utilization](/publication/2018-09_software-directed-techniques-improved-gpu-register-file-utilization)

Dani Voitsechov, Arslan Zulfiqar, [Mark Stephenson](/person/mark-stephenson), Mark Gebhart, [Steve Keckler](/person/stephen-keckler)



[ACM Transactions on Architecture and Code Optimization (TACO)](https://dl.acm.org/doi/10.1145/3243905)









[Structurally Sparsified Backward Propagation for Faster Long Short-Term Memory Training](/publication/2018-06_structurally-sparsified-backward-propagation-faster-long-short-term-memory)

Maohua Zhu, [Jason Clemons](/person/jason-clemons), Jeff Pool, Minsoo Rhu, [Steve Keckler](/person/stephen-keckler), Yuan Xie



[arXiv](https://arxiv.org/abs/1806.00512)









[Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks](/publication/2018-02_compressing-dma-engine-leveraging-activation-sparsity-training-deep-neural)

Minsoo Rhu, [Mike O'Connor](/person/mike-o-connor), [Niladrish Chatterjee](/person/niladrish-chatterjee), Jeff Pool, Youngeun Kwon, [Steve Keckler](/person/stephen-keckler)



[International Symposium on High Performance Computer Architecture (HPCA)](https://ieeexplore.ieee.org/document/8327000)









[Stitch-X: An Accelerator Architecture for Exploiting Unstructured Sparsity in Deep Neural Networks](/publication/2018-02_stitch-x-accelerator-architecture-exploiting-unstructured-sparsity-deep-neural)

Ching-En Lee, Yakun Sophia Shao, Jie-Fang Zhang, [Angshuman Parashar](/person/angshuman-parashar), [Joel Emer](/person/joel-emer), [Steve Keckler](/person/stephen-keckler), Zhengya Zhang



[SysML Conference](https://mlsys.org/Conferences/2018/index.html#posters)









### 2017 

[Understanding Error Propagation in Deep Learning Neural Network (DNN) Accelerators and Applications](/publication/2017-11_understanding-error-propagation-deep-learning-neural-network-dnn-accelerators)

Guanpeng Li, [Siva Hari](/person/siva-hari), [Michael B. Sullivan](/person/mike-sullivan), Timothy Tsai, Karthik Pattabiraman, [Joel Emer](/person/joel-emer), [Steve Keckler](/person/stephen-keckler)



[The International Conference for High Performance Computing, Networking, Storag…](https://dl.acm.org/doi/10.1145/3126908.3126964)









[Fine-Grained DRAM: Energy-Efficient DRAM for Extreme Bandwidth Systems](/publication/2017-10_fine-grained-dram-energy-efficient-dram-extreme-bandwidth-systems)

[Mike O'Connor](/person/mike-o-connor), [Niladrish Chatterjee](/person/niladrish-chatterjee), [Donghyuk Lee](/person/donghyuk-lee), [John Wilson](/person/john-wilson), Aditya Agrawal, [Steve Keckler](/person/stephen-keckler), [William Dally](/person/william-dally)



[International Symposium on Microarchitecture (MICRO)](https://dl.acm.org/citation.cfm?id=3124545)









[SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks](/publication/2017-06_scnn-accelerator-compressed-sparse-convolutional-neural-networks)

[Angshuman Parashar](/person/angshuman-parashar), Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, [Rangharajan Venkatesan](/person/rangharajan-venkatesan), [Brucek Khailany](/person/brucek-khailany), [Joel Emer](/person/joel-emer), [Steve Keckler](/person/stephen-keckler), [William Dally](/person/william-dally)



[International Symposium on Computer Architecture (ISCA)](https://dl.acm.org/doi/10.1145/3079856.3080254)









[SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks](/publication/2017-05_scnn-accelerator-compressed-sparse-convolutional-neural-networks)

[Angshuman Parashar](/person/angshuman-parashar), Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, [Rangharajan Venkatesan](/person/rangharajan-venkatesan), [Brucek Khailany](/person/brucek-khailany), [Joel Emer](/person/joel-emer), [Steve Keckler](/person/stephen-keckler), [William Dally](/person/william-dally)



[arXiv](https://arxiv.org/abs/1708.04485)









[Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks](/publication/2017-05_compressing-dma-engine-leveraging-activation-sparsity-training-deep-neural)

Minsoo Rhu, [Mike O'Connor](/person/mike-o-connor), [Niladrish Chatterjee](/person/niladrish-chatterjee), Jeff Pool, [Steve Keckler](/person/stephen-keckler)



[arXiv](https://arxiv.org/abs/1705.01626)









[SASSIFI: An Architecture-level Fault Injection Tool for GPU Application Resilience Evaluation](/publication/2017-04_sassifi-architecture-level-fault-injection-tool-gpu-application-resilience)

[Siva Hari](/person/siva-hari), Timothy Tsai, [Mark Stephenson](/person/mark-stephenson), [Steve Keckler](/person/stephen-keckler), [Joel Emer](/person/joel-emer)



[International Symposium on Performance Analysis of Systems and Software (ISPASS)](https://ieeexplore.ieee.org/document/7975296)









[Architecting an Energy-Efficient DRAM System for GPUs](/publication/2017-02_architecting-energy-efficient-dram-system-gpus)

[Niladrish Chatterjee](/person/niladrish-chatterjee), [Mike O'Connor](/person/mike-o-connor), [Donghyuk Lee](/person/donghyuk-lee), Daniel Johnson, Minsoo Rhu, [Steve Keckler](/person/stephen-keckler), [William Dally](/person/william-dally)



[International Symposium on High Performance Computer Architecture (HPCA)](http://ieeexplore.ieee.org/document/7920815/)









### 2016 

[vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design](/publication/2016-10_vdnn-virtualized-deep-neural-networks-scalable-memory-efficient-neural-network)

Minsoo Rhu, Natalia Gimelshein, [Jason Clemons](/person/jason-clemons), Arslan Zulfiqar, [Steve Keckler](/person/stephen-keckler)



[International Symposium on Microarchitecture (MICRO)](https://dl.acm.org/doi/10.5555/3195638.3195660)









[A Patch Memory System For Image Processing and Computer Vision](/publication/2016-10_patch-memory-system-image-processing-and-computer-vision)

[Jason Clemons](/person/jason-clemons), Chih-Chi Cheng, [Iuri Frosio](/person/iuri-frosio), Daniel Johnson, [Steve Keckler](/person/stephen-keckler)



[International Symposium on Microarchitecture (MICRO)](https://ieeexplore.ieee.org/document/7783754)









[CLARA: Circular Linked-List Auto- and Self-Refresh Architecture](/publication/2016-10_clara-circular-linked-list-auto-and-self-refresh-architecture)

Aditya Agrawal, [Mike O'Connor](/person/mike-o-connor), Evgeny Bolotin, [Niladrish Chatterjee](/person/niladrish-chatterjee), [Joel Emer](/person/joel-emer), [Steve Keckler](/person/stephen-keckler)



[International Symposium on Memory Systems (MEMSYS'16)](https://dl.acm.org/doi/10.1145/2989081.2989084)









[Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems](/publication/2016-06_transparent-offloading-and-mapping-tom-enabling-programmer-transparent-near)

Kevin Hsieh, Eiman Ebrahimi, Gwangsun Kim, [Niladrish Chatterjee](/person/niladrish-chatterjee), [Mike O'Connor](/person/mike-o-connor), Nandita Vijaykumar, Onur Mutlu, [Steve Keckler](/person/stephen-keckler)



[International Symposium on Computer Architecture (ISCA)](http://ieeexplore.ieee.org/document/7551394/)









[A Real-time Energy-Efficient Superpixel Hardware Accelerator for Mobile Computer Vision Applications](/index.php/publication/2016-06_real-time-energy-efficient-superpixel-hardware-accelerator-mobile-computer)

Injoon Hong, [Jason Clemons](/index.php/person/jason-clemons), [Rangharajan Venkatesan](/index.php/person/rangharajan-venkatesan), [Iuri Frosio](/index.php/person/iuri-frosio), [Brucek Khailany](/index.php/person/brucek-khailany), [Steve Keckler](/index.php/person/stephen-keckler)



[Design Automation Conference (DAC)](http://dl.acm.org/citation.cfm?id=2897974)









[Towards High Performance Paged Memory for GPUs](/publication/2016-03_towards-high-performance-paged-memory-gpus)

Tianhao Zheng, [David Nellans](/person/david-nellans), Arslan Zulfiqar, [Mark Stephenson](/person/mark-stephenson), [Steve Keckler](/person/stephen-keckler)



[International Symposium on High Performance Computer Architecture (HPCA)](https://ieeexplore.ieee.org/document/7446077)









[A Case for Toggle-Aware Compression for GPU Systems](/publication/2016-03_case-toggle-aware-compression-gpu-systems)

Gennady Pekhimenko, Evgeny Bolotin, Nandita Vijaykumar, Onur Mutlu, Todd C. Mowry, [Steve Keckler](/person/stephen-keckler)



[International Symposium on High Performance Computer Architecture (HPCA)](http://ieeexplore.ieee.org/document/7446064/)









[Selective GPU Caches to Eliminate CPU-GPU HW Cache Coherence](/publication/2016-03_selective-gpu-caches-eliminate-cpu-gpu-hw-cache-coherence)

Neha Agarwal, [David Nellans](/person/david-nellans), Eiman Ebrahimi, Thomas F. Wenisch, John Danskin, [Steve Keckler](/person/stephen-keckler)



[International Symposium on High Performance Computer Architecture (HPCA)](https://ieeexplore.ieee.org/document/7446089)









[An Analytical Model for Hardened Latch Selection and Exploration](/publication/2016-03_analytical-model-hardened-latch-selection-and-exploration)

[Michael B. Sullivan](/person/mike-sullivan), [Brian Zimmer](/person/brian-zimmer), [Siva Hari](/person/siva-hari), Timothy Tsai, [Steve Keckler](/person/stephen-keckler)



[Workshop on Silicon Errors in Logic - System Effects (SELSE)](http://www.selse.org/)









[vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design](/publication/2016-02_vdnn-virtualized-deep-neural-networks-scalable-memory-efficient-neural-network)

Minsoo Rhu, Natalia Gimelshein, [Jason Clemons](/person/jason-clemons), Arslan Zulfiqar, [Steve Keckler](/person/stephen-keckler)



[arXiv](https://arxiv.org/abs/1602.08124)









### 2015 

[Anatomy of GPU Memory System for Multi-Application Execution](/publication/2015-10_anatomy-gpu-memory-system-multi-application-execution)

Adwait Jog, Onur Kayiran, Tuba Kesten, Ashutosh Pattnaik, Evgeny Bolotin, [Niladrish Chatterjee](/person/niladrish-chatterjee), [Steve Keckler](/person/stephen-keckler), Mahmut T. Kandemir, Chita R. Das



[International Symposium on Memory Systems (MEMSYS)](http://dl.acm.org/citation.cfm?id=2818979)









[GPU Computing Pipeline Inefficiencies and Optimization Opportunities in Heterogeneous CPU-GPU Processors](/publication/2015-10_gpu-computing-pipeline-inefficiencies-and-optimization-opportunities)

Joel Hestness, [Steve Keckler](/person/stephen-keckler), David A. Wood



[International Symposium on Workload Characterization (IISWC)](https://ieeexplore.ieee.org/document/7314150)









[Designing Efficient Heterogeneous Memory Architectures](/index.php/publication/2015-08_designing-efficient-heterogeneous-memory-architectures)

Evgeny Bolotin, [David Nellans](/index.php/person/david-nellans), Oreste Villa, [Mike O'Connor](/index.php/person/mike-o-connor), Alex Ramirez, [Steve Keckler](/index.php/person/stephen-keckler)



[IEEE Micro](https://ieeexplore.ieee.org/document/7155441)









[Flexible Software Profiling of GPU Architectures](/publication/2015-06_flexible-software-profiling-gpu-architectures)

[Mark Stephenson](/person/mark-stephenson), [Siva Hari](/person/siva-hari), Yunsup Lee, Eiman Ebrahimi, Daniel Johnson, [David Nellans](/person/david-nellans), [Mike O'Connor](/person/mike-o-connor), [Steve Keckler](/person/stephen-keckler)



[International Symposium on Computer Architecture (ISCA)](https://dl.acm.org/doi/10.1145/2749469.2750375)









[A Variable Warp Size Architecture](/publication/2015-06_variable-warp-size-architecture)

Timothy Rogers, Daniel Johnson, [Mike O'Connor](/person/mike-o-connor), [Steve Keckler](/person/stephen-keckler)



[International Symposium on Computer Architecture (ISCA)](https://dl.acm.org/doi/10.1145/2749469.2750410)









[Toggle-aware Compression for GPUs](/publication/2015-05_toggle-aware-compression-gpus)

Gennady Pekhimenko, Evgeny Bolotin, [Mike O'Connor](/person/mike-o-connor), Onur Mutlu, Todd C. Mowry, [Steve Keckler](/person/stephen-keckler)



[IEEE Computer Architecture Letters (Volume 14, Issue 2, July-Dec. 2015)](http://ieeexplore.ieee.org/document/7103282/)









[SASSIFI: Evaluating Resilience of GPU Applications](/publication/2015-03_sassifi-evaluating-resilience-gpu-applications)

[Siva Hari](/person/siva-hari), Timothy Tsai, [Mark Stephenson](/person/mark-stephenson), [Steve Keckler](/person/stephen-keckler), [Joel Emer](/person/joel-emer)



[Workshop on Silicon Errors in Logic - System Effects (SELSE-11)](https://selse.org/previous-workshops/2017-archive-2/2015-program/)









[Page Placement Strategies for GPUs within Heterogeneous Memory Systems](/publication/2015-03_page-placement-strategies-gpus-within-heterogeneous-memory-systems)

Neha Agarwal, [David Nellans](/person/david-nellans), [Mark Stephenson](/person/mark-stephenson), [Mike O'Connor](/person/mike-o-connor), [Steve Keckler](/person/stephen-keckler)



[International Conference on Architectural Support for Programming Languages and…](http://dl.acm.org/citation.cfm?id=2694381)









[Priority-Based Cache Allocation in Throughput Processors](/publication/2015-02_priority-based-cache-allocation-throughput-processors)

Dong Li, Minsoo Rhu, Daniel Johnson, [Mike O'Connor](/person/mike-o-connor), Mattan Erez, Donald Fussell, [Steve Keckler](/person/stephen-keckler)



[International Symposium on High Performance Computer Architecture (HPCA)](http://ieeexplore.ieee.org/document/7056024/)









[Unlocking Bandwidth for GPUs in CC-NUMA systems](/publication/2015-02_unlocking-bandwidth-gpus-cc-numa-systems)

Neha Agarwal, [David Nellans](/person/david-nellans), [Mike O'Connor](/person/mike-o-connor), [Steve Keckler](/person/stephen-keckler), Thomas Wenisch



[International Symposium on High Performance Computer Architecture (HPCA)](http://ieeexplore.ieee.org/document/7056046/)









### 2014 

[Arbitrary Modulus Indexing](/publication/2014-12_arbitrary-modulus-indexing)

Jeffrey R. Diamond, Donald S. Fussell, [Steve Keckler](/person/stephen-keckler)



[International Symposium on Microarchitecture (MICRO)](https://ieeexplore.ieee.org/document/7011384)









[Exploring the Design Space of SPMD Divergence Management on Data-Parallel Architectures](/publication/2014-12_exploring-design-space-spmd-divergence-management-data-parallel-architectures)

Yunsup Lee, Vinod Grover, Ronny Krashinsky, [Mark Stephenson](/person/mark-stephenson), [Steve Keckler](/person/stephen-keckler), Krste Asanovic



[International Symposium on Microarchitecture (MICRO)](https://doi.org/10.1109/MICRO.2014.48)









[Scaling the Power Wall: A Path to Exascale](/publication/2014-11_scaling-power-wall-path-exascale)

Oreste Villa, Daniel Johnson, [Mike O'Connor](/person/mike-o-connor), Evgeny Bolotin, [David Nellans](/person/david-nellans), Justin Luitjens, Nikolai Sakharnykh, Peng Wang, Paulius Micikevicius, Anthony Scudiero, [Steve Keckler](/person/stephen-keckler), [William Dally](/person/william-dally)



[SC '14](http://ieeexplore.ieee.org/abstract/document/7013055/)









[A Comparative Analysis of Microarchitecture Effects on CPU and GPU Memory System Behavior](/publication/2014-10_comparative-analysis-microarchitecture-effects-cpu-and-gpu-memory-system)

Joel Hestness, [Steve Keckler](/person/stephen-keckler), David A. Wood



[International Symposium on Workload Characterization (IISWC)](https://ieeexplore.ieee.org/document/6983054)









[Measuring the Radiation Reliability of SRAM Structures in GPUs Designed for HPC](/publication/2014-04_measuring-radiation-reliability-sram-structures-gpus-designed-hpc)

Paolo Rech, Luigi Carro, Nicholas Wang, Timothy Tsai, [Siva Hari](/person/siva-hari), [Steve Keckler](/person/stephen-keckler)



[Workshop on Silicon Errors in Logic - System Effects (SELSE-10)](https://selse.org)









[Application-aware Memory System for Fair and Efficient Execution of Concurrent GPGPU Applications](/index.php/publication/2014-03_application-aware-memory-system-fair-and-efficient-execution-concurrent-gpgpu)

Adwait Jog, Evgeny Bolotin, Zvika Guz, Mike Parker, [Steve Keckler](/index.php/person/stephen-keckler), Mahmut T. Kandemir, Chita R. Das



[Workshop on General Purpose Processing Using GPUs (GPGPU-7)](http://dl.acm.org/citation.cfm?id=2576780)









### 2013 

[21st Century Digital Design Tools](/publication/2013-05_21st-century-digital-design-tools)

[William Dally](/person/william-dally), Chris Malachowsky, [Steve Keckler](/person/stephen-keckler)



[Design Automation Conference (DAC)](https://ieeexplore.ieee.org/document/6560687)









[Convergence and Scalarization for Data-Parallel Architectures](/publication/2013-02_convergence-and-scalarization-data-parallel-architectures)

Yunsup Lee, Ronny Krashinsky, Vinod Grover, [Steve Keckler](/person/stephen-keckler), Krste Asanovic



[International Symposium on Code Generation and Optimization (CGO)](https://ieeexplore.ieee.org/document/6494995)









### 2012 

[Unifying Primary Cache, Scratch, and Register File Memories in a Throughput Processor](/publication/2012-12_unifying-primary-cache-scratch-and-register-file-memories-throughput-processor)

Mark Gebhart, [Steve Keckler](/person/stephen-keckler), [Brucek Khailany](/person/brucek-khailany), Ronny Krashinsky, [William Dally](/person/william-dally)



[International Symposium on Microarchitecture (MICRO)](http://dl.acm.org/citation.cfm?id=2457489)









[A Hierarchical Thread Scheduler and Register File for Energy-Efficient Throughput Processors](/publication/2012-04_hierarchical-thread-scheduler-and-register-file-energy-efficient-throughput)

Mark Gebhart, Daniel R. Johnson, David Tarjan, [Steve Keckler](/person/stephen-keckler), [William Dally](/person/william-dally), Erik Lindholm, Kevin Skadron



[ACM Transactions on Computer Systems (TOCS)](http://dl.acm.org/citation.cfm?id=2166882)









### 2011 

[A Compile-Time Managed Multi-Level Register File Hierarchy](/publication/2011-12_compile-time-managed-multi-level-register-file-hierarchy)

Mark Gebhart, [Steve Keckler](/person/stephen-keckler), [William Dally](/person/william-dally)



[International Symposium on Microarchitecture (MICRO)](https://ieeexplore.ieee.org/document/7851495)









[GPUs and the Future of Parallel Computing](/publication/2011-09_gpus-and-future-parallel-computing)

[Steve Keckler](/person/stephen-keckler), [William Dally](/person/william-dally), [Brucek Khailany](/person/brucek-khailany), [Michael Garland](/person/michael-garland), David Glasco



[IEEE Micro](http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6045685&tag=1)









[Energy-efficient Mechanisms for Managing Thread Context in Throughput Processors](/publication/2011-06_energy-efficient-mechanisms-managing-thread-context-throughput-processors)

Mark Gebhart, Daniel R. Johnson, David Tarjan, [Steve Keckler](/person/stephen-keckler), [William Dally](/person/william-dally), Erik Lindholm, Kevin Skadron



[International Symposium on Computer Architecture (ISCA)](https://dl.acm.org/doi/10.1145/2000064.2000093)