  Joel Emer  

  ![](/sites/default/files/person/joel-emer.jpg)

  

Dr. Joel S. Emer joined NVIDIA in 2014 and is a member of the Architecture Research group. He is also a Professor of the Practice at MIT. He is responsible for exploring future architectures as well as modeling and analysis methodologies. Prior to joining NVIDIA, he worked at Intel, where he was an Intel Fellow and Director of Microarchitecture Research. Previously, he worked at Compaq and Digital Equipment Corporation.

Emer has held various research and advanced development positions investigating processor microarchitecture and developing performance modeling and evaluation techniques. He has made architectural contributions to a number of VAX, Alpha and x86 processors and is recognized as one of the developers of the widely employed quantitative approach to processor performance evaluation. More recently, he has been recognized for his contributions to the advancement of simultaneous multithreading technology, processor reliability analysis, cache organization and spatial architectures.

Emer received a bachelor's degree with highest honors in electrical engineering in 1974 and a master's degree in 1975, both from Purdue University. He earned a doctorate in electrical engineering from the University of Illinois in 1979. Emer has received numerous public recognitions, including being named a Fellow of both the ACM and IEEE, and he was the 2009 recipient of the Eckert-Mauchly Award for lifetime contributions to computer architecture.

 ### Publications

 

### 2022 

[Sparseloop: An Analytical Approach to Sparse Tensor Accelerator Modeling](/publication/2022-10_sparseloop-analytical-approach-sparse-tensor-accelerator-modeling)

Yannan Nellie Wu, [Po-An Tsai](/person/po-an-tsai), [Angshuman Parashar](/person/angshuman-parashar), Vivienne Sze, [Joel Emer](/person/joel-emer)



[International Symposium on Microarchitecture (MICRO)](https://ieeexplore.ieee.org/document/9923807)



Distinguished Artifact award





[Ruby: Improving Hardware Efficiency for Tensor Algebra Accelerators Through Imperfect Factorization](/publication/2022-06_ruby-improving-hardware-efficiency-tensor-algebra-accelerators-through)

Mark Horeni, Pooria Taheri, [Po-An Tsai](/person/po-an-tsai), [Angshuman Parashar](/person/angshuman-parashar), [Joel Emer](/person/joel-emer), Siddharth Joshi



[International Symposium on Performance Analysis of Systems and Software (ISPASS)](https://ieeexplore.ieee.org/document/9804679)









[DAGguise: Mitigating Memory Timing Side Channels](/publication/2022-02_dagguise-mitigating-memory-timing-side-channels)

Peter W. Deutsch, Yuheng Yang, Thomas Bourgeat, Jules Drean, [Joel Emer](/person/joel-emer), Mengjia Yan



[International Conference on Architectural Support for Programming Languages and…](https://dl.acm.org/doi/10.1145/3503222.3507747)









### 2021 

[SpZip: Architectural Support for Effective Data Compression in Irregular Applications](/publication/2021-06_spzip-architectural-support-effective-data-compression-irregular-applications)

Yifan Yang, [Joel Emer](/person/joel-emer), Daniel Sanchez



[International Symposium on Computer Architecture (ISCA)](https://ieeexplore.ieee.org/document/9499902)









[Simba: scaling deep-learning inference with chiplet-based architecture](/publication/2021-05_simba-scaling-deep-learning-inference-chiplet-based-architecture)

Yakun Sophia Shao, [Jason Clemons](/person/jason-clemons), [Rangharajan Venkatesan](/person/rangharajan-venkatesan), [Brian Zimmer](/person/brian-zimmer), [Matt Fojtik](/person/matt-fojtik), [Ted Jiang](/person/ted-jiang), [Ben Keller](/person/ben-keller), Alicia Klinefelter, [Nathaniel Pinckney](/person/nathaniel-pinckney), Priyanka Raina, [Stephen Tell](/person/stephen-tell), [Yanqing Zhang](/person/yanqing-zhang), [William Dally](/person/william-dally), [Joel Emer](/person/joel-emer), [Tom Gray](/person/tom-gray), [Brucek Khailany](/person/brucek-khailany), [Steve Keckler](/person/stephen-keckler)



[Communications of the ACM](https://dl.acm.org/doi/10.1145/3460227)



ACM Research Highlight





[Sparseloop: An Analytical, Energy-Focused Design Space Exploration Methodology for Sparse Tensor Accelerators](/publication/2021-04_sparseloop-analytical-energy-focused-design-space-exploration-methodology)

Yannan Nellie Wu, [Po-An Tsai](/person/po-an-tsai), [Angshuman Parashar](/person/angshuman-parashar), Vivienne Sze, [Joel Emer](/person/joel-emer)



[International Symposium on Performance Analysis of Systems and Software (ISPASS)](https://ieeexplore.ieee.org/document/9408213)









[GAMMA: Exploiting Gustavson’s Algorithm to Accelerate Sparse Matrix Multiplication](/publication/2021-04_gamma-exploiting-gustavson-s-algorithm-accelerate-sparse-matrix-multiplication)

Guowei Zhang, Nithya Attaluri, [Joel Emer](/person/joel-emer), Daniel Sanchez



[International Conference on Architectural Support for Programming Languages and…](https://dl.acm.org/doi/10.1145/3445814.3446702)









### 2020 

[CaSA: End-to-end Quantitative Security Analysis of Randomly Mapped Caches](/publication/2020-10_casa-end-end-quantitative-security-analysis-randomly-mapped-caches)

Thomas Bourgeat, Jules Drean, Yuheng Yang, Lillian Tsai, [Joel Emer](/person/joel-emer), Mengjia Yan



[International Symposium on Microarchitecture (MICRO)](https://ieeexplore.ieee.org/document/9251961)









[How to Evaluate Deep Neural Network Processors: TOPS/W (Alone) Considered Harmful](/publication/2020-08_how-evaluate-deep-neural-network-processors-topsw-alone-considered-harmful)

Vivienne Sze, Yu-Hsin Chen, Tien-Ju Yang, [Joel Emer](/person/joel-emer)



[IEEE Solid-State Circuits Magazine](https://ieeexplore.ieee.org/document/9177369)









[There’s Plenty of Room at the Top: What Will Drive Computer Performance after Moore’s Law?](/index.php/publication/2020-06_there-s-plenty-room-top-what-will-drive-computer-performance-after-moore-s-law)

Charles E. Leiserson, Neil C. Thompson, [Joel Emer](/index.php/person/joel-emer), Bradley C. Kuszmaul, Butler W. Lampson, Daniel Sanchez, Tao B. Schardl



[Science](https://www.science.org/doi/10.1126/science.aam9744)









[Estimating Silent Data Corruption Rates Using a Two-Level Model](/publication/2020-04_estimating-silent-data-corruption-rates-using-two-level-model)

[Siva Hari](/person/siva-hari), Paolo Rech, Timothy Tsai, [Mark Stephenson](/person/mark-stephenson), Arslan Zulfiqar, [Michael B. Sullivan](/person/mike-sullivan), Philip Shirvani, Paul Racunas, [Joel Emer](/person/joel-emer), [Steve Keckler](/person/stephen-keckler)



[arXiv](https://arxiv.org/abs/2005.01445)









[A 0.32–128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Inference Accelerator With Ground-Referenced Signaling in 16 nm](/publication/2020-01_032-128-tops-scalable-multi-chip-module-based-deep-neural-network-inference)

[Brian Zimmer](/person/brian-zimmer), [Rangharajan Venkatesan](/person/rangharajan-venkatesan), Yakun Sophia Shao, [Jason Clemons](/person/jason-clemons), [Matt Fojtik](/person/matt-fojtik), [Ted Jiang](/person/ted-jiang), [Ben Keller](/person/ben-keller), Alicia Klinefelter, [Nathaniel Pinckney](/person/nathaniel-pinckney), Priyanka Raina, [Stephen Tell](/person/stephen-tell), [Yanqing Zhang](/person/yanqing-zhang), [William Dally](/person/william-dally), [Joel Emer](/person/joel-emer), [Tom Gray](/person/tom-gray), [Steve Keckler](/person/stephen-keckler), [Brucek Khailany](/person/brucek-khailany)



[IEEE Journal of Solid-State Circuits (JSSC)](https://ieeexplore.ieee.org/document/8959403)



JSSC 2020 Best Paper award





### 2019 

[MAGNet: A Modular Accelerator Generator for Neural Networks](/publication/2019-11_magnet-modular-accelerator-generator-neural-networks)

[Rangharajan Venkatesan](/person/rangharajan-venkatesan), Sophia Shao, Miaorong Wang, [Jason Clemons](/person/jason-clemons), [Steve Dai](/person/steve-dai), [Matt Fojtik](/person/matt-fojtik), [Ben Keller](/person/ben-keller), Alicia Klinefelter, [Nathaniel Pinckney](/person/nathaniel-pinckney), Priyanka Raina, [Yanqing Zhang](/person/yanqing-zhang), [Brian Zimmer](/person/brian-zimmer), [William Dally](/person/william-dally), [Joel Emer](/person/joel-emer), [Steve Keckler](/person/stephen-keckler), [Brucek Khailany](/person/brucek-khailany)



[International Conference on Computer Aided Design (ICCAD)](https://ieeexplore.ieee.org/document/8942127)









[Accelergy: An Architecture-Level Energy Estimation Methodology for Accelerator Designs](/publication/2019-11_accelergy-architecture-level-energy-estimation-methodology-accelerator-designs)

Yannan Nellie Wu, [Joel Emer](/person/joel-emer), Vivienne Sze



[International Conference on Computer Aided Design (ICCAD)](https://ieeexplore.ieee.org/document/8942149)









[Simba: Scaling Deep-Learning Inference with Multi-Chip-Module-Based Architecture](/publication/2019-10_simba-scaling-deep-learning-inference-multi-chip-module-based-architecture)

Sophia Shao, [Jason Clemons](/person/jason-clemons), [Rangharajan Venkatesan](/person/rangharajan-venkatesan), [Brian Zimmer](/person/brian-zimmer), [Matt Fojtik](/person/matt-fojtik), [Ted Jiang](/person/ted-jiang), [Ben Keller](/person/ben-keller), Alicia Klinefelter, [Nathaniel Pinckney](/person/nathaniel-pinckney), Priyanka Raina, [Stephen Tell](/person/stephen-tell), [Yanqing Zhang](/person/yanqing-zhang), [William Dally](/person/william-dally), [Joel Emer](/person/joel-emer), [Tom Gray](/person/tom-gray), [Brucek Khailany](/person/brucek-khailany), [Steve Keckler](/person/stephen-keckler)



[International Symposium on Microarchitecture (MICRO)](https://dl.acm.org/doi/10.1145/3352460.3358302)



Best Paper award, IEEE Micro Top Picks in Computer Architecture (Honorable Mention)





[ExTensor: An Accelerator for Sparse Tensor Algebra](/index.php/publication/2019-10_extensor-accelerator-sparse-tensor-algebra)

Kartik Hegde, Hadi Asghari-Moghaddam, [Michael Pellauer](/index.php/person/michael-pellauer), [Neal Crago](/index.php/person/neal-crago), [Aamer Jaleel](/index.php/person/aamer-jaleel), Edgar Solomonik, [Joel Emer](/index.php/person/joel-emer), Christopher W. Fletcher



[International Symposium on Microarchitecture (MICRO)](https://dl.acm.org/doi/10.1145/3352460.3358275)



IEEE Micro Top Picks in Computer Architecture (Honorable Mention)





[A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Accelerator Designed with a High-Productivity VLSI Methodology](/publication/2019-08_011-pjop-032-128-tops-scalable-multi-chip-module-based-deep-neural-network)

[Rangharajan Venkatesan](/person/rangharajan-venkatesan), Sophia Shao, [Brian Zimmer](/person/brian-zimmer), [Jason Clemons](/person/jason-clemons), [Matt Fojtik](/person/matt-fojtik), [Ted Jiang](/person/ted-jiang), [Ben Keller](/person/ben-keller), Alicia Klinefelter, [Nathaniel Pinckney](/person/nathaniel-pinckney), Priyanka Raina, [Stephen Tell](/person/stephen-tell), [Yanqing Zhang](/person/yanqing-zhang), [William Dally](/person/william-dally), [Joel Emer](/person/joel-emer), [Tom Gray](/person/tom-gray), [Steve Keckler](/person/stephen-keckler), [Brucek Khailany](/person/brucek-khailany)



[Hot Chips: A Symposium on High Performance Chips](http://www.hotchips.org/)









[A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Accelerator with Ground-Reference Signaling in 16nm](/publication/2019-06_011-pjop-032-128-tops-scalable-multi-chip-module-based-deep-neural-network)

[Brian Zimmer](/person/brian-zimmer), [Rangharajan Venkatesan](/person/rangharajan-venkatesan), Sophia Shao, [Jason Clemons](/person/jason-clemons), [Matt Fojtik](/person/matt-fojtik), [Ted Jiang](/person/ted-jiang), [Ben Keller](/person/ben-keller), Alicia Klinefelter, [Nathaniel Pinckney](/person/nathaniel-pinckney), Priyanka Raina, [Stephen Tell](/person/stephen-tell), [Yanqing Zhang](/person/yanqing-zhang), [William Dally](/person/william-dally), [Joel Emer](/person/joel-emer), [Tom Gray](/person/tom-gray), [Steve Keckler](/person/stephen-keckler), [Brucek Khailany](/person/brucek-khailany)



[Symposium on VLSI Circuits](https://ieeexplore.ieee.org/document/8778056)









[Buffets: An Efficient and Composable Storage Idiom for Explicit Decoupled Data Orchestration](/publication/2019-04_buffets-efficient-and-composable-storage-idiom-explicit-decoupled-data)

[Michael Pellauer](/person/michael-pellauer), Yakun Sophia Shao, [Jason Clemons](/person/jason-clemons), [Neal Crago](/person/neal-crago), Kartik Hegde, [Rangharajan Venkatesan](/person/rangharajan-venkatesan), [Steve Keckler](/person/stephen-keckler), Christopher W. Fletcher, [Joel Emer](/person/joel-emer)



[International Conference on Architectural Support for Programming Languages and…](https://dl.acm.org/doi/10.1145/3297858.3304025)



IEEE Micro Top Picks in Computer Architecture (Honorable Mention)





[Timeloop: A Systematic Approach to DNN Accelerator Evaluation](/publication/2019-03_timeloop-systematic-approach-dnn-accelerator-evaluation)

[Angshuman Parashar](/person/angshuman-parashar), Priyanka Raina, Yakun Sophia Shao, Yu-Hsin Chen, Victor A. Ying, Anurag Mukkara, [Rangharajan Venkatesan](/person/rangharajan-venkatesan), [Brucek Khailany](/person/brucek-khailany), [Steve Keckler](/person/stephen-keckler), [Joel Emer](/person/joel-emer)



[International Symposium on Performance Analysis of Systems and Software (ISPASS)](https://ieeexplore.ieee.org/document/8695666)









### 2018 

[DAWG: A Defense Against Cache Timing Attacks in Speculative Execution Processors](/publication/2018-10_dawg-defense-against-cache-timing-attacks-speculative-execution-processors)

Vladimir Kiriansky, Ilia Lebedev, Saman Amarasinghe, Srinivas Devadas, [Joel Emer](/person/joel-emer)



[International Symposium on Microarchitecture (MICRO)](https://ieeexplore.ieee.org/document/8574600)









[Harmonizing Speculative and Non-Speculative Execution in Architectures for Ordered Parallelism](/publication/2018-10_harmonizing-speculative-and-non-speculative-execution-architectures-ordered)

Mark C. Jeffrey, Victor A. Ying, Suvinay Subramanian, Hyun Ryong Lee, [Joel Emer](/person/joel-emer), Daniel Sanchez



[International Symposium on Microarchitecture (MICRO)](https://ieeexplore.ieee.org/document/8574543)









[A Modular Digital VLSI Flow for High-Productivity SoC Design](/publication/2018-06_modular-digital-vlsi-flow-high-productivity-soc-design)

[Brucek Khailany](/person/brucek-khailany), Evgeni Krimer, [Rangharajan Venkatesan](/person/rangharajan-venkatesan), [Jason Clemons](/person/jason-clemons), [Joel Emer](/person/joel-emer), [Matt Fojtik](/person/matt-fojtik), Alicia Klinefelter, [Michael Pellauer](/person/michael-pellauer), [Nathaniel Pinckney](/person/nathaniel-pinckney), Sophia Shao, Shreesha Srinath, Christopher Torng, Sam (Likun) Xi, [Yanqing Zhang](/person/yanqing-zhang), [Brian Zimmer](/person/brian-zimmer)



[Design Automation Conference (DAC)](https://dl.acm.org/doi/10.1145/3195970.3199846)









[Stitch-X: An Accelerator Architecture for Exploiting Unstructured Sparsity in Deep Neural Networks](/publication/2018-02_stitch-x-accelerator-architecture-exploiting-unstructured-sparsity-deep-neural)

Ching-En Lee, Yakun Sophia Shao, Jie-Fang Zhang, [Angshuman Parashar](/person/angshuman-parashar), [Joel Emer](/person/joel-emer), [Steve Keckler](/person/stephen-keckler), Zhengya Zhang



[SysML Conference](https://mlsys.org/Conferences/2018/index.html#posters)









### 2017 

[Understanding Error Propagation in Deep Learning Neural Network (DNN) Accelerators and Applications](/publication/2017-11_understanding-error-propagation-deep-learning-neural-network-dnn-accelerators)

Guanpeng Li, [Siva Hari](/person/siva-hari), [Michael B. Sullivan](/person/mike-sullivan), Timothy Tsai, Karthik Pattabiraman, [Joel Emer](/person/joel-emer), [Steve Keckler](/person/stephen-keckler)



[The International Conference for High Performance Computing, Networking, Storag…](https://dl.acm.org/doi/10.1145/3126908.3126964)









[SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks](/publication/2017-06_scnn-accelerator-compressed-sparse-convolutional-neural-networks)

[Angshuman Parashar](/person/angshuman-parashar), Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, [Rangharajan Venkatesan](/person/rangharajan-venkatesan), [Brucek Khailany](/person/brucek-khailany), [Joel Emer](/person/joel-emer), [Steve Keckler](/person/stephen-keckler), [William Dally](/person/william-dally)



[International Symposium on Computer Architecture (ISCA)](https://dl.acm.org/doi/10.1145/3079856.3080254)









[Fractal: An Execution Model for Fine-Grain Nested Speculative Parallelism](/publication/2017-06_fractal-execution-model-fine-grain-nested-speculative-parallelism)

Suvinay Subramanian, Mark C. Jeffrey, Maleen Abeydeera, Hyun Ryong Lee, Victor A. Ying, [Joel Emer](/person/joel-emer), Daniel Sanchez



[International Symposium on Computer Architecture (ISCA)](https://ieeexplore.ieee.org/document/8192504)









[SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks](/publication/2017-05_scnn-accelerator-compressed-sparse-convolutional-neural-networks)

[Angshuman Parashar](/person/angshuman-parashar), Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, [Rangharajan Venkatesan](/person/rangharajan-venkatesan), [Brucek Khailany](/person/brucek-khailany), [Joel Emer](/person/joel-emer), [Steve Keckler](/person/stephen-keckler), [William Dally](/person/william-dally)



[arXiv](https://arxiv.org/abs/1708.04485)









[SASSIFI: An Architecture-level Fault Injection Tool for GPU Application Resilience Evaluation](/publication/2017-04_sassifi-architecture-level-fault-injection-tool-gpu-application-resilience)

[Siva Hari](/person/siva-hari), Timothy Tsai, [Mark Stephenson](/person/mark-stephenson), [Steve Keckler](/person/stephen-keckler), [Joel Emer](/person/joel-emer)



[International Symposium on Performance Analysis of Systems and Software (ISPASS)](https://ieeexplore.ieee.org/document/7975296)









### 2016 

[Data-Centric Execution of Speculative Parallel Programs](/publication/2016-10_data-centric-execution-speculative-parallel-programs)

Mark C. Jeffrey, Suvinay Subramanian, Maleen Abeydeera, [Joel Emer](/person/joel-emer), Daniel Sanchez



[International Symposium on Microarchitecture (MICRO)](https://ieeexplore.ieee.org/document/7783708)









[CLARA: Circular Linked-List Auto- and Self-Refresh Architecture](/publication/2016-10_clara-circular-linked-list-auto-and-self-refresh-architecture)

Aditya Agrawal, [Mike O'Connor](/person/mike-o-connor), Evgeny Bolotin, [Niladrish Chatterjee](/person/niladrish-chatterjee), [Joel Emer](/person/joel-emer), [Steve Keckler](/person/stephen-keckler)



[International Symposium on Memory Systems (MEMSYS'16)](https://dl.acm.org/doi/10.1145/2989081.2989084)









[Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks](/publication/2016-06_eyeriss-spatial-architecture-energy-efficient-dataflow-convolutional-neural)

Yu-Hsin Chen, [Joel Emer](/person/joel-emer), Vivienne Sze



[International Symposium on Computer Architecture (ISCA)](https://ieeexplore.ieee.org/document/7551407)









### 2015 

[A Fast and Accurate Analytical Technique to Compute the AVF of Sequential Bits in a Processor](/publication/2015-12_fast-and-accurate-analytical-technique-compute-avf-sequential-bits-processor)

Steve Raasch, Arijit Biswas, Jon Stephan, Paul Racunas, [Joel Emer](/person/joel-emer)



[International Symposium on Microarchitecture (MICRO)](https://ieeexplore.ieee.org/abstract/document/7856641)









[A Scalable Architecture for Ordered Parallelism](/publication/2015-12_scalable-architecture-ordered-parallelism)

Mark C. Jeffrey, Suvinay Subramanian, Cong Yang, [Joel Emer](/person/joel-emer), Daniel Sanchez



[International Symposium on Microarchitecture (MICRO)](https://ieeexplore.ieee.org/document/7856601)









[Scavenger: Automating the Construction of Application-Optimized Memory Hierarchies](/publication/2015-09_scavenger-automating-construction-application-optimized-memory-hierarchies)

Hsin-Jung Yang, Kermin Fleming, Michael Adler, Felix Winterstein, [Joel Emer](/person/joel-emer)



[International Conference on Field Programmable Logic and Applications (FPL)](https://ieeexplore.ieee.org/abstract/document/7294018)









[Efficient Control and Communication Paradigms for Coarse-Grained Spatial Architectures](/publication/2015-09_efficient-control-and-communication-paradigms-coarse-grained-spatial)

[Michael Pellauer](/person/michael-pellauer), [Angshuman Parashar](/person/angshuman-parashar), Michael Adler, Bushra Ahsan, Randy Almon, [Neal Crago](/person/neal-crago), Kermin Fleming, Mohit Gambhir, [Aamer Jaleel](/person/aamer-jaleel), Tushar Krishna, [Daniel Lustig](/person/daniel-lustig), Stephen Maresh, Vladimir Pavlov, Rachid Rayess, Antonia Zhai, [Joel Emer](/person/joel-emer)



[ACM Transactions on Computer Systems (TOCS)](https://dl.acm.org/doi/10.1145/2754930)









[SASSIFI: Evaluating Resilience of GPU Applications](/publication/2015-03_sassifi-evaluating-resilience-gpu-applications)

[Siva Hari](/person/siva-hari), Timothy Tsai, [Mark Stephenson](/person/mark-stephenson), [Steve Keckler](/person/stephen-keckler), [Joel Emer](/person/joel-emer)



[Workshop on Silicon Errors in Logic - System Effects (SELSE-11)](https://selse.org/previous-workshops/2017-archive-2/2015-program/)









[High Performing Cache Hierarchies for Server Workloads -- Relaxing Inclusion to Capture the Latency Benefits of Exclusive Caches](/publication/2015-02_high-performing-cache-hierarchies-server-workloads-relaxing-inclusion-capture)

[Aamer Jaleel](/person/aamer-jaleel), Joseph Nuzman, Adrian Moga, Simon C. Steely Jr., [Joel Emer](/person/joel-emer)



[International Symposium on High Performance Computer Architecture (HPCA)](https://ieeexplore.ieee.org/document/7056045)