 # A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Accelerator Designed with a High-Productivity VLSI Methodology

  ![](/sites/default/files/styles/wide/public/publications/RC18_photo.jpg?itok=LTIpdn_y)

 This work presents a scalable deep neural network (DNN) inference accelerator consisting of 36 small chips connected in a mesh network on a multi-chip module (MCM). The accelerator scales flexibly to support efficient inference across a wide range of DNNs, from mobile to data center domains. The test chip was implemented with a novel high-productivity VLSI methodology: it was designed entirely in C++ using High-Level Synthesis (HLS) tools and an agile VLSI design flow. The 6 mm^2 chip, implemented in 16 nm technology, achieves 1.29 TOPS/mm^2, 0.11 pJ/op energy efficiency, and 4 TOPS (8b int) peak performance on one chip. The 36-chip MCM achieves 128 TOPS peak performance and 2,615 images/s on ResNet-50 inference.
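 To put the headline figures in more familiar units, the short sketch below converts the quoted energy per operation into TOPS/W and averages the MCM-level numbers over the 36 chips. It only rearranges values stated in the abstract. The function names are hypothetical, and the abstract does not say whether these figures were measured at the same operating point, so the per-chip averages should not be read as measured single-chip results.

```python
# Figures quoted in the abstract; nothing here is measured independently.
PJ_PER_OP = 0.11                 # energy per operation, in picojoules
MCM_PEAK_TOPS = 128.0            # peak throughput of the 36-chip MCM
MCM_CHIPS = 36                   # number of chips on the module
RESNET50_IMAGES_PER_S = 2615.0   # ResNet-50 inference rate on the MCM


def tops_per_watt(pj_per_op: float) -> float:
    """Convert pJ/op to TOPS/W.

    1 TOPS/W = 1e12 ops per joule = 1 op per picojoule,
    so the conversion is a plain reciprocal.
    """
    return 1.0 / pj_per_op


def per_chip(total: float, chips: int) -> float:
    """Average share of an MCM-level total per chip (assumes an even split)."""
    return total / chips


if __name__ == "__main__":
    print(f"Energy efficiency: {tops_per_watt(PJ_PER_OP):.2f} TOPS/W")
    print(f"Average peak per chip in the MCM: "
          f"{per_chip(MCM_PEAK_TOPS, MCM_CHIPS):.2f} TOPS")
    print(f"Average ResNet-50 rate per chip: "
          f"{per_chip(RESNET50_IMAGES_PER_S, MCM_CHIPS):.2f} images/s")
```

 Under these assumptions, 0.11 pJ/op corresponds to roughly 9.1 TOPS/W, and the 128 TOPS MCM peak averages to about 3.6 TOPS per chip.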



 ## Authors



[Rangharajan Venkatesan](/person/rangharajan-venkatesan)

Sophia Shao (NVIDIA)

[Brian Zimmer](/person/brian-zimmer)

[Jason Clemons](/person/jason-clemons)

[Matt Fojtik](/person/matt-fojtik)

[Ted Jiang](/person/ted-jiang)

[Ben Keller](/person/ben-keller)

Alicia Klinefelter (NVIDIA)

[Nathaniel Pinckney](/person/nathaniel-pinckney)

Priyanka Raina (Stanford)

[Stephen Tell](/person/stephen-tell)

[Yanqing Zhang](/person/yanqing-zhang)

[William Dally](/person/william-dally)

[Joel Emer](/person/joel-emer)

[Tom Gray](/person/tom-gray)

[Steve Keckler](/person/stephen-keckler)

[Brucek Khailany](/person/brucek-khailany)

 

 

 ## Publication Date



Tuesday, August 20, 2019

 

 ## Published in



[Hot Chips: A Symposium on High Performance Chips](http://www.hotchips.org/)

 

 ## Research Area



[Circuits and VLSI Design](/research-area/circuits)

[Computer Architecture](/research-area/computer-architecture)

 

 

 ## Uploaded Files



[Published slides](https://research.nvidia.com/sites/default/files/pubs/2019-08_A-0.11-pJ/Op%2C//HotChips_RC18_final.pdf "Open file in new window") (1.95 MB)