
Placement is an important step in modern very-large-scale integration (VLSI) design. Detailed placement is a placement refinement procedure that is called intensively throughout the design flow, so its efficiency has a vital impact on design closure. However, because most detailed placement techniques are inherently greedy and sequential, they are generally difficult to parallelize. In this work, we present ABCDPlace, a concurrent detailed placement framework that exploits multithreading and GPU acceleration. We propose batch-based concurrent algorithms for widely adopted sequential detailed placement techniques, such as independent set matching, global swap, and local reordering. Experimental results on the ISPD 2005 contest benchmarks demonstrate that ABCDPlace achieves 2× to 5× speedup over sequential implementations on a multithreaded CPU and over 10× on a GPU, without quality degradation. On larger industrial benchmarks, we show more than 16× speedup with a GPU over the state-of-the-art sequential detailed placer. ABCDPlace finishes the detailed placement of a 10-million-cell industrial design in one minute.
This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org.