1. [Publications](/publications)
2. High Performance and Scalable GPU Graph Traversal
 
 # High Performance and Scalable GPU Graph Traversal

  ![](/sites/default/files/styles/wide/public/pubs/2011-08_High-Performance-and/default.jpg?itok=OTJnhVun)

 Breadth-first search (BFS) is a core primitive for graph traversal and a basis for many higher-level graph analysis algorithms. It is also representative of a class of parallel computations whose memory accesses and work distribution are both irregular and data-dependent. Recent work has demonstrated the plausibility of GPU sparse graph traversal, but has tended to focus on asymptotically inefficient algorithms that perform poorly on graphs with non-trivial diameter.

We present a BFS parallelization focused on fine-grained task management that achieves an asymptotically optimal O(|V|+|E|) work complexity. Our implementation delivers excellent performance on diverse graphs, achieving traversal rates in excess of 3.3 billion and 8.3 billion traversed edges per second using single and quad-GPU configurations, respectively. This level of performance is several times faster than state-of-the-art implementations both CPU and GPU platforms.



 ## Authors



Duane Merrill (Unversity of Virginia)

[Michael Garland](/person/michael-garland)

Andrew Grimshaw (University of Virginia)

 

 

 ## Publication Date



Monday, August 1, 2011

 

 ## Published in



[Technical Report CS-2011-05, Department of Computer Science, University of Virg…](http://www.cs.virginia.edu)

 

 ## Research Area



[High Performance Computing](/research-area/high-performance-computing)

 

 

 ## External Links



[Tech Report](https://sites.google.com/site/duanemerrill/BFS_TR.pdf)

 

 

 ## Uploaded Files



[BFS TR.pdf](https://research.nvidia.com/sites/default/files/pubs/2011-08_High-Performance-and/BFS%20TR.pdf "Open file in new window")1.32 MB