Mark Stephenson  

 
  ![](/sites/default/files/person/research-photo.jpg)

  
Mark Stephenson joined NVIDIA in February 2014. He is primarily interested in program analysis, code generation, and architecture. Before joining NVIDIA, Mark spent time at IBM Research and at the Massachusetts Institute of Technology, where he earned his PhD in 2006.

See Mark's [external page](https://sites.google.com/site/markwstephenson/) for additional information.


   Research Area(s)

[Computer Architecture](/index.php/research-area/computer-architecture)

[Computer Graphics](/index.php/research-area/computer-graphics)

[Programming Languages, Systems and Tools](/index.php/research-area/programming-languages-systems)

 
 Main Field of Interest

[Computer Architecture](/index.php/research-area/computer-architecture)

 
 ### Publications

 
### 2023 

[cuCatch: A Debugging Tool for Efficiently Catching Memory Safety Violations in CUDA Applications](/publication/2023-06_cucatch-debugging-tool-efficiently-catching-memory-safety-violations-cuda)

[Mohamed Tarek Ibn Ziad](/person/mohamed-tarek-ibn-ziad), [Sana Damani](/person/sana-damani), [Aamer Jaleel](/person/aamer-jaleel), [Stephen W. Keckler](/person/stephen-keckler), [Mark Stephenson](/person/mark-stephenson)


[ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)](https://dl.acm.org/doi/10.1145/3591225)


### 2022 

[GPU Subwarp Interleaving](/publication/2022-01_gpu-subwarp-interleaving)

Sana Damani, [Mark Stephenson](/person/mark-stephenson), Ram Rangan, Daniel Johnson, Rishkul Kulkarni, [Steve Keckler](/person/stephen-keckler)


[International Symposium on High-Performance Computer Architecture (HPCA)](https://ieeexplore.ieee.org/document/9773183)


### 2021 

[Cooperative Profile Guided Optimization](/publication/2021-07_cooperative-profile-guided-optimization)

[Mark Stephenson](/person/mark-stephenson), Ram Rangan, [Steve Keckler](/person/stephen-keckler)


[Computer Graphics Forum (Proceedings of High Performance Graphics)](https://www.highperformancegraphics.org/2021/)


[PGZ: Automatic Zero-Value Code Specialization](/index.php/publication/2021-03_pgz-automatic-zero-value-code-specialization)

[Mark Stephenson](/index.php/person/mark-stephenson), Ram Rangan


[International Conference on Compiler Construction (CC)](https://dl.acm.org/doi/10.1145/3446804.3446845)


### 2020 

[Zeroploit: Exploiting Zero Valued Operands in Interactive Gaming Applications](/publication/2020-08_zeroploit-exploiting-zero-valued-operands-interactive-gaming-applications)

Ram Rangan, [Mark Stephenson](/person/mark-stephenson), Aditya Ukarande, Shyam Murthy, Virat Agarwal, Marc Blackstein


[ACM Transactions on Architecture and Code Optimization (TACO)](https://dl.acm.org/doi/10.1145/3394284)


[Estimating Silent Data Corruption Rates Using a Two-Level Model](/publication/2020-04_estimating-silent-data-corruption-rates-using-two-level-model)

[Siva Hari](/person/siva-hari), Paolo Rech, Timothy Tsai, [Mark Stephenson](/person/mark-stephenson), Arslan Zulfiqar, [Michael B. Sullivan](/person/mike-sullivan), Philip Shirvani, Paul Racunas, [Joel Emer](/person/joel-emer), [Steve Keckler](/person/stephen-keckler)


[arXiv](https://arxiv.org/abs/2005.01445)


[Speculative Reconvergence for Improved SIMT Efficiency](/index.php/publication/2020-02_speculative-reconvergence-improved-simt-efficiency)

Sana Damani, Daniel Johnson, [Mark Stephenson](/index.php/person/mark-stephenson), Eddie Yan, Olivier Giroux, Michael McKeown, [Steve Keckler](/index.php/person/stephen-keckler)


[International Symposium on Code Generation and Optimization (CGO)](https://dl.acm.org/doi/10.1145/3368826.3377911)


### 2019 

[NVBit: A Dynamic Binary Instrumentation Framework for NVIDIA GPUs](/publication/2019-10_nvbit-dynamic-binary-instrumentation-framework-nvidia-gpus)

Oreste Villa, [Mark Stephenson](/person/mark-stephenson), [David Nellans](/person/david-nellans), [Steve Keckler](/person/stephen-keckler)


[International Symposium on Microarchitecture (MICRO)](https://doi.org/10.1145/3352460.3358307)


### 2018 

[Exposing Memory Access Patterns to Improve Instruction and Memory Efficiency in GPUs](/publication/2018-10_exposing-memory-access-patterns-improve-instruction-and-memory-efficiency-gpus)

[Neal Crago](/person/neal-crago), [Mark Stephenson](/person/mark-stephenson), [Steve Keckler](/person/stephen-keckler)


[ACM Transactions on Architecture and Code Optimization (TACO)](https://doi.org/10.1145/3280851)


[Software-Directed Techniques for Improved GPU Register File Utilization](/index.php/publication/2018-09_software-directed-techniques-improved-gpu-register-file-utilization)

Dani Voitsechov, Arslan Zulfiqar, [Mark Stephenson](/index.php/person/mark-stephenson), Mark Gebhart, [Steve Keckler](/index.php/person/stephen-keckler)


[ACM Transactions on Architecture and Code Optimization (TACO)](https://dl.acm.org/doi/10.1145/3243905)


### 2017 

[SASSIFI: An Architecture-level Fault Injection Tool for GPU Application Resilience Evaluation](/publication/2017-04_sassifi-architecture-level-fault-injection-tool-gpu-application-resilience)

[Siva Hari](/person/siva-hari), Timothy Tsai, [Mark Stephenson](/person/mark-stephenson), [Steve Keckler](/person/stephen-keckler), [Joel Emer](/person/joel-emer)


[International Symposium on Performance Analysis of Systems and Software (ISPASS)](https://ieeexplore.ieee.org/document/7975296)


### 2016 

[Automatically Exploiting Implicit Pipeline Parallelism from Multiple Dependent Kernels for GPUs](/index.php/publication/2016-09_automatically-exploiting-implicit-pipeline-parallelism-multiple-dependent)

Gwangsun Kim, Jiyun Jeong, John Kim, [Mark Stephenson](/index.php/person/mark-stephenson)


[International Conference on Parallel Architectures and Compilation Techniques (PACT)](https://dl.acm.org/doi/proceedings/10.1145/2967938)


[Towards High Performance Paged Memory for GPUs](/publication/2016-03_towards-high-performance-paged-memory-gpus)

Tianhao Zheng, [David Nellans](/person/david-nellans), Arslan Zulfiqar, [Mark Stephenson](/person/mark-stephenson), [Steve Keckler](/person/stephen-keckler)


[International Symposium on High-Performance Computer Architecture (HPCA)](https://ieeexplore.ieee.org/document/7446077)


### 2015 

[Flexible Software Profiling of GPU Architectures](/publication/2015-06_flexible-software-profiling-gpu-architectures)

[Mark Stephenson](/person/mark-stephenson), [Siva Hari](/person/siva-hari), Yunsup Lee, Eiman Ebrahimi, Daniel Johnson, [David Nellans](/person/david-nellans), [Mike O'Connor](/person/mike-o-connor), [Steve Keckler](/person/stephen-keckler)


[International Symposium on Computer Architecture (ISCA)](https://dl.acm.org/doi/10.1145/2749469.2750375)


[SASSIFI: Evaluating Resilience of GPU Applications](/publication/2015-03_sassifi-evaluating-resilience-gpu-applications)

[Siva Hari](/person/siva-hari), Timothy Tsai, [Mark Stephenson](/person/mark-stephenson), [Steve Keckler](/person/stephen-keckler), [Joel Emer](/person/joel-emer)


[Workshop on Silicon Errors in Logic - System Effects (SELSE-11)](https://selse.org/previous-workshops/2017-archive-2/2015-program/)


[Page Placement Strategies for GPUs within Heterogeneous Memory Systems](/publication/2015-03_page-placement-strategies-gpus-within-heterogeneous-memory-systems)

Neha Agarwal, [David Nellans](/person/david-nellans), [Mark Stephenson](/person/mark-stephenson), [Mike O'Connor](/person/mike-o-connor), [Steve Keckler](/person/stephen-keckler)


[International Conference on Architectural Support for Programming Languages and…](http://dl.acm.org/citation.cfm?id=2694381)


### 2014 

[Exploring the Design Space of SPMD Divergence Management on Data-Parallel Architectures](/publication/2014-12_exploring-design-space-spmd-divergence-management-data-parallel-architectures)

Yunsup Lee, Vinod Grover, Ronny Krashinsky, [Mark Stephenson](/person/mark-stephenson), [Steve Keckler](/person/stephen-keckler), Krste Asanovic


[International Symposium on Microarchitecture (MICRO)](https://doi.org/10.1109/MICRO.2014.48)