In this chapter, I explain how C++ metaprogramming techniques can simplify the development of CUDA-based libraries. The chapter walks through the design of a high-level library for processing arrays in CUDA. For illustrative purposes, I describe this library in the context of numerical solutions of partial differential equations, but the basic technique is useful in a wide variety of application domains, such as image processing, agent modeling, particle systems, and other data-parallel routines. Source code for this chapter can be found online at http://code.google.com/p/cuda-metaprog.