Efficient, high-quality Bayer demosaic filtering on GPUs

This paper describes a series of optimizations for implementing the high-quality Malvar-He-Cutler Bayer demosaicing filter on a GPU in OpenGL. Applying this filter is the first step in most video processing pipelines, but it is generally considered too slow for real-time use on a CPU. The optimized implementation contains 66% fewer ALU operations than a direct GPU implementation and can filter 40 simultaneous HD 1080p video streams at 30 fps (2728 Mpix/s) on current hardware. It is 2-3 times faster than a straightforward GPU implementation of the same algorithm on many GPUs. Most of the optimizations also apply to other kinds of processors that support SIMD instructions, such as CPUs and DSPs.
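As a rough sanity check on the throughput figure above, the sketch below relates a pixel rate to a number of video streams. It assumes that "1080p" means 1920 x 1080 pixels per frame and that 1 Mpix is 10^6 pixels; neither is stated in the abstract, and the function names are illustrative only.

```python
# Assumptions (not stated in the abstract): a 1080p frame is 1920 x 1080
# pixels, and 1 Mpix = 10**6 pixels.
WIDTH, HEIGHT, FPS = 1920, 1080, 30


def required_mpix_per_s(streams: int) -> float:
    """Pixel rate in Mpix/s needed to filter `streams` 1080p streams at 30 fps."""
    return streams * WIDTH * HEIGHT * FPS / 1e6


def streams_supported(mpix_per_s: float) -> float:
    """Number of 1080p, 30 fps streams that a given pixel rate can sustain."""
    pixels_per_stream_per_s = WIDTH * HEIGHT * FPS  # 62,208,000 pixels/s
    return mpix_per_s * 1e6 / pixels_per_stream_per_s


if __name__ == "__main__":
    print(required_mpix_per_s(40))            # 2488.32
    print(round(streams_supported(2728), 2))  # 43.85
```

Under these assumptions, 40 streams need about 2488 Mpix/s, so the quoted 2728 Mpix/s leaves some headroom and the "40 streams" figure reads as a conservative rounding.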

Morgan McGuire (Williams College)
Publication Date: Tuesday, September 1, 2009