Monte Carlo Gradient Quantization

We propose Monte Carlo methods that leverage both sparsity and quantization to compress the gradients of neural networks throughout training. In addition to reducing the communication between workers in a distributed setting, we also improve the computational efficiency of each worker. Our method, called Monte Carlo Gradient Quantization (MCGQ), shows faster convergence and higher performance than existing quantization methods on image classification and language modeling. Using both low-bit-width quantization and high sparsity levels, our method more than doubles the compression rates of existing methods, from 200x to 520x and from 462x to more than 1200x, on different language modeling tasks.
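The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way a Monte Carlo scheme can yield sparsity and low-bit-width integers at the same time. It assumes importance sampling of gradient entries in proportion to their magnitude, and it is not presented as the authors' exact method. The names `mc_quantize`, `mc_dequantize`, and `num_samples` are hypothetical.

```python
import numpy as np


def mc_quantize(grad, num_samples, rng):
    """Illustrative Monte Carlo quantizer (an assumption, not the paper's code).

    Draws `num_samples` indices with probability proportional to |grad| and
    returns signed integer hit counts plus one floating-point scale.
    """
    flat = np.asarray(grad, dtype=np.float64).ravel()
    mags = np.abs(flat)
    l1 = mags.sum()
    if l1 == 0.0:
        # Nothing to sample from: the all-zero gradient is already "compressed".
        return np.zeros(np.shape(grad), dtype=np.int64), 0.0
    # Hit counts per entry; entries that are never drawn stay 0 (sparsity),
    # and drawn entries hold small integers (low bit width).
    counts = rng.multinomial(num_samples, mags / l1)
    signed_counts = counts * np.sign(flat).astype(np.int64)
    scale = l1 / num_samples
    return signed_counts.reshape(np.shape(grad)), scale


def mc_dequantize(signed_counts, scale):
    """Rebuild a floating-point gradient estimate from counts and scale."""
    return signed_counts.astype(np.float64) * scale


rng = np.random.default_rng(0)
grad = np.array([0.5, -0.25, 0.0, 0.25])
q, scale = mc_quantize(grad, num_samples=8, rng=rng)
estimate = mc_dequantize(q, scale)
```

Under this assumption, the expected hit count of entry i is `num_samples * |g_i| / ||g||_1`, so the dequantized estimate is unbiased, and fewer samples give sparser, lower-precision gradients at the cost of higher variance.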

Accepted to a CVPR 2020 workshop

Authors

Goncalo Mordido (Hasso Plattner Institute)

Matthijs Van keirsbilck

Alex Keller

Publication Date

Sunday, June 14, 2020

Published in

CVPR 2020 Workshop

Research Area

Algorithms and Numerical Methods

Artificial Intelligence and Machine Learning

External Links

IEEE link

Copyright

This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org.