Center for Nonlinear Analysis
CNA Seminar/Colloquium/Joint Pitt-CNA Colloquium

Jack Xin
UC Irvine
Title: Blended Coarse Gradient Descent for Full Quantization of Deep Neural Networks

Abstract: Quantized deep neural networks (QDNNs) are attractive because they require much less memory and offer faster inference than their full-precision counterparts. To maintain the same performance level, especially at low bit-widths, QDNNs must be retrained. Their training involves minimizing a piecewise constant, non-convex objective in high dimension subject to a discrete constraint, which raises mathematical challenges. We introduce the notion of a coarse derivative and propose the blended coarse gradient descent (BCGD) algorithm. The coarse gradient is generally not the gradient of any function but an artificial descent direction. In BCGD, the network weights are updated by a coarse gradient correction of a weighted average of the full-precision weights and their quantization (the so-called blending), which yields sufficient descent in the objective value and accelerates the training. Our experiments demonstrate that this simple blending technique is very effective for quantization at extremely low bit-widths such as binarization. For theoretical understanding, we present a convergence analysis of coarse gradient descent on a two-layer neural network model with Gaussian input data, and prove that the expected coarse gradient correlates positively with the underlying true gradient.
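
A minimal sketch of the blended weight update described in the abstract, assuming a simple uniform/sign quantizer Q, a coarse gradient supplied by the caller (e.g. a straight-through-style proxy), and a blending parameter rho; the function names, the particular quantizer, and the parameter values are illustrative assumptions, not details taken from the talk:

```python
import numpy as np

def quantize(w, bits=1):
    """Illustrative quantizer Q; the actual quantizer in the talk may differ."""
    if bits == 1:
        # binarization: scaled sign of the weights
        return np.mean(np.abs(w)) * np.sign(w)
    levels = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / levels
    return np.round(w / scale) * scale

def bcgd_step(w_float, coarse_grad, lr=0.1, rho=1e-4, bits=1):
    """One blended coarse gradient descent update (sketch).

    w_float     : full-precision weights maintained during training
    coarse_grad : coarse gradient evaluated at the quantized weights
                  (an artificial descent direction, not a true gradient)
    rho         : blending parameter mixing w_float with its quantization
    """
    w_quant = quantize(w_float, bits)
    # blend the full-precision weights with their quantization,
    # then apply the coarse gradient correction
    return (1.0 - rho) * w_float + rho * w_quant - lr * coarse_grad
```

Keeping a full-precision copy of the weights and blending it with its own quantization, rather than updating the quantized weights directly, is the mechanism the abstract credits with producing sufficient descent in the objective; the value of rho above is purely for illustration.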

Recording: http://mm.math.cmu.edu/recordings/cna/Jackxin_small.mp4
Date: Thursday, August 30, 2018
Time: 1:30 pm
Location: Wean Hall 7218
Submitted by: Ian Tice