Center for Nonlinear Analysis
CNA Seminar/Colloquium/Joint Pitt-CNA Colloquium
Speaker: Jack Xin (UC Irvine)

Title: Blended Coarse Gradient Descent for Full Quantization of Deep Neural Networks

Abstract: Quantized deep neural networks (QDNNs) are attractive due to their much lower memory storage and faster inference speed than their regular full precision counterparts. To maintain the same performance level, especially at low bit-widths, QDNNs must be retrained. Their training involves minimizing a piecewise constant, non-convex objective in high dimension subject to a discrete constraint, hence mathematical challenges arise. We introduce the notion of coarse derivative and propose the blended coarse gradient descent (BCGD) algorithm. A coarse gradient is generally not the gradient of any function but an artificial descent direction. The network weight update of BCGD applies a coarse gradient correction to an average of the full precision weights and their quantization (the so-called blending), which yields sufficient descent in the objective value and accelerates the training. Our experiments demonstrate that this simple blending technique is very effective for quantization at extremely low bit-widths such as binarization. For theoretical understanding, we show convergence analysis of coarse gradient descent on a two-layer neural network model with Gaussian input data, and prove that the expected coarse gradient correlates positively with the underlying true gradient.

Recording: http://mm.math.cmu.edu/recordings/cna/Jackxin_small.mp4

Date: Thursday, August 30, 2018
Time: 1:30 pm
Location: Wean Hall 7218
Submitted by: Ian Tice
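The blended update described in the abstract can be sketched on a toy problem. The following is a minimal illustration, not the paper's actual training setup: it assumes 1-bit (sign) quantization with a mean-magnitude scale, uses a straight-through-style coarse gradient (the loss gradient evaluated at the quantized weights, passed through the zero-gradient quantizer), and applies the blended step w ← (1−ρ)w + ρQ(w) − η·g. The quadratic loss, step size, and blending parameter here are illustrative choices.

```python
import numpy as np

def quantize_binary(w):
    # 1-bit quantization: sign of each weight, scaled by the mean magnitude.
    return np.mean(np.abs(w)) * np.sign(w)

def coarse_grad(w, target):
    # Coarse gradient: gradient of the quadratic loss evaluated at the
    # quantized weights, passed straight through the quantizer (whose true
    # derivative is zero almost everywhere).
    return 2.0 * (quantize_binary(w) - target)

def bcgd_step(w, target, lr=0.05, rho=0.1):
    # Blended step: descend from a convex combination of the full-precision
    # weights and their quantization, rather than from w alone.
    return (1.0 - rho) * w + rho * quantize_binary(w) - lr * coarse_grad(w, target)

rng = np.random.default_rng(0)
w = rng.standard_normal(8)
# Pick a target that is exactly representable by the quantizer.
target = quantize_binary(rng.standard_normal(8))

losses = []
for _ in range(100):
    losses.append(float(np.sum((quantize_binary(w) - target) ** 2)))
    w = bcgd_step(w, target)
print(losses[0], "->", losses[-1])
```

Running the loop, the objective evaluated at the quantized weights decreases, which is the "sufficient descent" role the blending plays in the abstract; with ρ = 0 the step reduces to plain coarse gradient descent on the full-precision weights.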