Department of Mathematical Sciences

Center for Nonlinear Analysis

CNA Home People Seminars Publications Workshops and Conferences CNA Working Groups CNA Comments Form Summer Schools Summer Undergraduate Institute PIRE Cooperation Graduate Topics Courses SIAM Chapter Seminar Positions Contact

Publication 18-CNA-022

Large data and zero noise limits of graph-based semi-supervised learning algorithms

Matthew M. Dunlop
Computing and Mathematical Sciences
Caltech
Pasadena, CA 91125
mdunlop@caltech.edu

Dejan Slepčev
Department of Mathematical Sciences
Carnegie Mellon University
Pittsburgh, PA 15213
slepcev@andrew.cmu.edu

Andrew M. Stuart
Computing and Mathematical Sciences
Caltech
Pasadena, CA 91125
astuart@caltech.edu

Matthew Thorpe
Department of Applied Mathematics and Theoretical Physics
University of Cambridge
Cambridge, UK
m.thorpe@maths.cam.ac.uk

Abstract: Scalings in which the graph Laplacian approaches a differential operator in the large graph limit are used to develop understanding of a number of algorithms for semi-supervised learning; in particular the extension, to this graph setting, of the probit algorithm, level set and kriging methods, are studied. Both optimization and Bayesian approaches are considered, based around a regularizing quadratic form found from an affine transformation of the Laplacian, raised to a, possibly fractional, exponent. Conditions on the parameters defining this quadratic form are identified under which well-defined limiting continuum analogues of the optimization and Bayesian semi-supervised learning problems may be found, thereby shedding light on the design of algorithms in the large graph setting. The large graph limits of the optimization formulations are tackled through Gamma-convergence, using the recently introduced $TL^p$ metric. The small labelling noise limit of the Bayesian formulations are also identified, and contrasted with pre-existing harmonic function approaches to the problem.

Get the paper in its entirety as 18-CNA-022.pdf

« Back to CNA Publications