Publication 18-CNA-022
Large data and zero noise limits of graph-based semi-supervised learning algorithms
Matthew M. Dunlop
Computing and Mathematical Sciences
Caltech
Pasadena, CA 91125
mdunlop@caltech.edu
Dejan Slepčev
Department of Mathematical Sciences
Carnegie Mellon University
Pittsburgh, PA 15213
slepcev@andrew.cmu.edu
Andrew M. Stuart
Computing and Mathematical Sciences
Caltech
Pasadena, CA 91125
astuart@caltech.edu
Matthew Thorpe
Department of Applied Mathematics and Theoretical Physics
University of Cambridge
Cambridge, UK
m.thorpe@maths.cam.ac.uk
Abstract: Scalings in which the graph Laplacian approaches a differential operator in the large graph limit are used to develop understanding of a number of algorithms for semi-supervised learning; in particular, the extensions to this graph setting of the probit algorithm and of the level set and kriging methods are studied. Both optimization and Bayesian approaches are considered, based around a regularizing quadratic form found from an affine transformation of the Laplacian, raised to a possibly fractional exponent. Conditions on the parameters defining this quadratic form are identified under which well-defined limiting continuum analogues of the optimization and Bayesian semi-supervised learning problems may be found, thereby shedding light on the design of algorithms in the large graph setting. The large graph limits of the optimization formulations are tackled through Gamma-convergence, using the recently introduced $TL^p$ metric. The small labelling noise limits of the Bayesian formulations are also identified, and contrasted with pre-existing harmonic function approaches to the problem.
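As a rough illustration of the regularizing quadratic form described in the abstract, the sketch below builds a graph Laplacian on a point cloud, shifts it by a multiple of the identity (one simple affine transformation), and raises the result to a possibly fractional power via its eigendecomposition. The Gaussian edge weights, the unnormalized Laplacian, the symbols `eps`, `tau` and `alpha`, and all function names are assumptions made for this example; the abstract does not fix them, and the large-graph scalings analysed in the paper are not reproduced here.

```python
import numpy as np


def graph_laplacian(points, eps):
    """Unnormalized graph Laplacian L = D - W with Gaussian weights.

    The kernel and the length scale `eps` are illustrative assumptions.
    """
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * eps ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    D = np.diag(W.sum(axis=1))
    return D - W


def regularizer_matrix(L, tau, alpha):
    """Return (L + tau^2 I)^alpha through the eigendecomposition of L.

    `tau` and `alpha` are hypothetical names for the shift and the exponent.
    """
    evals, evecs = np.linalg.eigh(L)
    evals = np.clip(evals, 0.0, None)  # remove tiny negative round-off
    powered = (evals + tau ** 2) ** alpha
    # Scale each eigenvector column, then recombine: V diag(powered) V^T.
    return (evecs * powered) @ evecs.T


def quadratic_form(A, u):
    """Evaluate (1/2) <u, A u>."""
    return 0.5 * float(u @ A @ u)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(50, 2))
    L = graph_laplacian(pts, eps=0.5)
    A = regularizer_matrix(L, tau=1.0, alpha=1.5)
    u = rng.normal(size=50)
    print(quadratic_form(A, u))
```

Under these assumptions, a constant vector lies in the null space of `L`, so its penalty is governed entirely by `tau` and `alpha`, while rapidly varying vectors are penalized more heavily as `alpha` grows.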
Get the paper in its entirety as 18-CNA-022.pdf