Center for                           Nonlinear Analysis CNA Home People Seminars Publications Workshops and Conferences CNA Working Groups CNA Comments Form Summer Schools Summer Undergraduate Institute PIRE Cooperation Graduate Topics Courses SIAM Chapter Seminar Positions Contact Publication 18-CNA-022 Large data and zero noise limits of graph-based semi-supervised learning algorithms Matthew M. DunlopComputing and Mathematical SciencesCaltechPasadena, CA 91125mdunlop@caltech.edu Dejan SlepčevDepartment of Mathematical Sciences Carnegie Mellon University Pittsburgh, PA 15213slepcev@andrew.cmu.edu Andrew M. StuartComputing and Mathematical SciencesCaltechPasadena, CA 91125astuart@caltech.edu Matthew ThorpeDepartment of Applied Mathematics and Theoretical Physics University of Cambridge Cambridge, UKm.thorpe@maths.cam.ac.ukAbstract: Scalings in which the graph Laplacian approaches a differential operator in the large graph limit are used to develop understanding of a number of algorithms for semi-supervised learning; in particular the extension, to this graph setting, of the probit algorithm, level set and kriging methods, are studied. Both optimization and Bayesian approaches are considered, based around a regularizing quadratic form found from an affine transformation of the Laplacian, raised to a, possibly fractional, exponent. Conditions on the parameters defining this quadratic form are identified under which well-defined limiting continuum analogues of the optimization and Bayesian semi-supervised learning problems may be found, thereby shedding light on the design of algorithms in the large graph setting. The large graph limits of the optimization formulations are tackled through Gamma-convergence, using the recently introduced $TL^p$ metric. The small labelling noise limit of the Bayesian formulations are also identified, and contrasted with pre-existing harmonic function approaches to the problem.Get the paper in its entirety as  18-CNA-022.pdf