Publication 17-CNA-012
Analysis of $p$-Laplacian Regularization in Semi-Supervised Learning
Dejan Slepčev
Department of Mathematical Sciences
Carnegie Mellon University
Pittsburgh, PA 15213
slepcev@andrew.cmu.edu
Matthew Thorpe
Department of Applied Mathematics and Theoretical Physics
University of Cambridge
Cambridge, UK
m.thorpe@maths.cam.ac.uk
Abstract: We investigate a family of regression problems in a
semi-supervised setting.
The task is to assign real-valued labels to a set of $n$ sample points,
provided a small training subset of $N$ labeled points.
A goal of semi-supervised learning is to take advantage of the
(geometric) structure provided by the large number of unlabeled data
when assigning labels. We consider random geometric graphs, with
connection radius $\varepsilon(n)$, to represent the geometry of the data
set. Functionals which model the task reward the regularity of the
estimator function and impose or reward the agreement with the training
data. Here we consider the discrete $p$-Laplacian regularization.
We investigate asymptotic behavior when the number of unlabeled points
increases, while the number of training points remains fixed. We
uncover a delicate interplay between the regularizing nature of the
functionals considered and the nonlocality inherent to the graph
constructions.
We rigorously obtain almost optimal ranges on the scaling of $\varepsilon(n)$
for the asymptotic consistency to hold. We prove that the minimizers of
the discrete functionals in the random setting converge uniformly to the
desired continuum limit.
Furthermore, we discover that for the standard model used there is a
restrictive upper bound on how quickly $\varepsilon(n)$ must converge to zero
as $n \to \infty$. We introduce a new model which is as simple as the
original model, but overcomes this restriction.
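
To make the objects named in the abstract concrete, the following is a minimal sketch of a random geometric graph with connection radius $\varepsilon$ and of a discrete $p$-Dirichlet energy on it. It is an illustration only, not the authors' code. The indicator kernel, the $\varepsilon^{-d}$ weight scaling, the $1/(\varepsilon^p n^2)$ normalisation, and the names `sample_points`, `geometric_graph_weights`, and `p_dirichlet_energy` are all assumptions chosen for this example; the paper itself fixes the precise functional and constraints.

```python
import math
import random


def sample_points(n, dim, seed=0):
    """Draw n points uniformly from the unit cube [0, 1]^dim (illustrative data model)."""
    rng = random.Random(seed)
    return [tuple(rng.random() for _ in range(dim)) for _ in range(n)]


def geometric_graph_weights(points, eps):
    """Symmetric weights of a random geometric graph with connection radius eps.

    Assumption: indicator kernel, W[i][j] = eps^(-d) if |x_i - x_j| < eps, else 0,
    with no self-loops.
    """
    n = len(points)
    dim = len(points[0])
    scale = eps ** (-dim)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < eps:
                weights[i][j] = weights[j][i] = scale
    return weights


def p_dirichlet_energy(weights, values, p, eps):
    """Discrete p-Dirichlet energy of a labelling `values` on the graph.

    Assumed normalisation: (1 / (eps^p * n^2)) * sum_{i,j} W[i][j] * |f_i - f_j|^p.
    """
    n = len(values)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if weights[i][j] > 0.0:
                total += weights[i][j] * abs(values[i] - values[j]) ** p
    return total / (eps ** p * n ** 2)


if __name__ == "__main__":
    pts = sample_points(n=200, dim=2, seed=1)
    eps, p = 0.15, 3.0
    W = geometric_graph_weights(pts, eps)
    smooth = [x[0] for x in pts]             # a smooth candidate labelling
    spike = [1.0] + [0.0] * (len(pts) - 1)   # nonzero at a single point only
    print("smooth:", p_dirichlet_energy(W, smooth, p, eps))
    print("spike: ", p_dirichlet_energy(W, spike, p, eps))
```

In a semi-supervised fit, this energy would be minimised over labellings that agree with the $N$ training labels. Evaluating it on a smooth labelling and on a single-point spike for different choices of $n$ and $\varepsilon$ is one way to explore how the regularisation interacts with the graph scale.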
Get the paper in its entirety as 17-CNA-012.pdf