Fernando de la Torre, Computer Science, Carnegie Mellon University
"Learning the representation for modeling, classification and clustering problems with energy-based component analysis methods"
Selecting a good representation of the data is key to the success of any modeling, classification or clustering algorithm. Component Analysis (CA) methods (e.g. Kernel Principal Component Analysis, Independent Component Analysis, tensor factorization) have been used as a feature extraction step for modeling, classification and clustering in numerous visual, graphics and signal processing tasks over the last four decades. CA techniques are especially appealing because many can be formulated as eigen-problems, offering great potential for efficient learning of linear and non-linear representations of the data without local minima. However, the eigen-formulation often hides aspects that are important for successful learning, such as understanding normalization factors, building representations that are invariant to geometric transformations (e.g. translation), handling noise and missing data, and selecting the kernel. In this talk, I will describe a unified framework for energy-based learning in CA methods. I will point out how apparently different learning tasks (clustering, classification, modeling) collapse into a single task when viewed from the perspective of energy functions. Moreover, I will propose several extensions of CA methods that learn linear and non-linear representations of the data and improve on the current use of CA features in state-of-the-art algorithms for classification (e.g. support vector machines), clustering (e.g. spectral graph methods) and modeling/tracking (e.g. active appearance models). Finally, I will emphasize how many learning algorithms are related to an optimization problem, and I will show different optimization strategies for learning from very high dimensional data.
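As a small illustration of the link between eigen-problems and energy functions mentioned in the abstract, the sketch below uses plain PCA on synthetic 2-D data. It is not taken from the talk: the data, the function names (`center`, `covariance`, `leading_eigenvector`, `reconstruction_energy`) and the use of power iteration are all assumptions made for illustration. It computes the leading eigenvector of the covariance matrix and compares its reconstruction energy E(b) = sum_i ||x_i - b (b^T x_i)||^2 against a scan over other unit directions.

```python
import math
import random

def center(points):
    # Subtract the mean from every point.
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    return [[p[j] - mean[j] for j in range(d)] for p in points]

def covariance(X):
    # Sample covariance of already-centered data (divides by n).
    n, d = len(X), len(X[0])
    return [[sum(x[i] * x[j] for x in X) / n for j in range(d)]
            for i in range(d)]

def leading_eigenvector(C, iters=500):
    # Power iteration: repeatedly apply C and renormalize.
    d = len(C)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def reconstruction_energy(X, b):
    # E(b) = sum_i || x_i - b (b^T x_i) ||^2 for a unit vector b.
    total = 0.0
    for x in X:
        c = sum(bi * xi for bi, xi in zip(b, x))
        total += sum((xi - c * bi) ** 2 for xi, bi in zip(x, b))
    return total

# Hypothetical data: noisy points scattered around the line y = 2x.
random.seed(0)
ts = [random.uniform(-1, 1) for _ in range(200)]
data = [[t + random.gauss(0, 0.1), 2 * t + random.gauss(0, 0.1)] for t in ts]

X = center(data)
b_eig = leading_eigenvector(covariance(X))
e_eig = reconstruction_energy(X, b_eig)

# Scan unit directions in 1-degree steps and keep the lowest energy found.
e_scan = min(
    reconstruction_energy(X, [math.cos(k * math.pi / 180),
                              math.sin(k * math.pi / 180)])
    for k in range(180)
)

print("eigenvector energy:", e_eig)
print("best scanned energy:", e_scan)
```

In this linear, one-component case the eigenvector direction should be at least as good as any scanned direction, which is the sense in which the eigen-solution avoids local minima. The extensions described in the abstract (invariance, missing data, kernel selection) go beyond what this sketch covers.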
TUESDAY, April 3, 2007