Abstract
Recent advances in ℓ1 optimization for imaging problems provide promising tools to solve the fundamental high-dimensional data classification in machine learning. In this paper, we extend the main result of Szlam and Bresson (Proceedings of the 27th International Conference on Machine Learning, pp. 1039-1046, 2010), which introduced an exact ℓ1 relaxation of the Cheeger ratio cut problem for unsupervised data classification. The proposed extension deals with the multi-class transductive learning problem, which consists in learning several classes with a set of labels for each class. Learning several classes (i.e. more than two classes) simultaneously is generally a challenging problem, but the proposed method builds on strong results introduced in imaging to overcome the multi-class issue. Besides, the proposed multi-class transductive learning algorithms also benefit from recent fast ℓ1 solvers, specifically designed for the total variation norm, which plays a central role in our approach. Finally, experiments demonstrate that the proposed ℓ1 relaxation algorithms are more accurate and robust than standard ℓ2 relaxation methods s.a. spectral clustering, particularly when considering a very small number of labels for each class to be classified. For instance, the mean error of classification for the benchmark MNIST dataset of 60,000 data in \mathbb{R}^{784} using the proposed ℓ1 relaxation of the multi-class Cheeger cut is 2.4 % when only one label is considered for each class, while the error of classification for the ℓ2 relaxation method of spectral clustering is 24.7 %.
Original language | English (US) |
---|---|
Pages (from-to) | 191-201 |
Number of pages | 11 |
Journal | JOURNAL OF MATHEMATICAL IMAGING AND VISION |
Volume | 49 |
Issue number | 1 |
DOIs | |
State | Published - May 2014 |
Externally published | Yes |
Bibliographical note
Funding Information:Acknowledgement Xavier Bresson is supported by the Hong Kong RGC under Grant GRF110311.
Keywords
- Clustering
- Data analysis
- Exact relaxation
- Fast L1 optimization
- Min cut
- Multi-class
- Potts and Mumford-Shah energies
- Ratio cut
- Total variation
- Transductive learning
ASJC Scopus subject areas
- Statistics and Probability
- Modeling and Simulation
- Condensed Matter Physics
- Computer Vision and Pattern Recognition
- Geometry and Topology
- Applied Mathematics