Abstract
Many state-of-the-art approaches to Multi Kernel Learning (MKL) struggle to find a compromise between performance, sparsity of the solution, and speed of the optimization process. In this paper we look at the MKL problem from a learning and an optimization point of view at the same time. Instead of designing a regularizer and then struggling to find an efficient method to minimize it, we design the regularizer while keeping the optimization algorithm in mind. Hence, we introduce a novel MKL formulation that mixes elements of p-norm and elastic-net regularization. We also propose a fast stochastic gradient descent method that solves the novel MKL formulation. We show theoretically and empirically that our method has 1) state-of-the-art performance on many classification tasks; 2) exactly sparse solutions with a tunable level of sparsity; 3) a convergence rate bound that depends only logarithmically on the number of kernels used and is independent of the sparsity required; 4) independence of the particular convex loss function used. Copyright 2011 by the author(s)/owner(s).
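
To make the idea of mixing p-norm and elastic-net regularization concrete, here is a minimal sketch of one plausible form of such a regularizer over per-kernel weight blocks: a squared group (2,p)-norm plus a group (2,1)-norm. The abstract does not state the exact formula, so this form, the function names `group_norms` and `mixed_regularizer`, and the parameters `p`, `lam`, and `alpha` are assumptions for illustration only.

```python
import numpy as np


def group_norms(blocks):
    """Return the Euclidean norm of each per-kernel weight block."""
    return np.array([np.linalg.norm(b) for b in blocks])


def mixed_regularizer(blocks, p=1.5, lam=0.1, alpha=0.01):
    """Assumed illustrative regularizer, not taken from the abstract:

        (lam / 2) * ||w||_{2,p}^2 + alpha * ||w||_{2,1}

    The first term is a p-norm-style penalty over the block norms;
    the second is a group-lasso-style term that encourages whole
    kernels to receive exactly zero weight.
    """
    norms = group_norms(blocks)
    norm_2p = np.sum(norms ** p) ** (1.0 / p)  # p-norm of the block norms
    norm_21 = np.sum(norms)                    # 1-norm of the block norms
    return 0.5 * lam * norm_2p ** 2 + alpha * norm_21


# Two kernels: one with non-zero weights, one switched off entirely.
blocks = [np.array([3.0, 4.0]), np.zeros(2)]
value = mixed_regularizer(blocks, p=2.0, lam=1.0, alpha=1.0)
```

In this sketch, `alpha` would play the role of the tunable sparsity level: raising it penalizes every active kernel more heavily, while `alpha = 0` leaves only the p-norm term.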
Original language | English (US) |
---|---|
Title of host publication | Proceedings of the 28th International Conference on Machine Learning, ICML 2011 |
Pages | 249-256 |
Number of pages | 8 |
State | Published - Oct 7 2011 |
Externally published | Yes |