Abstract
We present an efficient second-order algorithm with Õ((1/η)√T) regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by η, for a range of η restricted by the norm of the competitor. The family of loss functions ranges from hinge loss (η = 0) to squared hinge loss (η = 1). This provides a solution to the open problem posed by Abernethy and Rakhlin (An efficient bandit algorithm for √T-regret in online multiclass prediction? In COLT, 2009). We test our algorithm experimentally, showing that it also performs favorably against earlier algorithms.
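As an illustration of a loss family that runs from hinge loss at η = 0 to squared hinge loss at η = 1, the following minimal sketch uses one possible parameterization of a margin loss. The function name `loss_eta` and the exact interpolation formula are assumptions chosen to match the two endpoints stated in the abstract; the abstract itself does not give the definition, so this should not be read as the paper's formula.

```python
def loss_eta(z: float, eta: float) -> float:
    """Hypothetical margin loss interpolating hinge and squared hinge.

    z   : margin of the prediction (larger is better).
    eta : interpolation parameter in [0, 1].

    Assumed form (not taken from the abstract):
        1 - (2 / (2 - eta)) * z + (eta / (2 - eta)) * z**2   if z <= 1
        0                                                     otherwise
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    if z > 1.0:
        # Margin is large enough: no loss is incurred.
        return 0.0
    return 1.0 - (2.0 / (2.0 - eta)) * z + (eta / (2.0 - eta)) * z * z


if __name__ == "__main__":
    for eta in (0.0, 0.5, 1.0):
        print(eta, [round(loss_eta(z, eta), 4) for z in (-1.0, 0.0, 0.5, 1.0, 2.0)])
```

With this assumed form, η = 0 reduces to max(0, 1 − z) and η = 1 reduces to max(0, 1 − z)², and the loss is continuous at z = 1 for every η in between.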
Original language | English (US) |
---|---|
Title of host publication | 34th International Conference on Machine Learning, ICML 2017 |
Publisher | International Machine Learning Society (IMLS) |
Pages | 742-755 |
Number of pages | 14 |
ISBN (Print) | 9781510855144 |
State | Published - Jan 1 2017 |
Externally published | Yes |