Abstract
I present a new online learning algorithm that extends the exponentiated gradient framework to infinite-dimensional spaces. My analysis shows that the algorithm is implicitly able to estimate the L2 norm of the unknown competitor, U, achieving a regret bound of order O(U log(UT + 1) √T), instead of the standard O((U² + 1) √T) bound achievable without knowing U. For this analysis, I introduce novel tools for algorithms with time-varying regularizers, through the use of local smoothness. Through a lower bound, I also show that the algorithm is optimal up to a √log(UT) term for linear and Lipschitz losses.
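To make the comparison of rates concrete, the following is a minimal sketch of the standard regret definition against a fixed competitor u, together with the two bounds quoted above. The Hilbert-space notation, the symbol w_t for the algorithm's prediction, and the omission of constants and lower-order terms are my assumptions for illustration, not taken from the paper.

```latex
% Regret of an online learner (predictions w_t) against a fixed competitor u,
% measured on the loss sequence \ell_1, \dots, \ell_T. U denotes the L2 norm of u.
\[
  \mathrm{Regret}_T(u) \;=\; \sum_{t=1}^{T} \ell_t(w_t) \;-\; \sum_{t=1}^{T} \ell_t(u),
  \qquad U = \|u\|_2 .
\]
% Rate achieved by the algorithm (left) versus the standard rate achievable
% without knowledge of U (right); constants and lower-order terms are omitted.
\[
  O\!\bigl(U \log(UT + 1)\,\sqrt{T}\bigr)
  \quad\text{vs.}\quad
  O\!\bigl((U^{2} + 1)\,\sqrt{T}\bigr).
\]
```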
| Original language | English (US) |
| --- | --- |
| Title of host publication | Advances in Neural Information Processing Systems |
| Publisher | Neural Information Processing Systems Foundation |
| State | Published - Jan 1 2013 |
| Externally published | Yes |