TY - GEN
T1 - Robust bounds for classification via selective sampling
AU - Nicolò, Cesa Bianchi
AU - Gentile, Claudio
AU - Orabona, Francesco
N1 - Generated from Scopus record by KAUST IRTS on 2023-09-25
PY - 2009/9/15
Y1 - 2009/9/15
N2 - We introduce a new algorithm for binary classification in the selective sampling protocol. Our algorithm uses Regularized Least Squares (RLS) as base classifier, and for this reason it can be efficiently run in any RKHS. Unlike previous margin-based semisupervised algorithms, our sampling condition hinges on a simultaneous upper bound on bias and variance of the RLS estimate under a simple linear label noise model. This fact allows us to prove performance bounds that hold for an arbitrary sequence of instances. In particular, we show that our sampling strategy approximates the margin of the Bayes optimal classifier to any desired accuracy ε by asking eÕ(d/ε2)queries (in the RKHS case d is replaced by a suitable spectral quantity). While these are the standard rates in the fully supervised i.i.d. case, the best previously known result in our harder setting was eÕ(d3/ε 4). Preliminary experiments show that some of our algorithms also exhibit a good practical performance. Copyright 2009.
AB - We introduce a new algorithm for binary classification in the selective sampling protocol. Our algorithm uses Regularized Least Squares (RLS) as base classifier, and for this reason it can be efficiently run in any RKHS. Unlike previous margin-based semisupervised algorithms, our sampling condition hinges on a simultaneous upper bound on bias and variance of the RLS estimate under a simple linear label noise model. This fact allows us to prove performance bounds that hold for an arbitrary sequence of instances. In particular, we show that our sampling strategy approximates the margin of the Bayes optimal classifier to any desired accuracy ε by asking eÕ(d/ε2)queries (in the RKHS case d is replaced by a suitable spectral quantity). While these are the standard rates in the fully supervised i.i.d. case, the best previously known result in our harder setting was eÕ(d3/ε 4). Preliminary experiments show that some of our algorithms also exhibit a good practical performance. Copyright 2009.
UR - https://dl.acm.org/doi/10.1145/1553374.1553390
UR - http://www.scopus.com/inward/record.url?scp=70049110224&partnerID=8YFLogxK
U2 - 10.1145/1553374.1553390
DO - 10.1145/1553374.1553390
M3 - Conference contribution
SN - 9781605585161
BT - ACM International Conference Proceeding Series
ER -