Abstract
In high dimension, low sample size (HDLSS) settings, the distance concentration phenomenon affects the performance of several popular classifiers that are based on Euclidean distances. The behaviour of these classifiers in high dimensions is governed entirely by the first- and second-order moments of the underlying class distributions. Moreover, these classifiers become useless for HDLSS data when the first two moments of the competing distributions are equal, or when the moments do not exist. In this work, we propose robust, computationally efficient, and tuning-free classifiers applicable in the HDLSS scenario. As the data dimension increases, these classifiers yield perfect classification if the one-dimensional marginals of the underlying distributions differ. We establish strong theoretical properties for the proposed classifiers in ultrahigh-dimensional settings. Numerical experiments on a wide variety of simulated examples and analyses of real data sets show clear and convincing advantages over existing methods.
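The distance concentration phenomenon the abstract refers to can be illustrated with a minimal sketch (not from the paper itself; the function name and sample sizes below are illustrative): as the dimension grows, pairwise Euclidean distances between i.i.d. points become nearly equal, so their relative spread shrinks toward zero and distance-based classifiers lose discriminating power.

```python
import numpy as np

rng = np.random.default_rng(0)

def relative_spread(d, n=50):
    """Relative spread (std/mean) of pairwise Euclidean distances
    among n i.i.d. standard normal points in dimension d."""
    X = rng.standard_normal((n, d))
    diffs = X[:, None, :] - X[None, :, :]          # all pairwise differences
    dists = np.sqrt((diffs ** 2).sum(axis=-1))     # Euclidean distance matrix
    iu = np.triu_indices(n, k=1)                   # keep each pair once
    vals = dists[iu]
    return vals.std() / vals.mean()

# The ratio decreases markedly as d grows: distances "concentrate".
for d in (2, 100, 10000):
    print(d, relative_spread(d))
```

With these settings the ratio drops by roughly two orders of magnitude between d = 2 and d = 10000, which is the regime in which the moment-based behaviour described in the abstract takes over.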
Original language | English (US) |
---|---|
Pages | 9943-9968 |
Number of pages | 26 |
State | Published - 2022 |
Event | 25th International Conference on Artificial Intelligence and Statistics, AISTATS 2022 - Virtual, Online, Spain |
Duration | Mar 28 2022 → Mar 30 2022 |
Conference
Conference | 25th International Conference on Artificial Intelligence and Statistics, AISTATS 2022 |
---|---|
Country/Territory | Spain |
City | Virtual, Online |
Period | 03/28/22 → 03/30/22 |
Bibliographical note
Funding Information:We thank the reviewers for their careful reading of an earlier version of the article and providing us with helpful comments. We would also like to thank Purushottam Kar and Soham Sarkar for their valuable inputs which improved this article.
Publisher Copyright:
Copyright © 2022 by the author(s)
ASJC Scopus subject areas
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability