Abstract
We propose an algorithm for two-class classification problems when the training data are imbalanced. This means the number of training instances in one of the classes is so low that the conventional classification algorithms become ineffective in detecting the minority class. We present a modification of the kernel Fisher discriminant analysis such that the imbalanced nature of the problem is explicitly addressed in the new algorithm formulation. The new algorithm exploits the properties of the existing minority points to learn the effects of other minority data points, had they actually existed. The algorithm proceeds iteratively by employing the learned properties and conditional sampling in such a way that it generates sufficient artificial data points for the minority set, thus enhancing the detection probability of the minority class. Implementing the proposed method on a number of simulated and real data sets, we show that our proposed method performs competitively compared to a set of alternative state-of-the-art imbalanced classification algorithms.
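The abstract does not specify how the discriminant is modified or how the conditional sampling works, so the sketch below is not the paper's algorithm. It only illustrates the general idea under simple assumptions: a standard two-class kernel Fisher discriminant (RBF kernel, midpoint threshold) trained after the minority class is padded with artificial points. The oversampling step is a naive jittered resampling used as a stand-in, and all function names, parameters, and the toy data are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    d2 = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def oversample_minority(X, y, rng, scale=0.2):
    """Assumed stand-in for the paper's sampling step: resample minority
    points (label 1) with Gaussian jitter until both classes are equal in size."""
    X_min = X[y == 1]
    n_new = max(int((y == 0).sum()) - len(X_min), 0)
    idx = rng.integers(0, len(X_min), size=n_new)
    X_new = X_min[idx] + scale * rng.standard_normal((n_new, X.shape[1]))
    return np.vstack([X, X_new]), np.concatenate([y, np.ones(n_new, dtype=int)])


def fit_kfd(X, y, gamma=0.5, reg=1e-3):
    """Standard (unmodified) kernel Fisher discriminant for labels 0/1."""
    K = rbf_kernel(X, X, gamma)
    n = len(y)
    N = np.zeros((n, n))
    means = []
    for c in (0, 1):
        Kc = K[:, y == c]                      # kernel columns of class c
        nc = Kc.shape[1]
        means.append(Kc.mean(axis=1))          # class mean in kernel space
        # Within-class scatter: K_c (I - 1/n_c * ones) K_c^T
        N += Kc @ (np.eye(nc) - np.full((nc, nc), 1.0 / nc)) @ Kc.T
    # Regularized solution of N alpha = M_1 - M_0
    alpha = np.linalg.solve(N + reg * np.eye(n), means[1] - means[0])
    # Threshold at the midpoint of the two projected class means
    b = 0.5 * (alpha @ means[0] + alpha @ means[1])
    return alpha, b


def predict_kfd(X_train, alpha, b, X_new, gamma=0.5):
    """Label a point 1 when its projection exceeds the threshold b."""
    return (rbf_kernel(X_new, X_train, gamma) @ alpha > b).astype(int)


# Toy imbalanced data: 100 majority points near the origin, 8 minority points near (4, 4).
rng = np.random.default_rng(0)
X_maj = rng.standard_normal((100, 2))
X_min = 4.0 + 0.5 * rng.standard_normal((8, 2))
X = np.vstack([X_maj, X_min])
y = np.concatenate([np.zeros(100, dtype=int), np.ones(8, dtype=int)])

X_bal, y_bal = oversample_minority(X, y, rng)
alpha, b = fit_kfd(X_bal, y_bal)
pred = predict_kfd(X_bal, alpha, b, X)
print("minority recall on original data:", (pred[y == 1] == 1).mean())
```

In this sketch the artificial points are generated once, before fitting. The abstract describes an iterative procedure in which sampling depends on properties learned from the existing minority points, which this example does not attempt to reproduce.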
Original language | English (US) |
---|---|
Pages (from-to) | 2695-2724 |
Number of pages | 30 |
Journal | Journal of Machine Learning Research |
Volume | 16 |
State | Published - Dec 2015 |
Externally published | Yes |
Bibliographical note
KAUST Repository Item: Exported on 2022-05-31
Acknowledged KAUST grant number(s): KUS-CI-016-04
Acknowledgements: Arash Pourhabib, Yu Ding and Bani K. Mallick were partially supported by grants from NSF (DMS-0914951, CMMI-0926803, and CMMI-1000088) and King Abdullah University of Science and Technology (KUS-CI-016-04). The authors are also grateful for the valuable suggestions made by the editor and referees, which greatly improved the paper.
This publication acknowledges KAUST support, but has no KAUST affiliated authors.
ASJC Scopus subject areas
- Artificial Intelligence
- Software
- Statistics and Probability
- Control and Systems Engineering