Abstract
We study the probabilistic properties of pattern classifiers in a discrete feature space, using the principle of Bayesian averaging of recognition performance. Two cases are considered: (a) the prior probabilities of the classes are unknown, and (b) the prior probabilities of the classes are known. The misclassification probability is treated as a random variable, for which the characteristic function (expressed via the Kummer hypergeometric function) and the absolute moments are derived analytically. For the case of unknown priors, an approximate formula for calculating a sufficient learning sample size is obtained. The performance in the two cases is compared. As an example, we consider the problem of classifying mutational hotspots in genetic sequences.
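The following is a minimal illustrative sketch of the kind of quantity the abstract describes; it is not the paper's derivation. It assumes a symmetric Dirichlet prior over the joint probabilities of all (cell, class) pairs, a majority-vote decision rule, and made-up counts, none of which are stated in the abstract. Under those assumptions the misclassification probability is a sum of Dirichlet components, so its posterior is a Beta(a, b) distribution. The characteristic function of a Beta distribution is the Kummer function ₁F₁(a; a+b; it), and its moments have a simple closed form. The names `error_posterior_params` and `beta_moment` are hypothetical.

```python
from fractions import Fraction


def error_posterior_params(counts, prior=1):
    """Return (a, b) of the Beta posterior of the misclassification probability.

    counts[j][k] is the number of training samples in feature cell j with
    class k. ASSUMPTION: symmetric Dirichlet prior with integer parameter
    `prior` over all (cell, class) pairs, and a majority-vote rule per cell.
    """
    a = 0  # posterior weight on (cell, class) pairs the rule misclassifies
    b = 0  # posterior weight on pairs the rule classifies correctly
    for cell in counts:
        alphas = [n + prior for n in cell]
        # Majority vote; ties go to the lowest class index.
        predicted = max(range(len(cell)), key=lambda k: cell[k])
        b += alphas[predicted]
        a += sum(alphas) - alphas[predicted]
    return a, b


def beta_moment(a, b, r):
    """r-th raw moment of Beta(a, b): product over i < r of (a+i)/(a+b+i)."""
    m = Fraction(1)
    for i in range(r):
        m *= Fraction(a + i, a + b + i)
    return m


# Hypothetical learning sample: 3 feature cells, 2 classes.
counts = [[8, 1], [2, 7], [5, 5]]
a, b = error_posterior_params(counts)
mean = beta_moment(a, b, 1)
variance = beta_moment(a, b, 2) - mean ** 2
print(f"Beta({a}, {b}); mean error = {float(mean):.3f}, "
      f"variance = {float(variance):.5f}")
```

For these made-up counts the posterior is Beta(11, 23), so the expected misclassification probability is 11/34, roughly 0.32. Exact fractions are used so that the moments can be checked by hand.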
Original language | English (US) |
---|---|
Pages (from-to) | 2537-2548 |
Number of pages | 12 |
Journal | Pattern Recognition Letters |
Volume | 24 |
Issue number | 15 |
State | Published - Nov 2003 |
Externally published | Yes |
Keywords
- Bayesian averaging
- Classifier analysis
- Discrete features
- Error estimation
- Generalization error
ASJC Scopus subject areas
- Software
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence