Abstract
Extending previous work on quantile classifiers ($q$-classifiers), we propose the $q^*$-classifier for the class imbalance problem. The classifier assigns a sample to the minority class if the minority class conditional probability exceeds a threshold $0 < q^* < 1$, where $q^*$ equals the unconditional probability of observing a minority class sample. The motivation for $q^*$-classification stems from a density-based approach and leads to the useful property that the $q^*$-classifier maximizes the sum of the true positive and true negative rates. Moreover, because the procedure can be equivalently expressed as a cost-weighted Bayes classifier, it also minimizes weighted risk. Because of this dual optimization, the $q^*$-classifier can achieve near-zero risk in imbalance problems while simultaneously optimizing true positive and true negative rates. We use random forests to apply $q^*$-classification. This new method, which we call RFQ, is shown to outperform or be competitive with existing techniques with respect to $G$-mean performance and variable selection. Extensions to the multiclass imbalanced setting are also considered.
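
The decision rule described in the abstract can be illustrated with a minimal sketch. This is not the RFQ implementation: the function names (`threshold_classify`, `tpr_tnr`, `g_mean`) are hypothetical, the labels and probabilities are made-up toy values standing in for the minority class probabilities a random forest would estimate, and the threshold is taken to be the sample proportion of minority cases. The sketch only contrasts thresholding at this prevalence with thresholding at 0.5 (the usual unweighted Bayes rule), and computes the $G$-mean as the geometric mean of the true positive and true negative rates.

```python
from math import sqrt


def threshold_classify(probs, threshold):
    """Label a sample 1 (minority) when its estimated minority-class
    probability exceeds the threshold, and 0 (majority) otherwise."""
    return [1 if p > threshold else 0 for p in probs]


def tpr_tnr(y_true, y_pred):
    """Return (true positive rate, true negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    return tp / n_pos, tn / n_neg


def g_mean(y_true, y_pred):
    """Geometric mean of the true positive and true negative rates."""
    tpr, tnr = tpr_tnr(y_true, y_pred)
    return sqrt(tpr * tnr)


# Toy data, invented for illustration: 2 minority samples out of 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# Assumed estimates of P(minority | x); in RFQ these would come from a forest.
probs = [0.45, 0.30, 0.25, 0.10, 0.05, 0.15, 0.10, 0.05, 0.12, 0.08]

# Threshold set to the observed minority prevalence (an estimate of q*).
q_star = sum(y_true) / len(y_true)

pred_q = threshold_classify(probs, q_star)    # prevalence threshold
pred_half = threshold_classify(probs, 0.5)    # conventional 0.5 threshold

print("q* threshold: TPR, TNR =", tpr_tnr(y_true, pred_q),
      "G-mean =", g_mean(y_true, pred_q))
print("0.5 threshold: TPR, TNR =", tpr_tnr(y_true, pred_half),
      "G-mean =", g_mean(y_true, pred_half))
```

On this toy data the 0.5 threshold assigns every sample to the majority class, so its true positive rate and $G$-mean are both zero, whereas the prevalence threshold recovers both minority samples at the cost of one false positive.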