"Most classification algorithms will only perform optimally when the number of samples of each class is roughly the same. Highly skewed datasets, where the minority is heavily outnumbered by one or more classes, have proven to be a challenge while at the same time becoming more and more common."
Does DAAL have processing algorithms to support the classification of imbalanced data sets? I didn't find anything in the user guide. Thanks!
There are several techniques for dealing with imbalanced data sets, including
The choice of method depends on the characteristics of the imbalanced data set.
Intel DAAL provides several boosting algorithms, including AdaBoost and BrownBoost for binary classification and LogitBoost for multiclass classification. Please have a look at those algorithms with a view to improving classification accuracy on your imbalanced data sets. Let us know if those approaches do not work in your application, or if you have other techniques in mind that would be useful to you but are still missing from the library.
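To illustrate why boosting may help here, below is a minimal plain-Python sketch of the AdaBoost reweighting idea with decision stumps. It is not DAAL's API: the function names (`train_stump`, `adaboost_fit`, `adaboost_predict`) and the toy data are assumptions made up for this example. The point it tries to show is that samples misclassified in one round, typically the minority class at first, receive more weight in the next round.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                # The stump predicts `sign` when the feature is >= thr.
                pred = [sign if row[j] >= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] >= thr else -sign

def adaboost_fit(X, y, rounds):
    """Labels must be +1 / -1. Returns a list of (alpha, stump) pairs."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so the logarithm below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1

# Toy imbalanced data (hypothetical): 2 minority samples (+1) among 12.
X = [[float(v)] for v in range(12)]
y = [1 if v in (5, 6) else -1 for v in range(12)]

one_round = adaboost_fit(X, y, rounds=1)
three_rounds = adaboost_fit(X, y, rounds=3)
print([adaboost_predict(one_round, row) for row in X])
print([adaboost_predict(three_rounds, row) for row in X])
```

On this toy set the first stump simply predicts the majority class everywhere. After the weights shift toward the two missed minority samples, later stumps are chosen to cover them. Real data will not behave this cleanly, so treat this only as an intuition for the library's boosting algorithms.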