The Influence of Class Imbalance on Cost-Sensitive Learning: An Empirical Study
X. Liu and Z. Zhou. Data Mining, 2006. ICDM '06. Sixth International Conference on, pages 970-974. (December 2006)
DOI: 10.1109/ICDM.2006.158
Abstract
In real-world applications the number of examples in one class may overwhelm the other class, but the primary interest is usually in the minor class. Cost-sensitive learning has been deemed a good solution to these class-imbalanced tasks, yet it is not clear how class imbalance affects cost-sensitive classifiers. This paper presents an empirical study using 38 data sets, which discloses that class imbalance often affects the performance of cost-sensitive classifiers: when the misclassification costs are not seriously unequal, cost-sensitive classifiers generally favor the natural class distribution although it might be imbalanced, while when misclassification costs are seriously unequal, a balanced class distribution is more favorable.
Description
IEEE Xplore Full-Text HTML : The Influence of Class Imbalance on Cost-Sensitive Learning: An Empirical Study
%0 Conference Paper
%1 4053137
%A Liu, Xu-Ying
%A Zhou, Zhi-Hua
%B Data Mining, 2006. ICDM '06. Sixth International Conference on
%D 2006
%K auc class cost evaluation imbalance sensitive
%P 970-974
%R 10.1109/ICDM.2006.158
%T The Influence of Class Imbalance on Cost-Sensitive Learning: An Empirical Study
%U http://ieeexplore.ieee.org/xpls/icp.jsp?arnumber=4053137
%X In real-world applications the number of examples in one class may overwhelm the other class, but the primary interest is usually in the minor class. Cost-sensitive learning has been deemed a good solution to these class-imbalanced tasks, yet it is not clear how class imbalance affects cost-sensitive classifiers. This paper presents an empirical study using 38 data sets, which discloses that class imbalance often affects the performance of cost-sensitive classifiers: when the misclassification costs are not seriously unequal, cost-sensitive classifiers generally favor the natural class distribution although it might be imbalanced, while when misclassification costs are seriously unequal, a balanced class distribution is more favorable.
@inproceedings{4053137,
abstract = {In real-world applications the number of examples in one class may overwhelm the other class, but the primary interest is usually in the minor class. Cost-sensitive learning has been deemed a good solution to these class-imbalanced tasks, yet it is not clear how class imbalance affects cost-sensitive classifiers. This paper presents an empirical study using 38 data sets, which discloses that class imbalance often affects the performance of cost-sensitive classifiers: when the misclassification costs are not seriously unequal, cost-sensitive classifiers generally favor the natural class distribution although it might be imbalanced, while when misclassification costs are seriously unequal, a balanced class distribution is more favorable.},
added-at = {2014-06-03T16:56:46.000+0200},
author = {Liu, Xu-Ying and Zhou, Zhi-Hua},
biburl = {https://www.bibsonomy.org/bibtex/2f40094c843478ae0be7d72b594f4133f/jil},
booktitle = {Data Mining, 2006. ICDM '06. Sixth International Conference on},
description = {IEEE Xplore Full-Text HTML : The Influence of Class Imbalance on Cost-Sensitive Learning: An Empirical Study},
doi = {10.1109/ICDM.2006.158},
interhash = {a57e979401f966c14947421fe282f9d0},
intrahash = {f40094c843478ae0be7d72b594f4133f},
issn = {1550-4786},
keywords = {auc class cost evaluation imbalance sensitive},
month = dec,
pages = {970--974},
timestamp = {2014-06-03T16:56:46.000+0200},
title = {The Influence of Class Imbalance on Cost-Sensitive Learning: An Empirical Study},
url = {http://ieeexplore.ieee.org/xpls/icp.jsp?arnumber=4053137},
year = 2006
}