Article

Keywords:
classifier; performance evaluation; misclassification costs; cost curves; ROC curves; AUC
Summary:
Performance evaluation is a crucial step in selecting the best classifier or the best set of parameters for a classifier. Receiver Operating Characteristic (ROC) curves and the Area Under the ROC Curve (AUC) are widely used to analyse classifier performance. However, this approach does not take into account that misclassifications of different classes may have more or less serious consequences. At the same time, it is often difficult to specify the consequences or costs of misclassification exactly. This paper is devoted to Relative Cost Curves (RCC), a graphical technique for visualising the performance of binary classifiers over the full range of possible relative misclassification costs. These curves provide helpful information for choosing the best set of classifiers or for estimating misclassification costs when they are not known precisely. The paper also introduces the Area Above the RCC (AAC), a scalar measure of classifier performance when misclassification costs are unequal. Finally, RCC is extended to multicategory problems in which misclassification costs depend only on the true class.
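The idea of scanning the full range of relative misclassification costs can be illustrated with a minimal sketch. It is not the paper's definition of RCC: the parameterisation below (a false negative costs c, a false positive costs 1 - c, with c in [0, 1]) and the function name `cost_over_relative_costs` are assumptions made purely for illustration.

```python
def cost_over_relative_costs(scores, labels, grid_size=11):
    """Return (c, minimal average cost) pairs for c on a grid in [0, 1].

    Assumption (not taken from the paper): a false negative costs c and
    a false positive costs 1 - c. An instance is predicted positive when
    its score is >= the threshold; labels are 1 (positive) or 0 (negative).
    """
    # Candidate thresholds: every observed score, plus +inf (predict all negative).
    thresholds = sorted(set(scores)) + [float("inf")]
    n = len(labels)
    curve = []
    for i in range(grid_size):
        c = i / (grid_size - 1)
        best = None
        for t in thresholds:
            fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
            fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
            cost = (c * fn + (1 - c) * fp) / n
            if best is None or cost < best:
                best = cost
        curve.append((c, best))
    return curve


# Toy data (made up for illustration only).
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
curve = cost_over_relative_costs(scores, labels, grid_size=3)
```

Plotting the second component of each pair against c gives one curve per classifier, so that classifiers can be compared over all relative costs rather than at a single operating point.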
References:
[1] Bradley, A. P.: The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition 30 (1997), 1145-1159. DOI 10.1016/S0031-3203(96)00142-2
[2] Drummond, C., Holte, R. C.: Cost curves: An improved method for visualizing classifier performance. Machine Learning 65 (2006), 95-130. DOI 10.1007/s10994-006-8199-5
[3] Fawcett, T.: An introduction to ROC analysis. Pattern Recognition Lett. 27 (2006), 861-874. DOI 10.1016/j.patrec.2005.10.010
[4] Hand, D. J.: Measuring classifier performance: a coherent alternative to the area under the ROC curve. Machine Learning 77 (2009), 103-123. DOI 10.1007/s10994-009-5119-5
[5] Hand, D. J., Till, R. J.: A simple generalisation of the area under the ROC curve for multiple class classification problems. Machine Learning 45 (2001), 171-186. DOI 10.1023/A:1010920819831 | Zbl 1007.68180
[6] Hanley, J. A.: Receiver operating characteristic (ROC) methodology: the state of the art. Critical Reviews in Diagnostic Imaging 29 (1989), 307-335.
[7] Hernández-Orallo, J., Flach, P., Ferri, C.: Brier curves: a new cost-based visualisation of classifier performance. In: Proc. 28th International Conference on Machine Learning (ICML-11) (L. Getoor and T. Scheffer, eds.), ACM, New York 2011, pp. 585-592.
[8] Klawonn, F., Höppner, F., May, S.: An alternative to ROC and AUC analysis of classifiers. In: Advances in Intelligent Data Analysis X (J. Gama, E. Bradley, and J. Hollmén, eds.), Springer, Berlin 2011, pp. 210-221.
[9] Krzanowski, W. J., Hand, D. J.: ROC Curves for Continuous Data. Chapman and Hall, London 2009. MR 2522628 | Zbl 1288.62005
[10] Li, J., Fine, J. P.: ROC analysis with multiple classes and multiple tests: methodology and its application in microarray studies. Biostatistics 9 (2008), 566-576. DOI 10.1093/biostatistics/kxm050 | Zbl 1143.62083
[11] Murphy, P. M., Aha, D. W.: UCI repository of machine learning databases. 1992. Available: http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)
[12] Mossman, D.: Three-way ROCs. Medical Decision Making 19 (1999), 78-89. DOI 10.1177/0272989X9901900110