Confidence Interval of AUC Measure Based on K-Fold Cross-Validated Beta Distribution

WANG Yu, ZHAO Xiaoyan, YANG Xingli, LI Jihong

Journal of Systems Science and Mathematical Sciences, 2020, Vol. 40, Issue 9: 1564-1577. DOI: 10.12341/jssms13965

Abstract

In statistical machine learning research, the AUC (Area Under the ROC Curve) measure based on K-fold cross-validation is often used to evaluate the performance of classification algorithms. A point estimate, however, carries no information about variance. For this reason, general-purpose symmetric confidence intervals for the AUC measure have been proposed, constructed from the K-fold cross-validated t distribution under a normality assumption. These symmetric intervals often exhibit a low degree of confidence or a long interval length, which can easily lead to liberal statistical inference. A theoretical analysis of the AUC measure shows that its true distribution is in fact asymmetric, so approximating it with a symmetric distribution is inappropriate. Therefore, for the two-class classification problem, this paper proposes a new asymmetric confidence interval for the AUC measure based on the K-fold cross-validated Beta distribution. Experiments on simulated and real data show that the proposed interval outperforms the traditional symmetric confidence interval based on the K-fold cross-validated t distribution.
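
The contrast between a symmetric t-based interval and an asymmetric Beta-based interval can be illustrated with a short sketch. This page does not give the paper's actual construction, so the code below is only an assumption-laden illustration: it fits a Beta distribution by matching the mean and standard error of the fold-level AUC values, and reads the interval off Monte Carlo quantiles. The function names, the fold AUC values, and the moment-matching step are all hypothetical and should not be read as the authors' method.

```python
import math
import random
import statistics


def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive example outscores a negative one
    (Mann-Whitney form); ties count as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def t_interval(fold_aucs, t_crit):
    """Symmetric interval: mean +/- t_crit * s / sqrt(K).
    t_crit must be supplied by the caller for K - 1 degrees of freedom."""
    k = len(fold_aucs)
    mean = statistics.mean(fold_aucs)
    se = statistics.stdev(fold_aucs) / math.sqrt(k)
    return mean - t_crit * se, mean + t_crit * se


def beta_interval(fold_aucs, level=0.95, draws=20000, seed=0):
    """Asymmetric interval from a moment-matched Beta distribution.
    ASSUMPTION: matching the mean and squared standard error is an
    illustrative choice, not the construction used in the paper."""
    k = len(fold_aucs)
    mean = statistics.mean(fold_aucs)
    var = statistics.variance(fold_aucs) / k  # squared standard error
    common = mean * (1.0 - mean) / var - 1.0  # needs var < mean * (1 - mean)
    if common <= 0:
        raise ValueError("variance too large for a Beta fit")
    a, b = mean * common, (1.0 - mean) * common
    rng = random.Random(seed)
    sample = sorted(rng.betavariate(a, b) for _ in range(draws))
    lower = sample[int((1.0 - level) / 2.0 * draws)]
    upper = sample[int((1.0 + level) / 2.0 * draws) - 1]
    return lower, upper


# Hypothetical AUC values from K = 10 folds (made up for illustration).
FOLD_AUCS = [0.99, 0.97, 1.00, 0.95, 0.98, 1.00, 0.93, 0.99, 0.96, 1.00]
T_CRIT_DF9 = 2.262  # two-sided 95% critical value of t with 9 degrees of freedom

print("mean AUC      :", statistics.mean(FOLD_AUCS))
print("t interval    :", t_interval(FOLD_AUCS, T_CRIT_DF9))
print("Beta interval :", beta_interval(FOLD_AUCS))
```

When the mean AUC is close to 1, the fitted Beta distribution is left-skewed, so the lower half of the interval is longer than the upper half and both endpoints stay inside (0, 1). The t interval, by contrast, is always centered on the mean and is not constrained to the unit interval.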

Keywords

AUC measure / confidence interval / Beta distribution / K-fold cross-validation.

Cite this article

WANG Yu, ZHAO Xiaoyan, YANG Xingli, LI Jihong. Confidence Interval of AUC Measure Based on K-Fold Cross-Validated Beta Distribution. Journal of Systems Science and Mathematical Sciences, 2020, 40(9): 1564-1577. https://doi.org/10.12341/jssms13965