Confidence Interval of AUC Measure Based on $K$-Fold Cross-Validated Beta Distribution

WANG Yu; ZHAO Xiaoyan; YANG Xingli; LI Jihong

doi:10.12341/jssms13965


Journal of Systems Science and Mathematical Sciences, 2020, Vol. 40, Issue 9: 1564-1577. DOI: 10.12341/jssms13965


Abstract

In statistical machine learning research, the AUC (Area Under the ROC Curve) measure based on $K$-fold cross-validation is often used to evaluate the performance of classification algorithms. A point estimate, however, carries no information about variance. For this reason, a general-purpose symmetric confidence interval (interval estimate) for the AUC measure has been proposed, constructed from the $K$-fold cross-validated $t$ distribution under a normality assumption. These symmetric confidence intervals often exhibit a low degree of confidence or a long interval length, which can easily lead to liberal statistical inference. A theoretical analysis of the AUC measure shows that its true distribution is in fact asymmetric, so approximating it with a symmetric distribution is inappropriate. Therefore, for the two-class classification problem, this paper proposes a new asymmetric confidence interval for the AUC measure based on the $K$-fold cross-validated Beta distribution. Experiments on simulated and real data show that the proposed confidence interval outperforms the traditional symmetric confidence interval based on the $K$-fold cross-validated $t$ distribution.
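To make the contrast in the abstract concrete, the sketch below computes a symmetric $t$-based interval and an asymmetric Beta-based interval from the same per-fold AUC values. It is only an illustration under stated assumptions and not the construction used in the paper: the fold AUC values are hypothetical, the variance of the mean is taken naively as $s^2/K$ (ignoring correlation between folds), and the Beta parameters are obtained by simple moment matching. The helper names `beta_pdf`, `beta_cdf`, `beta_ppf`, and `both_intervals` are invented for this example, and the $t$ critical value is hard-coded for $K = 10$ at the 95% level.

```python
import math
import statistics

# 97.5% quantile of Student's t with 9 degrees of freedom (K = 10 folds),
# from standard t tables. Assumption: 95% two-sided level and K = 10.
T_975_DF9 = 2.262


def beta_pdf(x, a, b):
    """Beta(a, b) density at x in (0, 1), evaluated in log space."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def beta_cdf(x, a, b, steps=2000):
    """Beta CDF via composite Simpson integration; assumes a > 1 and b > 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / steps
    total = beta_pdf(x, a, b)  # the density at 0 is 0 when a > 1
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * beta_pdf(i * h, a, b)
    return total * h / 3


def beta_ppf(p, a, b):
    """Beta quantile by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def both_intervals(fold_aucs, t_crit=T_975_DF9, alpha=0.05):
    """Return (mean, t_interval, beta_interval) for per-fold AUC values."""
    k = len(fold_aucs)
    m = statistics.mean(fold_aucs)
    var_mean = statistics.variance(fold_aucs) / k  # naive variance of the mean
    half = t_crit * math.sqrt(var_mean)
    # Moment matching (illustrative assumption): Beta(a, b) with mean m
    # and variance var_mean.
    scale = m * (1 - m) / var_mean - 1
    a, b = m * scale, (1 - m) * scale
    beta_ci = (beta_ppf(alpha / 2, a, b), beta_ppf(1 - alpha / 2, a, b))
    return m, (m - half, m + half), beta_ci


# Hypothetical per-fold AUC values, for illustration only.
aucs = [0.91, 0.95, 0.88, 0.97, 0.93, 0.90, 0.96, 0.94, 0.89, 0.92]
mean_auc, t_ci, beta_ci = both_intervals(aucs)
print(f"mean AUC      = {mean_auc:.4f}")
print(f"t interval    = ({t_ci[0]:.4f}, {t_ci[1]:.4f})")
print(f"Beta interval = ({beta_ci[0]:.4f}, {beta_ci[1]:.4f})")
```

Because the Beta distribution is supported on $[0, 1]$, an interval built from its quantiles cannot extend past 1, whereas a $t$-based interval centered on a high mean AUC can. When the mean AUC is above 0.5, the moment-matched Beta is left-skewed, so in this sketch the lower limit sits farther from the mean than the upper limit.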

Keywords

AUC measure / confidence interval / Beta distribution / $K$-fold cross-validation.

Cite this article

WANG Yu, ZHAO Xiaoyan, YANG Xingli, LI Jihong. Confidence Interval of AUC Measure Based on $K$-Fold Cross-Validated Beta Distribution. Journal of Systems Science and Mathematical Sciences, 2020, 40(9): 1564-1577. https://doi.org/10.12341/jssms13965


Publication date: 2020-09-25
Release date: 2020-11-16
