In statistics and machine learning, cross-validation is widely used to evaluate model quality. However, the results of cross-validation are generally unstable, so in practice the procedure is usually repeated several times and the results are averaged to improve the stability of the estimate. This paper proposes an improved K-fold cross-validation method based on a space-filling criterion. The idea is that the training and test sets produced by each split should have good uniformity. Simulation results show that, for all five classification models considered (K-nearest neighbor, decision tree, random forest, support vector machine, and Adaboost), the proposed method yields higher estimates of prediction accuracy than ordinary K-fold cross-validation. We apply the proposed method to the analysis of real osteoporosis data and, based on the estimated prediction accuracy, select the best model for the classification and prediction of osteoporosis patients.
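For orientation, the baseline that the abstract compares against (ordinary K-fold cross-validation, repeated and averaged to stabilize the estimate) can be sketched as follows. This is a minimal illustration under stated assumptions, not the proposed method: the abstract does not specify the space-filling criterion, so the splits here are purely random. The function names, the toy two-cluster data, and the use of a 1-nearest-neighbor classifier are all hypothetical choices made for the example.

```python
import random
from statistics import mean


def k_fold_splits(n, k, rng):
    """Randomly partition indices 0..n-1 into k folds of near-equal size."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nearest_neighbor_predict(train_X, train_y, x):
    """Predict the label of x with 1-nearest-neighbor (squared Euclidean distance)."""
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[best]


def cv_accuracy(X, y, k, rng):
    """One round of K-fold CV: mean accuracy over the k held-out folds."""
    scores = []
    for test_idx in k_fold_splits(len(X), k, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            nearest_neighbor_predict(train_X, train_y, X[i]) == y[i]
            for i in test_idx
        )
        scores.append(correct / len(test_idx))
    return mean(scores)


def repeated_cv_accuracy(X, y, k=5, repeats=10, seed=0):
    """Average the K-fold accuracy estimate over several random re-splits."""
    rng = random.Random(seed)
    return mean(cv_accuracy(X, y, k, rng) for _ in range(repeats))


# Toy data (an assumption for illustration): two Gaussian clusters in the plane.
data_rng = random.Random(1)
X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(30)]
X += [(data_rng.gauss(5, 1), data_rng.gauss(5, 1)) for _ in range(30)]
y = [0] * 30 + [1] * 30

estimate = repeated_cv_accuracy(X, y, k=5, repeats=10, seed=0)
print(f"repeated 5-fold accuracy estimate: {estimate:.3f}")
```

A space-filling variant would presumably replace the random shuffle in `k_fold_splits` with a partition chosen so that each fold covers the feature space evenly; how that partition is constructed is not stated in the abstract and is left open here.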