Optimization of K-Means Based on Variance Statistics and Improved Swarm Intelligent Algorithm

WANG Sisi, ZHANG Jinglei, CHEN Ci, ZHANG Hongbin, MA Chunjie

Journal of Systems Science and Mathematical Sciences, 2018, Vol. 38, Issue (10): 1117-1127.

DOI: 10.12341/jssms13460
Article

Optimization of K-Means Based on Variance Statistics and Improved Swarm Intelligent Algorithm

WANG Sisi1, ZHANG Jinglei1, CHEN Ci1, ZHANG Hongbin1, MA Chunjie2

Abstract

When K-means is used for data clustering, different processing methods yield different statistical distances and cluster centers, which in turn affects the clustering results; the effect becomes more pronounced as the data dimension increases. To address this, the paper proposes a multivariate statistical distance algorithm based on sample variance, and introduces an improved artificial bee colony algorithm together with an evaluation criterion function to determine the cluster centers and the optimal number of clusters, thereby optimizing the K-means algorithm. In theory, the method can overcome two shortcomings of the original algorithm: its tendency to fall into local optima and its fixed number of clusters. Finally, the performance of the optimized algorithm is verified through outlier detection, artificial datasets, and real UCI datasets.
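The abstract does not state the paper's distance formula, so the following is only a minimal sketch of the general idea of a variance-based distance in the K-means assignment step. It assumes the common standardized Euclidean distance, in which each squared feature difference is divided by that feature's sample variance; the function names, the `eps` guard, and the toy data are illustrative assumptions, not the authors' method.

```python
import numpy as np


def variance_scaled_distances(X, centers, eps=1e-12):
    """Standardized Euclidean distance from every sample to every center.

    Assumption: each feature is weighted by the inverse of its sample
    variance. The paper's actual multivariate distance may differ.
    """
    var = X.var(axis=0, ddof=1) + eps            # per-feature sample variance
    diff = X[:, None, :] - centers[None, :, :]   # shape (n_samples, k, n_features)
    return np.sqrt((diff ** 2 / var).sum(axis=2))


def assign_clusters(X, centers):
    """Assign each sample to its nearest center under the scaled distance."""
    return variance_scaled_distances(X, centers).argmin(axis=1)


# Toy data: the second feature has a much larger spread than the first.
X = np.array([[0.0, 0.0], [0.1, 100.0], [5.0, 10.0], [5.1, 90.0]])
centers = np.array([[0.0, 50.0], [5.0, 50.0]])
labels = assign_clusters(X, centers)
```

With plain Euclidean distance the large-spread second feature would dominate the comparison; dividing by the sample variance puts both features on a comparable scale, so the first feature decides the assignment here.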

Keywords

Sample variance / artificial bee colony algorithm / arithmetic crossover / optimal number of clusters.

Cite this article

WANG Sisi, ZHANG Jinglei, CHEN Ci, ZHANG Hongbin, MA Chunjie. Optimization of K-Means Based on Variance Statistics and Improved Swarm Intelligent Algorithm. Journal of Systems Science and Mathematical Sciences, 2018, 38(10): 1117-1127. https://doi.org/10.12341/jssms13460
