Divide and Conquer Algorithms for Model Averaging with Massive Data

FANG Fang, YIN Xiangju, ZHANG Qiang

Journal of Systems Science and Mathematical Sciences (系统科学与数学), 2018, Vol. 38, Issue 7: 764-776. DOI: 10.12341/jssms13416


Abstract

With the rapid development of data collection techniques in recent years, traditional statistical methods face the challenge of "massive data". Divide and conquer is one of the most effective ways to deal with massive data: the whole data set is divided into several smaller subsets, a statistical model is fitted on each subset separately, and the results from all subsets are combined to obtain the final result. Model averaging is a frontier method in statistics and econometrics, with wide applications in areas such as economics, finance, biology and medicine. In this paper, we propose divide and conquer algorithms under massive data for Mallows model averaging (MMA) and jackknife model averaging (JMA) in linear models, and for model averaging in generalized linear models. Simulations and a real data analysis illustrate the effectiveness and practicality of the proposed algorithms.
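The split, fit and combine idea described in the abstract can be illustrated with a toy sketch. The abstract does not give the details of the proposed algorithms, so everything below is an assumption made for illustration only: two nested candidate models (intercept only, and intercept plus one slope), the closed-form Mallows weight for that two-model case, interleaved blocks, and equal-weight averaging of the block results. The function names `fit_block`, `combine` and `divide_and_conquer` are hypothetical, and this is not the authors' algorithm.

```python
import random


def fit_block(xs, ys):
    """Fit both candidate models on one block and compute the Mallows weight.

    Assumes the block has more than 2 points and that xs is not constant.
    """
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    syy = sum((y - ybar) ** 2 for y in ys)

    slope = sxy / sxx
    intercept = ybar - slope * xbar
    rss_small = syy                  # residual sum of squares, intercept-only model
    rss_large = syy - slope * sxy    # residual sum of squares, intercept + slope model
    sigma2 = rss_large / (n - 2)     # error variance estimated from the larger model
    gain = rss_small - rss_large

    # With weight w on the larger model, the Mallows criterion is
    # C(w) = ||y - fit(w)||^2 + 2 * sigma2 * (1 + w); its minimizer on [0, 1]
    # is 1 - sigma2 / gain, clipped to the interval.
    w = 0.0 if gain <= 0 else min(1.0, max(0.0, 1.0 - sigma2 / gain))
    return {"w": w, "mean": ybar, "intercept": intercept, "slope": slope}


def combine(fits):
    """Average the block-level model-averaged lines (equal weights: an assumption)."""
    k = len(fits)
    a = sum((1 - f["w"]) * f["mean"] + f["w"] * f["intercept"] for f in fits) / k
    b = sum(f["w"] * f["slope"] for f in fits) / k
    return a, b


def divide_and_conquer(xs, ys, n_blocks):
    """Split the data into interleaved blocks, fit each block, combine the results."""
    fits = [fit_block(xs[k::n_blocks], ys[k::n_blocks]) for k in range(n_blocks)]
    return combine(fits)


# Synthetic data for illustration only: y = 1 + 2x + noise.
random.seed(0)
xs = [random.gauss(0, 1) for _ in range(10000)]
ys = [1 + 2 * x + random.gauss(0, 1) for x in xs]
a, b = divide_and_conquer(xs, ys, n_blocks=10)
print("combined intercept:", a, "combined slope:", b)
```

Each block is processed independently, so in a real massive-data setting the calls to `fit_block` could run on separate machines and only the small per-block summaries would need to be collected for the combination step.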

Cite this article

FANG Fang, YIN Xiangju, ZHANG Qiang. Divide and Conquer Algorithms for Model Averaging with Massive Data. Journal of Systems Science and Mathematical Sciences, 2018, 38(7): 764-776. https://doi.org/10.12341/jssms13416
