
Parameter Estimation Algorithm for Generalized Linear Models with Big Data
With big data, the full sample is so large that computing the maximum likelihood estimator of the unknown parameters becomes very difficult. This paper studies an efficient method for computing the maximum likelihood estimator of the parameters of a generalized linear model. First, the asymptotic normality of the estimator obtained under a random sampling algorithm is proved; from this result, a criterion for choosing the sampling probabilities and a two-step random sampling algorithm are proposed. Simulation studies show that, in the vast majority of cases, the estimator obtained with the proposed method has a smaller mean squared error than the one obtained with simple random sampling.
big data / generalized linear model / two-step random sampling algorithm / asymptotic normality / sampling probability
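
To make the idea of a two-step random sampling algorithm concrete, the following is a minimal sketch for logistic regression, one member of the generalized linear model family. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not give the selection criterion, so the second-step probabilities here are taken proportional to |y_i - p_i| times the norm of x_i, a choice common in the subsampling literature. The function names `weighted_logistic_mle` and `two_step_subsample` and the subsample sizes `r0` and `r` are hypothetical.

```python
import numpy as np


def weighted_logistic_mle(X, y, w, n_iter=50, tol=1e-10):
    """Maximize the weighted logistic log-likelihood by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (w * (y - p))                      # weighted score
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X  # weighted information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def two_step_subsample(X, y, r0, r, rng):
    """Two-step subsampling estimate of the logistic regression MLE.

    Step 1 draws a uniform pilot subsample of size r0 and fits a pilot
    estimate. Step 2 uses the pilot estimate to build non-uniform
    sampling probabilities (ASSUMPTION: proportional to
    |y_i - p_i| * ||x_i||, not necessarily the paper's criterion),
    draws r more points, and fits an inverse-probability-weighted MLE
    on the combined subsample.
    """
    n = X.shape[0]

    # Step 1: uniform pilot subsample, drawn with replacement.
    idx0 = rng.integers(0, n, size=r0)
    beta0 = weighted_logistic_mle(X[idx0], y[idx0], np.ones(r0))

    # Step 2: sampling probabilities computed from the pilot estimate.
    p = 1.0 / (1.0 + np.exp(-(X @ beta0)))
    score = np.abs(y - p) * np.linalg.norm(X, axis=1)
    pi = score / score.sum()
    idx1 = rng.choice(n, size=r, replace=True, p=pi)

    # Combine both subsamples; each point is weighted by the inverse of
    # the probability with which it was drawn (1/n for the pilot draws).
    idx = np.concatenate([idx0, idx1])
    w = np.concatenate([np.full(r0, float(n)), 1.0 / pi[idx1]])
    w = w / w.mean()  # rescaling the weights does not change the maximizer
    return weighted_logistic_mle(X[idx], y[idx], w)
```

A call such as `two_step_subsample(X, y, r0=500, r=1500, rng=np.random.default_rng(0))` touches only `r0 + r` rows in the Newton iterations, although computing the second-step probabilities still requires one pass over the full data.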