
Quantile Regression Boosting Tree and Its Application

CAI Chao, HUANG Congcong, DONG Haotian

  1. School of Statistics, Shandong Technology and Business University, Yantai 264005
  • Received: 2021-09-10 Revised: 2021-12-27 Online: 2022-05-25 Published: 2022-07-23
  • Corresponding author: CAI Chao, Email: caichao622@126.com
  • Funding:
    Supported by the National Social Science Fund of China (20BTJ052), the Shandong Provincial Social Science Planning Research Project (20CTJJ01), and the National Statistical Science Research General Project (2019LY101).

CAI Chao, HUANG Congcong, DONG Haotian. Quantile Regression Boosting Tree and Its Application[J]. Journal of Systems Science and Mathematical Sciences, 2022, 42(5): 1216-1233.
To address the low predictive performance of the quantile regression tree model and the high computational cost of the quantile regression gradient boosting tree model, this paper proposes the quantile regression boosting tree (QRBT) model, which combines quantile regression with the boosting tree model, and gives its specific algorithm. The model has a simpler optimization process, which reduces computational cost, and it sums multiple quantile regression trees, which improves predictive performance. Numerical simulations and an empirical application show that QRBT achieves higher estimation and prediction accuracy than linear quantile regression, the quantile regression tree, and the quantile regression gradient boosting tree. QRBT also runs significantly faster than the quantile regression gradient boosting tree.
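
The abstract does not spell out the objective being optimized, so the following is only a minimal sketch of the standard check (pinball) loss that defines quantile regression in general; it is not the authors' QRBT algorithm. The function names `pinball_loss` and `best_constant` and the toy data are illustrative assumptions. The sketch shows that the constant minimizing this loss is the sample quantile at level `tau`, which is why a tree leaf fitted under this loss predicts a quantile rather than a mean.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average check (pinball) loss at quantile level tau in (0, 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        u = y - q  # residual
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * u if u >= 0 else (tau - 1.0) * u
    return total / len(y_true)


def best_constant(y, tau):
    """Return the observed value that minimizes the pinball loss
    when used as a single constant prediction (hypothetical helper)."""
    candidates = sorted(set(y))
    return min(candidates, key=lambda c: pinball_loss(y, [c] * len(y), tau))


# Toy data with one large outlier (illustrative only).
y = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(y, 0.5)  # minimizer at tau = 0.5
upper_fit = best_constant(y, 0.9)   # minimizer at tau = 0.9
```

On this toy sample the minimizer at `tau = 0.5` is the median, which is unaffected by the outlier, while the minimizer at `tau = 0.9` moves to the upper tail.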
