
Credit Risk Prediction of the Listed Companies Based on SMOTETomek-RFE-MLP Algorithm

LU Zhe1,2, ZHANG Jian1,3

  1. School of Economics and Management, Beijing Information Science and Technology University, Beijing 100192;
  2. Intelligent Decision and Big Data Application Beijing International Science and Technology Cooperation Base, Beijing 100192;
  3. Beijing Key Lab of Green Development Decision Based on Big Data, Beijing 100192
  • Received: 2022-05-10 Revised: 2022-07-01 Published: 2022-11-04
  • Corresponding author: ZHANG Jian, Email: zhangjian@bistu.edu.cn
  • Funding:
    Supported by the National Key Research and Development Program of China (2019YFB1405303) and the Key Program of the National Natural Science Foundation of China (71932002).

LU Zhe, ZHANG Jian. Credit Risk Prediction of the Listed Companies Based on SMOTETomek-RFE-MLP Algorithm[J]. Journal of Systems Science and Mathematical Sciences, 2022, 42(10): 2712-2726.
Accurately assessing the credit risk status of listed companies is of great significance to regulators, banks, and other financial institutions. This paper integrates financial and non-financial indicators into a credit risk prediction indicator set and proposes a combined algorithm, SMOTETomek-RFE-MLP, for predicting the credit risk of listed companies. The hybrid sampling algorithm SMOTETomek addresses class imbalance by oversampling the minority class and undersampling the majority class. The recursive feature elimination (RFE) algorithm adds features to the model one by one and selects the optimal feature subset using classification accuracy as the criterion. A multi-layer perceptron (MLP) serves as the binary classifier that predicts the credit risk of listed companies. To verify the effectiveness of the algorithm, base-model comparison experiments and ablation experiments are conducted on 3797 A-share listed companies in 2019. The results show that the SMOTETomek-RFE-MLP algorithm outperforms baseline models such as AdaBoost in overall performance and resolves the classification disorder and feature selection problems caused by data imbalance, which offers some guidance for financial institutions evaluating the default risk of listed companies.
