Research on P2P Credit Risk Prediction Based on a Two-Step Subsampling Algorithm
With the arrival of the big-data era, the volume of data on P2P online lending keeps growing, which makes P2P lending credit risk harder to predict than credit risk in traditional financial lending and has left a large number of P2P institutions facing closure. This paper uses 2017 to 2018 data from the US Lending Club website, draws samples with a two-step subsampling method, and fits a logistic regression model to predict P2P online lending credit risk. The results show that P2P online lending credit risk is related to several factors, including the borrower's annual income, FICO score, and loan amount, and that the logistic regression model built on two-step subsampling predicts this risk better than the logistic regression model built on simple random sampling.
P2P online lending / credit risk / big data / two-step subsampling method / logistic regression
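The abstract names the two-step subsampling method without spelling out its steps. The sketch below is an illustration only, assuming the common pilot-then-reweight scheme for logistic regression: a uniform pilot subsample gives a rough estimate, which then defines non-uniform sampling probabilities proportional to |y - p| times the norm of x, and the final model is fit with inverse-probability weights. The function names, the probability formula, and the sample sizes `r0` and `r` are assumptions for this example and are not taken from the paper.

```python
import numpy as np


def fit_logistic(X, y, w=None, n_iter=50, tol=1e-8):
    """Weighted logistic regression fitted with Newton's method."""
    n, d = X.shape
    w = np.ones(n) if w is None else w
    beta = np.zeros(d)
    for _ in range(n_iter):
        # Clip the linear predictor to avoid overflow in exp.
        eta = np.clip(X @ beta, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-eta))
        grad = X.T @ (w * (y - p))
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X
        # Tiny ridge term keeps the linear solve numerically stable.
        step = np.linalg.solve(hess + 1e-10 * np.eye(d), grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def two_step_subsample_fit(X, y, r0, r, rng):
    """Hypothetical two-step subsampling estimator (illustrative only).

    X is an (n, d) design matrix that already contains an intercept
    column, y is a 0/1 default indicator, r0 is the pilot sample size,
    and r is the second-step sample size.
    """
    n = X.shape[0]

    # Step 1: uniform pilot subsample and pilot estimate.
    idx0 = rng.choice(n, size=r0, replace=True)
    beta0 = fit_logistic(X[idx0], y[idx0])

    # Step 2: sampling probabilities from the pilot fit (assumed form).
    eta = np.clip(X @ beta0, -30.0, 30.0)
    p = 1.0 / (1.0 + np.exp(-eta))
    score = np.abs(y - p) * np.linalg.norm(X, axis=1)
    pi = score / score.sum()
    idx1 = rng.choice(n, size=r, replace=True, p=pi)

    # Combine both subsamples and weight each draw by 1 / probability.
    idx = np.concatenate([idx0, idx1])
    probs = np.concatenate([np.full(r0, 1.0 / n), pi[idx1]])
    w = 1.0 / probs
    w = w / w.mean()  # rescaling the weights does not change the estimate
    return fit_logistic(X[idx], y[idx], w)
```

For comparison with simple random sampling, the same `fit_logistic` helper can be applied to a uniform subsample of size `r0 + r`, so that both estimators use the same number of observations.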