信用卡违约预测模型分析以及影响因素探究
Study on Analysis and Influence Factors of Credit Card Default Prediction Model
DOI: 10.12677/SA.2016.53026
Authors: Mei Ruiting (梅瑞婷), Xu Yang (徐扬), Wang Guochang* (王国长): College of Economics, Jinan University, Guangzhou, Guangdong
Keywords: Credit Card, Credit Risk, Random Forest, Lasso-Logistic Model
Abstract: Credit cards are a banking business in which high returns and high risk coexist. As the credit card business has grown, major banks have begun using Internet and mobile data to build customer credit scoring systems. How to assess a customer's credit from the information the customer fills in, how to verify whether that information is true, and what kinds of information customers should be asked to provide are all crucial questions for banks. Based on 2005 data on credit card customers in Taiwan, this paper builds a Lasso-Logistic model and a random forest model to explore the key factors that affect customer credit, including individual characteristics and certain objective characteristics. The models are compared on prediction accuracy, F score, and other metrics, and the model with the better predictive performance is selected to predict bank credit card defaults. Building a credit card default prediction model and identifying the key factors that influence customer credit offers useful guidance for banks in selecting customers and designing application forms, and provides some theoretical support for credit decisions, so the work has both theoretical and practical significance.
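The abstract compares models on prediction accuracy and F score. As a minimal sketch of how those two metrics are computed for a binary default label, the snippet below uses only the Python standard library. The function name `accuracy_and_f1`, the label coding (1 = default, 0 = no default), and the two prediction vectors are illustrative assumptions, not the paper's code or data.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels.

    `positive` marks the class of interest (here assumed to be "default").
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Made-up labels for illustration only: 1 = default, 0 = no default.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
model_a = [1, 0, 0, 0, 0, 0, 0, 1, 0, 1]  # hypothetical predictions of one model
model_b = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]  # hypothetical predictions of another

for name, pred in [("model_a", model_a), ("model_b", model_b)]:
    acc, f1 = accuracy_and_f1(y_true, pred)
    print(f"{name}: accuracy={acc:.3f}, F1={f1:.3f}")
```

In this toy data both hypothetical models have the same accuracy, yet their F scores differ because one of them finds more of the defaulting customers. That is one reason to report an F score alongside accuracy when defaults are the minority class.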
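Cite this article: Mei Ruiting, Xu Yang, Wang Guochang. Study on Analysis and Influence Factors of Credit Card Default Prediction Model [J]. Statistics and Application, 2016, 5(3): 263-275. http://dx.doi.org/10.12677/SA.2016.53026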
