Personal Credit Assessment Model Based on Stacking Ensemble Learning Algorithm
DOI: 10.12677/SA.2017.64047
Author: Peng Runze, School of Mathematics and Systems Science, Beihang University, Beijing
Keywords: Ensemble Learning, Stacking, Credit Assessment
Abstract: The prediction accuracy of a traditional machine learning algorithm often depends on the specific problem. Ensemble learning combines the predictions of several base classifiers and thereby achieves a significant improvement in classification performance. This paper briefly introduces the basic idea of ensemble learning and discusses the advantages of Stacking over the traditional, classical ensemble algorithms. Based on the Stacking framework, a two-layer classifier model is built on a UCI credit assessment dataset to evaluate personal credit. The empirical results show that the two-layer Stacking ensemble predicts better than either the individual machine learning methods or a simple average of their outputs.
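The comparison described in the abstract (a two-layer Stacking ensemble versus a simple average of the same base classifiers) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not name the base learners, the meta-learner, or the evaluation metric, so the random forest, k-nearest-neighbours, and naive Bayes base classifiers, the logistic regression second layer, and the AUC metric are all hypothetical choices. The data is a synthetic stand-in generated with scikit-learn, not the UCI credit dataset, and `make_base_learners` and `compare_ensembles` are names introduced here for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    RandomForestClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier


def make_base_learners():
    """First-layer classifiers (assumed choices, not taken from the paper)."""
    return [
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("knn", KNeighborsClassifier(n_neighbors=15)),
        ("nb", GaussianNB()),
    ]


def compare_ensembles(random_state=0):
    """Return test AUC for simple averaging and for two-layer stacking."""
    # Synthetic, imbalanced binary data standing in for a credit dataset.
    X, y = make_classification(
        n_samples=1000,
        n_features=20,
        n_informative=8,
        weights=[0.7, 0.3],
        random_state=random_state,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=random_state
    )

    models = {
        # Simple average: mean of the base classifiers' predicted probabilities.
        "simple_average": VotingClassifier(
            estimators=make_base_learners(), voting="soft"
        ),
        # Stacking: out-of-fold base predictions (5-fold CV) become the
        # input features of a second-layer logistic regression.
        "stacking": StackingClassifier(
            estimators=make_base_learners(),
            final_estimator=LogisticRegression(max_iter=1000),
            cv=5,
            stack_method="predict_proba",
        ),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        proba = model.predict_proba(X_test)[:, 1]
        scores[name] = roc_auc_score(y_test, proba)
    return scores


if __name__ == "__main__":
    for name, auc in compare_ensembles().items():
        print(f"{name}: test AUC = {auc:.3f}")
```

Training the second layer on out-of-fold predictions (here via `cv=5`) keeps the meta-learner from seeing base-model outputs on the very samples those models were fitted to. Which ensemble scores higher on this synthetic data carries no weight for the paper's empirical claim; the sketch only shows how the two setups differ.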
Citation: Peng, R. (2017) Personal Credit Assessment Model Based on Stacking Ensemble Learning Algorithm. Statistics and Application, 6(4), 411-417. https://doi.org/10.12677/SA.2017.64047
