Personal credit scoring is an application of financial risk forecasting. It has become an even more important task as financial institutions face serious competition and challenges. In this paper, the techniques used for credit scoring are summarized and classified, and a new method, the ensemble learning model, is introduced. This article also discusses some problems in current research. It points out two open issues: shifting the focus from static credit scoring to dynamic behavioral scoring, and maximizing revenue by decreasing Type I and Type II errors. It also suggests that more complex models cannot always be applied in practice. Therefore, how to apply assessment models widely while improving prediction accuracy is the main task for future research.
In recent years, deep learning credit scoring models have become a hot research topic in Internet finance. However, most existing studies are based on deep neural network models, whose structure is difficult to design. Moreover, previous research seldom considers the impact of class imbalance on credit scoring performance. To fill this gap, we propose a new deep learning credit scoring model based on deep forest (DF) and resampling methods. First, we combine DF with each of five resampling methods, namely random over-sampling (ROS), random under-sampling (RUS), synthetic minority over-sampling technique (SMOTE), Tomek links, and SMOTE+Tomek, to build the corresponding models. We validate that the RUS-DF model has the best credit scoring performance among these models. Then, to further evaluate the advantages of the deep ensemble model RUS-DF, we compare it with four models built by combining RUS with a multilayer perceptron, a convolutional neural network, long short-term memory, and random forests, respectively. All experiments are conducted on four Internet financial credit scoring datasets. The results show that the RUS-DF model obtains better classification performance and stability than the other models and is suitable for solving the credit scoring problem with imbalanced data. INDEX TERMS: Credit scoring, class imbalance, deep forest, resampling method.
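To make the resampling step concrete, here is a minimal sketch of random under-sampling (RUS), the method the abstract reports as working best with deep forest. It is not the authors' implementation: the function name `random_under_sample`, the toy data, and the 90/10 class split are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import random
from collections import Counter, defaultdict


def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows so that every class keeps as many rows as the smallest class."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    # Size of the minority class sets the target count for every class.
    n_min = min(len(indices) for indices in by_class.values())

    # Sample n_min indices per class without replacement.
    keep = []
    for indices in by_class.values():
        keep.extend(rng.sample(indices, n_min))
    keep.sort()  # preserve the original row order

    return [rows[i] for i in keep], [labels[i] for i in keep]


# Hypothetical imbalanced data: 90 non-default (0) and 10 default (1) borrowers.
labels = [0] * 90 + [1] * 10
rows = [[i, i % 7] for i in range(100)]

rows_res, labels_res = random_under_sample(rows, labels)
print(Counter(labels_res))
```

After resampling, both classes have the same number of rows, and the balanced set can be passed to any classifier. The trade-off is that RUS discards majority-class rows, so it suits datasets where the majority class is large enough to lose some samples.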
Summary
Coal measure gas has been a research hotspot in recent years. Yet the complexity of source-reservoir relationships and the ambiguity of the gas/water interface in coal measure reservoirs challenge traditional gas identification methods. With the development of intelligent computing, machine learning has shown good prospects in oil and gas exploration and development. However, on the one hand, the more capable the learning algorithm is, the greater its demand for data; on the other hand, traditional learning methods are difficult to tune and generalize poorly when learning samples are insufficient. To perform intelligent and reliable gas identification in coal measure reservoirs, an ensemble learning-based gas identification method was proposed. The method uses a two-layer structure. The first layer consists of multiple models trained by different learning algorithms, such as k-nearest neighbor (kNN), decision tree (DT), neural network (NN), and support vector machine (SVM). The second layer, implemented by logistic regression (LR), relearns the output of the first layer. We tested and practically applied this method to real data from a coal measure reservoir in Block A of the Ordos Basin, China. The experimental results showed that our method significantly improved the learning ability of the individual learners on the small sample and performed most consistently when the hyperparameters changed. Moreover, random forest (RF) and deep NN (DNN), used as comparison methods in practical applications, were slightly inferior to ours, with greater computational effort and lower robustness and prediction accuracy. This demonstrates the superiority of our method for fast and effective log-based gas identification, and also suggests that stacking has great potential that is not limited to gas identification tasks.
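The two-layer structure described above can be sketched with scikit-learn's `StackingClassifier`. This is an illustrative sketch only, not the authors' code: the synthetic dataset from `make_classification`, all hyperparameter values, and the feature scaling pipelines are assumptions, since the abstract does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a small labeled sample (assumed, not real well-log data).
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# First layer: heterogeneous base learners (kNN, DT, NN, SVM).
# Distance- and gradient-based learners are scaled; the tree does not need it.
base_learners = [
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ("dt", DecisionTreeClassifier(max_depth=4, random_state=0)),
    ("nn", make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    )),
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
]

# Second layer: logistic regression trained on cross-validated
# first-layer outputs (cv=5), which limits leakage of training labels.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_train, y_train)

accuracy = stack.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

Training the second layer on out-of-fold predictions rather than on in-sample predictions is the usual design choice for stacking, because it keeps the meta-learner from simply trusting whichever base learner overfits the training set most.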