2020
DOI: 10.1016/j.knosys.2020.105758

Deep generative models for reject inference in credit scoring

Abstract: Credit scoring models based on accepted applications may be biased and their consequences can have a statistical and economic impact. Reject inference is the process of attempting to infer the creditworthiness status of the rejected applications. In this research, we use deep generative models to develop two new semi-supervised Bayesian models for reject inference in credit scoring, in which we model the data generating process to be dependent on a Gaussian mixture. The goal is to improve the classification ac…

Cited by 37 publications (26 citation statements)
References 48 publications

“…The HELOC Dataset is used by the benchmark paper [25] and therefore used for fair comparison, whilst the LC Dataset is quite popular in the credit scoring literature [28,28,30,31] and is therefore used to further evaluate the implemented model. Figure 2 depicts the different stages undertaken during the preprocessing phase until the data is ready to be used by the classification model.…”
Section: Data Handling and Preprocessing (mentioning)
confidence: 99%
“…Recent papers (see [12], [10], [1], [14], [16], [21], [26]) proposed reject inference techniques for other models than the usual logistic regression on which we focus here. Some of them can be cast into the general framework we introduce in Section 2.4: we elaborate on these in Section 3.8.…”
Section: Literature Review (mentioning)
confidence: 99%
“…Ultimately, these methods bear the same two major flaws the usual (logistic regression-based) ones do: they are heuristics with implicit hypotheses and without theoretical guarantees; they cannot be empirically evaluated either since experiments always rely on biased samples. In [16], a generative model, similar to the one we use in Section 4.1 but with a richer hypothesis space, is used. These types of models, by estimating the joint distribution p(x, y) can be straightforwardly applied to partially-labeled data, but require stronger hypotheses on the data generating mechanism which can lead to worse results than discriminative models, as we show.…”
Section: Literature Review (mentioning)
confidence: 99%
“…To circumvent this issue, one might use more complex models, such as deep artificial neural networks [37], that capture the nonlinearities in the data [38]. More so, because in credit scoring problems, models needs to deal with imbalanced data sets [39, 40] as the number of defaulting loans in a financial institution portfolio is much lower than non-defaulted [41] or issued [42] ones.…”
Section: Related Work (mentioning)
confidence: 99%