2019
DOI: 10.1111/rssa.12534

Inferring the Outcomes of Rejected Loans: An Application of Semisupervised Clustering

Abstract: Rejection inference aims to reduce sample bias and to improve model performance in credit scoring. We propose a semisupervised clustering approach as a new rejection inference technique. K‐prototype clustering can deal with mixed types of numeric and categorical characteristics, which are common in consumer credit data. We identify homogeneous acceptances and rejections and assign labels to part of the rejections according to the label of acceptances. We test the performance of various rejection infere…
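
The labeling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details are not given in the truncated abstract. It assumes a plain K-prototypes-style loop (squared Euclidean distance on numeric features plus a weight `gamma` times the number of categorical mismatches), followed by a hypothetical purity rule: rejected applicants inherit a label only when the accepted applicants in their cluster mostly share it. The function names, the `purity` threshold, the `init` prototypes, and the toy data are all assumptions made for illustration.

```python
from collections import Counter

# A record is (numeric_features, categorical_features), both tuples.

def dissimilarity(a, b, gamma=1.0):
    """Squared Euclidean distance on numerics plus gamma * categorical mismatches."""
    (a_num, a_cat), (b_num, b_cat) = a, b
    numeric = sum((x - y) ** 2 for x, y in zip(a_num, b_num))
    categorical = sum(x != y for x, y in zip(a_cat, b_cat))
    return numeric + gamma * categorical

def prototype(members):
    """Cluster prototype: mean of each numeric column, mode of each categorical column."""
    nums = [m[0] for m in members]
    cats = [m[1] for m in members]
    mean = tuple(sum(col) / len(col) for col in zip(*nums))
    mode = tuple(Counter(col).most_common(1)[0][0] for col in zip(*cats))
    return (mean, mode)

def k_prototypes(points, init, gamma=1.0, iters=10):
    """Alternate assignment and prototype updates, starting from the given prototypes."""
    protos = list(init)
    assign = []
    for _ in range(iters):
        assign = [
            min(range(len(protos)), key=lambda j: dissimilarity(p, protos[j], gamma))
            for p in points
        ]
        updated = []
        for j in range(len(protos)):
            members = [p for p, a in zip(points, assign) if a == j]
            # Keep the old prototype if a cluster ends up empty.
            updated.append(prototype(members) if members else protos[j])
        if updated == protos:
            break
        protos = updated
    return assign

def infer_reject_labels(accepted, labels, rejected, init, gamma=1.0, purity=0.8):
    """Cluster accepted and rejected records together, then label rejects
    only in clusters whose accepted members are sufficiently homogeneous."""
    assign = k_prototypes(accepted + rejected, init, gamma)
    inferred = [None] * len(rejected)
    for j in set(assign):
        acc_labels = [labels[i] for i in range(len(accepted)) if assign[i] == j]
        if not acc_labels:
            continue  # no accepted applicants to borrow a label from
        label, count = Counter(acc_labels).most_common(1)[0]
        if count / len(acc_labels) >= purity:
            for r in range(len(rejected)):
                if assign[len(accepted) + r] == j:
                    inferred[r] = label
    return inferred

# Toy data (hypothetical): one scaled numeric feature, one categorical feature.
accepted = [((0.1,), ("own",)), ((0.2,), ("own",)),
            ((0.9,), ("rent",)), ((1.0,), ("rent",))]
labels = ["good", "good", "bad", "good"]
rejected = [((0.15,), ("own",)), ((0.95,), ("rent",))]

inferred = infer_reject_labels(accepted, labels, rejected,
                               init=[accepted[0], accepted[2]])
print(inferred)  # ['good', None]
```

In this toy run the first rejected applicant falls into a cluster whose accepted members are all "good" and inherits that label, while the second falls into a mixed cluster and stays unlabeled, which mirrors the abstract's point that only part of the rejections receive labels.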

Cited by 7 publications (6 citation statements)
References 52 publications
“…The final training set sizes among different methods in Table 2 also implies the need for fine-grained selection for inferred training set as discussed in other literature (Li et al 2020). Introducing more pseudo-labels into training set does not guarantee better classification performance.…”
Section: Experiments Results (mentioning)
confidence: 99%
“…They combine accepted applicants with their repayment and rejected applicants with estimated performance into inferred datasets and generate reject inference scoring models. Recent ML based works have proposed new models to assign labels to the rejects from the angle of semisupervised learning, such as semi-supervised SVMs, self-learning, and K-prototype clustering (Li et al 2017, 2020; Kozodoi et al 2019). One common practice used in the industry is to obtain external loan performance data from credit bureaus for rejected applicants, though it is relatively costly.…”
Section: Introduction (mentioning)
confidence: 99%
“…We would like to explore such methods in order to accommodate for the low ratio of labeled claims, which is an inherent problem we faced in our research. Such an approach was developed among others for rejected loan applications by Li, Hu, Li, Zhou, and Shen (2020) and could potentially help in finding the rare fraudulent claims in our case. Next, graph‐based anomaly detection methods would be an interesting approach to find the rare fraudulent claims in the network.…”
Section: Discussion (mentioning)
confidence: 99%
“…In contrast, a large literature advocates for various "reject inference" procedures that incorporate information from rejected applications in model construction. Reject inference procedures involve a class of methods for training a risk assessment on imputed outcomes for all applicants using augmentation, reweighing and extrapolation-based approaches [36,37]. There is considerable debate as to the performance improvements of reject inference over simply training a model on the observed population, but reject inference methods remain popular in credit scoring settings [38,39,32,33].…”
Section: Selective Labels and Reject Inference (mentioning)
confidence: 99%