Credit applicants are assigned to good or bad risk classes according to their record of defaulting. Each applicant is described by a high-dimensional input vector of situational characteristics and by an associated class label. A statistical model that maps the inputs to the labels can decide whether a new credit applicant should be accepted or rejected by predicting the class label from the new inputs. Support vector machines (SVMs) from statistical learning theory can build such models from the data while requiring only very weak prior assumptions about the model structure. Furthermore, SVMs divide a set of labelled credit applicants into subsets of 'typical' and 'critical' patterns. The correct class label of a typical pattern is usually very easy to predict, even with linear classification methods, so such patterns carry little information about the classification boundary. The critical patterns (the support vectors) are the less trivial training examples. For instance, linear discriminant analysis trained on a subset preselected via SVM also shows improved generalization. Non-linear SVMs may detect more 'surprising' critical regions, but owing to the relative sparseness of the data, this potential seems to be limited in credit scoring practice.
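The split into 'typical' and 'critical' patterns can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' setup: the two-feature synthetic "applicants", the function names (`make_applicants`, `train_linear_svm`, `split_patterns`), the hyperparameters, and the plain subgradient-descent solver, which stands in for a proper SVM optimizer. A pattern is treated as critical when its functional margin y·f(x) is at most 1, which is the condition under which a training point is a support vector of a soft-margin linear SVM.

```python
import random


def make_applicants(n_per_class=100, seed=0):
    """Synthetic applicants (hypothetical data): label +1 = good, -1 = bad."""
    rng = random.Random(seed)
    data = []
    for label, center in ((1, 1.0), (-1, -1.0)):
        for _ in range(n_per_class):
            x = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            data.append((x, label))
    return data


def decision(w, b, x):
    """Linear decision function f(x) = w . x + b."""
    return w[0] * x[0] + w[1] * x[1] + b


def train_linear_svm(data, lam=0.01, lr=0.1, iters=2000):
    """Soft-margin linear SVM fitted by full-batch subgradient descent on
    lam/2 * ||w||^2 + mean hinge loss (an illustrative stand-in solver)."""
    w, b = [0.0, 0.0], 0.0
    n = len(data)
    for _ in range(iters):
        # Gradient of the regularizer; the bias is left unregularized.
        gw, gb = [lam * w[0], lam * w[1]], 0.0
        for x, y in data:
            # Only margin violators contribute to the hinge subgradient.
            if y * decision(w, b, x) < 1.0:
                gw[0] -= y * x[0] / n
                gw[1] -= y * x[1] / n
                gb -= y / n
        w = [w[0] - lr * gw[0], w[1] - lr * gw[1]]
        b -= lr * gb
    return w, b


def split_patterns(data, w, b):
    """Critical patterns have margin y * f(x) <= 1; the rest are typical."""
    critical, typical = [], []
    for x, y in data:
        target = critical if y * decision(w, b, x) <= 1.0 else typical
        target.append((x, y))
    return critical, typical


data = make_applicants()
w, b = train_linear_svm(data)
critical, typical = split_patterns(data, w, b)
print(f"{len(critical)} critical and {len(typical)} typical patterns")
```

Under these assumptions, every typical pattern lies on the correct side of the boundary with margin greater than 1, which is why even a simple linear rule predicts it easily. The critical subset is the candidate for the prior training subset selection mentioned above.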