Search citation statements
Paper Sections
Citation Types
Year Published
Publication Types
Relationship
Authors
Journals
Logistic regression is a very popular binary classification technique in many industries, particularly in the financial service industry. It has been used to build credit scorecards, estimate the probability of default or churn, identify the next best product in marketing, and many more applications. The machine learning literature has recently introduced several alternative techniques, such as deep learning neural networks, random forests, and factorisation machines. While neural networks and random forests form part of the practitioner’s model-building toolkit, factorisation machines are seldom used. In this paper, we investigate the applicability of factorisation machines to some binary classification problems in banking. To stimulate the practical application of factorisation machines, we implement the fitting routines, based on logit loss and maximum likelihood, on commercially available software that is widely used by banks and other large financial services companies. Logit loss is usually used by the machine learning fraternity while maximum likelihood is popular in statistics. Depending on the coding of the target variable, we will show that these methods yield identical parameter estimates. Often, banks are confronted with predicting events that occur with low probability. To deal with this phenomenon, we introduce weights in the above-mentioned loss functions. The accuracy of our fitting algorithms is then studied by means of a simulation study and compared with logistic regression. The separation and prediction performance of factorisation machines are then compared to logistic regression and random forests by means of three case studies covering a recommender system, credit card fraud, and a credit scoring application. We conclude that logistic factorisation machines are worthy competitors of logistic regression in most applications, but with clear advantages in recommender systems applications where the number of predictors typically outnumbers the number of observations.
Logistic regression is a very popular binary classification technique in many industries, particularly in the financial service industry. It has been used to build credit scorecards, estimate the probability of default or churn, identify the next best product in marketing, and many more applications. The machine learning literature has recently introduced several alternative techniques, such as deep learning neural networks, random forests, and factorisation machines. While neural networks and random forests form part of the practitioner’s model-building toolkit, factorisation machines are seldom used. In this paper, we investigate the applicability of factorisation machines to some binary classification problems in banking. To stimulate the practical application of factorisation machines, we implement the fitting routines, based on logit loss and maximum likelihood, on commercially available software that is widely used by banks and other large financial services companies. Logit loss is usually used by the machine learning fraternity while maximum likelihood is popular in statistics. Depending on the coding of the target variable, we will show that these methods yield identical parameter estimates. Often, banks are confronted with predicting events that occur with low probability. To deal with this phenomenon, we introduce weights in the above-mentioned loss functions. The accuracy of our fitting algorithms is then studied by means of a simulation study and compared with logistic regression. The separation and prediction performance of factorisation machines are then compared to logistic regression and random forests by means of three case studies covering a recommender system, credit card fraud, and a credit scoring application. We conclude that logistic factorisation machines are worthy competitors of logistic regression in most applications, but with clear advantages in recommender systems applications where the number of predictors typically outnumbers the number of observations.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.