This paper presents first steps toward robust models for crisis prediction. We conduct a horse race of conventional statistical methods and more recent machine learning methods as early-warning models. Because individual models in the literature are most often built in isolation from other methods, the exercise is highly relevant for assessing the relative performance of a wide variety of methods. Further, we test various ensemble approaches to aggregating the information products of the built models, providing a more robust basis for measuring country-level vulnerabilities. Finally, we provide approaches to estimating model uncertainty in early-warning exercises, particularly model performance uncertainty and model output uncertainty. The approaches put forward in this paper are demonstrated using Europe as a test case. Generally, our results show that the conventional statistical approaches are outperformed by more advanced machine learning methods, such as k-nearest neighbors and neural networks, and particularly by model aggregation approaches through ensemble learning.

Keywords: financial stability, early-warning models, horse race, ensembles, model uncertainty

JEL codes: E440, F300, G010, G150, C430

We are grateful to Johannes Beutel, Andras Fulop, Benjamin Klaus, Jan-Hannes Lang, Tuomas A. Peltonen, Roberto Savona, Gregor von Schweinitz, Eero Tölö, Peter Welz and Marika Vezzoli for useful comments on previous versions of the paper. The paper has also benefited from comments during presentations at BITA'14 Seminar on Current Topics in Business, IT
Non-technical summary

The repeated occurrence of financial crises at the turn of the 21st century has stimulated theoretical and empirical work on the phenomenon, not least on early-warning models. Yet the history of these models goes far back. Although not always framed as macroprudential analysis, early risk analysis relied on assessing financial ratios by hand rather than with advanced statistical methods on computers. During the 1960s, discriminant analysis (DA) emerged and remained the dominant technique until the 1980s. After the 1980s, DA was largely replaced by logit/probit models. Applications of these models range from early models for currency crises to recent ones on systemic financial crises. In parallel, the simple yet intuitive signal extraction approach, which finds thresholds on individual indicators, has gained popularity.

With technological advances, a surge in data availability and a pressing need for progress in systemic risk identification, a new group of flexible and non-linear machine learning techniques has been introduced to various forms of financial stability surveillance. Recent literature indicates that these novel approaches hold promise for systemic risk identification because of their ability to identify and map complex dependencies. Differences in performance are expected to stem from how methods treat two aspects: individual vs. multiple risk indicators and linear vs. non-linear relationships. While the simple...