The paper presents a forecasting model for association football scores. The model uses a count process based on Weibull inter-arrival times and a copula to produce a bivariate distribution for the number of goals scored by the home and away teams in a match. We test it against a variety of alternatives, including the simpler Poisson distribution-based model and an independent version of our model. The out-of-sample performance of our methodology is illustrated first using calibration curves and then in a Kelly-type betting strategy that is applied to the pre-match win/draw/loss market and to the over-under 2.5 goals market. The new model provides an improved fit to data compared to previous models and results in positive returns to betting.
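As a rough illustration of how a score model feeds the two betting markets mentioned above, the sketch below uses the simpler independent Poisson baseline rather than the Weibull count and copula model itself. The function names, the goal rates and the odds are hypothetical, and the single-bet Kelly formula shown is the textbook one, which may differ from the paper's "Kelly-type" strategy.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k goals under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def market_probs(home_mean, away_mean, max_goals=15):
    """Aggregate an independent-Poisson score grid into market probabilities.

    Assumption: home and away goals are independent; the score grid is
    truncated at max_goals per team, which loses negligible mass for
    typical football scoring rates.
    """
    home = draw = away = over = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_mean) * poisson_pmf(a, away_mean)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
            if h + a > 2:  # three or more goals: "over 2.5"
                over += p
    return {"home": home, "draw": draw, "away": away, "over_2.5": over}


def kelly_fraction(prob, decimal_odds):
    """Textbook Kelly stake (fraction of bankroll) for a single bet.

    Returns 0 when the model sees no positive expected value.
    """
    edge = prob * decimal_odds - 1.0
    return max(0.0, edge / (decimal_odds - 1.0))


# Hypothetical inputs: expected goals of 1.5 (home) and 1.1 (away),
# and bookmaker decimal odds of 2.2 on the home win.
probs = market_probs(1.5, 1.1)
stake = kelly_fraction(probs["home"], 2.2)
```

Replacing `market_probs` with probabilities from a richer bivariate model leaves the staking step unchanged, which is why the two pieces are kept separate here.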
The paper presents a plus-minus rating for use in association football (soccer). We first describe the standard plus-minus methodology as used in basketball and ice hockey and then adapt it for use in soccer. The usual goal-differential plus-minus is considered before two variations are proposed. For the first variation, we present a methodology to calculate an expected goals plus-minus rating. The second variation makes use of in-play probabilities of match outcome to evaluate an expected points plus-minus rating. We use the ratings to examine who are the best players in European football, and demonstrate how the players' ratings evolve over time. Finally, we shed light on the debate regarding which is the strongest league. The model suggests the English Premier League is the strongest, with the German Bundesliga a close runner-up.
... task is to estimate time-varying ratings for individuals which update following new information (the latest results). Elo ratings have been used for over half a century for rating chess players. Similarly, the Glicko rating system (Glickman, 2012) provides a more theoretically justified model for estimating time-varying ratings of individuals.
More recently, attention has moved to using machine learning techniques to estimate player ratings. The TrueSkill rating (Herbrich et al., 2007) developed at Microsoft is a generalisation of the Elo ratings and is used for rating video game players.
Rating players in sports teams is more problematic. Players often have different responsibilities, with some concentrated on offence (i.e. aiding scoring), whilst others are specialised in defence (i.e. helping to prevent scores for the opposition). A commonly used approach is to assign a value to a set of actions considered to be 'of interest' and to reward the player taking them with the associated value. This method was used, for example, in the EA SPORTS Player Performance Indicator (McHale et al., 2012) and is still used by the English Premier League as the official player ratings system. Due to its additivity, this approach provides simple, user-friendly player ratings and rankings. However, the cost of this simplicity is a lack of context and of any deeper understanding of the situations in which actions were performed. Further, the data requirement is not trivial.
Models have been used to rate players for specific tasks. For example, Sáez Castillo et al. (2013) and McHale and Szczepański (2014) present methods to identify the scoring ability of footballers, whereas López Peña and Touchette (2012), López Peña and Sánchez Navarro (2015), Brooks et al. (2016) and Szczepański and McHale (2016) deal with the passing aspect.
But identifying the overall contribution of a player to a team's success (or lack of it) has proven difficult in soccer. However, the concept of the plus-minus (PM) rating provides hope.
The concept of the PM rating is fundamentally different to the rating mechanisms discussed above. It directly measures the contribution a player makes to a team's success as measured by (the differential) of a ta...
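The basic goal-differential plus-minus idea can be sketched as follows. This is only the naive, unadjusted version: each player is credited with the goal difference recorded while on the pitch. It is not the paper's rating, which adapts the methodology for soccer and adds expected goals and expected points variants. The segment layout and the player names are hypothetical.

```python
from collections import defaultdict


def raw_plus_minus(segments):
    """Naive goal-differential plus-minus.

    Each segment is a tuple (home_players, away_players, home_goal_diff),
    where home_goal_diff is home goals minus away goals scored while that
    exact set of players was on the pitch. This layout is an assumption
    made for illustration.
    """
    ratings = defaultdict(int)
    for home_players, away_players, home_goal_diff in segments:
        for player in home_players:
            ratings[player] += home_goal_diff
        for player in away_players:
            ratings[player] -= home_goal_diff
    return dict(ratings)


# Hypothetical match split at a substitution: "H2" is replaced by "H3".
segments = [
    (["H1", "H2"], ["A1", "A2"], 1),   # home side outscores away by one
    (["H1", "H3"], ["A1", "A2"], -2),  # away side outscores home by two
]
ratings = raw_plus_minus(segments)
```

Because team-mates who always play together receive identical raw values, plus-minus ratings in practice are usually estimated by regressing segment goal differences on player indicators; that adjustment is deliberately left out of this sketch.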
Discrete distributions derived from renewal processes, i.e. distributions of the number of events by some time t, are beginning to be used in econometrics and health sciences. A new fast method is presented for computation of the probabilities for these distributions. We calculate the count probabilities by repeatedly convolving the discretized distribution, and then correct them using Richardson extrapolation. When just one probability is required, a second algorithm is described, an adaptation of De Pril's method, in which the computation time does not depend on the ordinality, so that even high-order probabilities can be rapidly found. Any survival distribution can be used to model the inter-arrival times, which gives a rich class of models with great flexibility for modelling both underdispersed and overdispersed data. This work could pave the way for the routine use of these distributions as an additional tool for modelling event count data. An empirical example using fertility data illustrates the use of the method; it was implemented entirely with the R (R Core Team, 2015) package Countr (Baker et al., 2016), developed by the authors and available from the Comprehensive R Archive Network (CRAN).
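The convolve-then-extrapolate idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' algorithm: the discretization places each inter-arrival time at the right end of its grid cell, the error is assumed to be first order in the step size, and the Weibull survival function is taken to be exp(-lam * t**shape). All function names are hypothetical.

```python
import math


def weibull_cdf(t, lam, shape):
    """CDF for an inter-arrival time with survival exp(-lam * t**shape)."""
    return 1.0 - math.exp(-lam * t ** shape)


def count_probs_grid(cdf, t, n_max, steps):
    """Approximate P(N(t) = n) for n = 0..n_max on a grid of `steps` cells."""
    h = t / steps
    # Mass of an inter-arrival time falling in the cell ((j-1)h, jh].
    q = [0.0] + [cdf(j * h) - cdf((j - 1) * h) for j in range(1, steps + 1)]
    # p[k] holds P(N(kh) = n) for the current n; n = 0 is the survival function.
    p = [1.0 - cdf(k * h) for k in range(steps + 1)]
    out = [p[steps]]
    for _ in range(n_max):
        # Condition on the first arrival: P_{n+1}(kh) ~ sum_j q_j * P_n((k-j)h).
        p = [sum(q[j] * p[k - j] for j in range(1, k + 1))
             for k in range(steps + 1)]
        out.append(p[steps])
    return out


def count_probs_richardson(cdf, t, n_max, steps=100):
    """Combine step sizes h and h/2 to cancel the assumed O(h) error term."""
    coarse = count_probs_grid(cdf, t, n_max, steps)
    fine = count_probs_grid(cdf, t, n_max, 2 * steps)
    return [2.0 * f - c for f, c in zip(fine, coarse)]


# Sanity check: with shape = 1 the inter-arrival times are exponential,
# so the counts should be close to Poisson with mean lam * t.
probs = count_probs_richardson(lambda u: weibull_cdf(u, 1.5, 1.0), 2.0, 3)
```

The exponential case is a convenient check because the exact answer is known. For shape parameters below 1 the density is unbounded at zero and the simple first-order error assumption used here may no longer hold.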
A new alternative to the standard Poisson regression model for count data is suggested. This new family of models is based on discrete distributions derived from renewal processes, i.e., distributions of the number of events by some time t. Unlike the Poisson model, these models have, in general, time-dependent hazard functions. Any survival distribution can be used to describe the inter-arrival times between events, which gives a rich class of count processes with great flexibility for modelling both underdispersed and overdispersed data. The R package Countr provides a function, renewalCount(), for fitting renewal count regression models and methods for working with the fitted models. The interface is designed to mimic the glm() interface, and standard methods for model exploration, diagnosis and prediction are implemented. Package Countr implements state-of-the-art, recently developed methods for fast computation of the count probabilities. The package functionalities are illustrated using several datasets.
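The claim that renewal counts can be under- or overdispersed can be checked with a small simulation. The sketch below does not use Countr or renewalCount(); it simply draws Weibull inter-arrival times with Python's standard library and compares the variance-to-mean ratio of the resulting counts. The function names, the time horizon and the parameter values are hypothetical.

```python
import random


def simulate_count(shape, scale, horizon, rng):
    """Count the events of a Weibull renewal process up to time `horizon`."""
    elapsed = rng.weibullvariate(scale, shape)  # arguments: (scale, shape)
    count = 0
    while elapsed <= horizon:
        count += 1
        elapsed += rng.weibullvariate(scale, shape)
    return count


def dispersion_index(shape, scale=1.0, horizon=10.0, n_sims=5000, seed=0):
    """Variance-to-mean ratio of simulated counts (1 for a Poisson process)."""
    rng = random.Random(seed)
    counts = [simulate_count(shape, scale, horizon, rng) for _ in range(n_sims)]
    mean = sum(counts) / n_sims
    variance = sum((c - mean) ** 2 for c in counts) / (n_sims - 1)
    return variance / mean


# An increasing hazard (shape > 1) spaces events more regularly than a
# Poisson process; a decreasing hazard (shape < 1) makes them burstier.
under = dispersion_index(2.0)
over = dispersion_index(0.5)
```

With shape equal to 1 the inter-arrival times are exponential and the ratio should sit near 1, which recovers the Poisson special case.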