Generalized linear models are routinely used in many environmental statistics problems, such as the prediction of earthquake magnitudes. Hu et al. proposed Pareto regression with spatial random effects for earthquake magnitudes. In this paper, we propose Bayesian spatial variable selection for Pareto regression, building on Bradley et al. and Hu et al., to tackle the variable selection problem in generalized linear regression models with spatial random effects. A Bayesian hierarchical latent multivariate log gamma model framework is applied to account for spatial random effects and capture spatial dependence. We use two Bayesian model assessment criteria for variable selection: the Conditional Predictive Ordinate (CPO) and the Deviance Information Criterion (DIC). Furthermore, we show that these two Bayesian criteria have analytic connections with the conditional AIC under the linear mixed model setting. We examine the empirical performance of the proposed method via a simulation study and further demonstrate its applicability in an analysis of earthquake data obtained from the United States Geological Survey (USGS).
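To make the two criteria concrete, the following is a minimal sketch of how DIC and CPO are commonly computed from MCMC output. It is not the authors' implementation: the function names and the input layout (log-likelihood values stored per posterior draw) are assumptions made for illustration, and the CPO values are summarized through the log pseudo-marginal likelihood (LPML), the sum of the log CPO values.

```python
import math


def log_mean_exp(values):
    """Numerically stable log of the arithmetic mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def dic(total_loglik_draws, loglik_at_posterior_mean):
    """Return (DIC, p_D) from total log-likelihoods evaluated at each draw.

    D_bar is the posterior mean deviance, D_hat the deviance at the
    posterior mean of the parameters, and p_D = D_bar - D_hat.
    """
    d_bar = -2.0 * sum(total_loglik_draws) / len(total_loglik_draws)
    d_hat = -2.0 * loglik_at_posterior_mean
    p_d = d_bar - d_hat
    return d_bar + p_d, p_d


def lpml(pointwise_loglik_draws):
    """Return (LPML, list of log CPO_i).

    pointwise_loglik_draws[s][i] is assumed to hold log f(y_i | theta_s).
    CPO_i is the harmonic mean of the likelihood of y_i over the draws.
    """
    n_obs = len(pointwise_loglik_draws[0])
    log_cpo = []
    for i in range(n_obs):
        negated = [-draw[i] for draw in pointwise_loglik_draws]
        log_cpo.append(-log_mean_exp(negated))
    return sum(log_cpo), log_cpo
```

Under this convention, a smaller DIC and a larger LPML both indicate a preferred candidate model.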
In this paper, we develop a group learning approach to analyze the underlying heterogeneity structure of shot selection among professional basketball players in the NBA. We propose a mixture of finite mixtures (MFM) model to capture the heterogeneity of shot selection among different players based on the Log Gaussian Cox process (LGCP). Our proposed method can simultaneously estimate the number of groups and group configurations. An efficient Markov Chain Monte Carlo (MCMC) algorithm is developed for our proposed model. Simulation studies have been conducted to demonstrate its performance. Finally, our proposed learning approach is further illustrated in analyzing shot charts of selected players in the NBA's 2017-2018 regular season.
KEYWORDS
basketball shot charts, heterogeneity pursuit, log Gaussian Cox process, mixture of finite mixtures, nonparametric Bayesian
INTRODUCTION
In basketball data analytics, one primary problem of research interest is to study how players choose the locations to make shots. Shot charts, which are graphical representations of players' shot location selections, provide an important summary of information for basketball coaches as well as teams' data analysts, as no good defense strategies can be made without understanding the shot selection habits of players in the rival teams.
Shot selection data have been discussed from different statistical perspectives. Reich, Hodges, Carlin, and Reich (2006) developed a spatially varying coefficients model for shot-chart data, where the court is divided into small regions and the probability of making a shot in these zones is modeled using the multinomial logit approach. Recognizing the random nature of shot location selection, Miller et al. (2014) analyzed the underlying spatial structure among professional basketball players based on spatial point processes.
Franks, Miller, Bornn, and Goldsberry (2015) combined spatial and spatio-temporal processes, matrix factorization techniques, and hierarchical regression models for characterizing the spatial structure of locations for shot attempts. In spatial point processes, locations for points are assumed random and are regarded as realizations of a process governed by an underlying intensity. Spatial point processes are well discussed in the statistical literature, with examples including the Poisson process (Geyer, 1998), the Gibbs process (Goulard, Särkkä, & Grabarnik, 1996), and the Log Gaussian Cox process (LGCP; Møller, Syversveen, & Waagepetersen, 1998). In addition, they have been applied to different areas, such as ecological studies (Jiao, Hu, & Yan, 2020; Thurman, Fu,
Spatial regression models are ubiquitous in many different areas such as environmental science, geoscience, and public health. Exploring relationships between response variables and covariates with complex spatial patterns is an important task. In this paper, we propose a novel spatially clustered coefficients regression model for count-valued data based on nonparametric Bayesian methods. Our proposed method detects the spatial homogeneity of the Poisson regression coefficients. A Markov random field constrained mixture of finite mixtures prior provides a consistent estimator of the number of clusters of regression coefficients while incorporating geographical neighborhood information. The theoretical properties of our proposed method are established. An efficient Markov chain Monte Carlo algorithm is developed by using the multivariate log gamma distribution as a base distribution. Extensive simulation studies are carried out to examine the empirical performance of the proposed method. Additionally, we analyze Georgia premature deaths data as an illustration of the effectiveness of our approach.
Although basketball is a dynamic sport, played between two sides of five players each, learning some static information is essential for professional players, coaches, and team managers. In order to gain a deep understanding of field goal attempts among different players, we propose a zero-inflated Poisson model with clustered regression coefficients to learn the shooting habits of different players over the court and the heterogeneity among them. Specifically, the zero-inflated model captures the large portion of the court with zero field goal attempts, and the mixture of finite mixtures model captures the heterogeneity among different players based on clustered regression coefficients and inflated probabilities. Both theoretical justification and empirical justification through simulation studies validate our proposed method. We apply our proposed model to data from the National Basketball Association (NBA) to learn players' shooting habits and the heterogeneity among different players over the 2017-2018 regular season. This application illustrates how the model provides insights into both individual shooting habits and groupings of players.
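The zero-inflated Poisson component can be illustrated with a minimal sketch. It assumes the standard formulation, in which a count is a structural zero with probability `zero_prob` and otherwise follows a Poisson distribution, together with a log link between covariates and the Poisson rate. The function names, the log link, and the single shared `zero_prob` are illustrative assumptions rather than details taken from the paper, and the clustering of coefficients across players is not shown.

```python
import math


def zip_log_pmf(y, rate, zero_prob):
    """Log pmf of a zero-inflated Poisson count y (rate > 0, 0 <= zero_prob < 1)."""
    log_poisson = -rate + y * math.log(rate) - math.lgamma(y + 1)
    if y == 0:
        # A zero arises either structurally or from the Poisson part.
        return math.log(zero_prob + (1.0 - zero_prob) * math.exp(log_poisson))
    return math.log1p(-zero_prob) + log_poisson


def zip_log_likelihood(counts, covariates, beta, zero_prob):
    """Sum of ZIP log pmf values with rate_i = exp(x_i . beta) (assumed log link)."""
    total = 0.0
    for y, x in zip(counts, covariates):
        rate = math.exp(sum(b * xj for b, xj in zip(beta, x)))
        total += zip_log_pmf(y, rate, zero_prob)
    return total
```

For example, `zip_log_likelihood([0, 0, 3], [[1.0], [1.0], [1.0]], [math.log(2.0)], 0.3)` evaluates three court cells that share an intercept-only rate of 2.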
We propose a method of spatial prediction using count data that can be reasonably modeled assuming the Conway-Maxwell Poisson distribution (COM-Poisson). The COM-Poisson model is a two-parameter generalization of the Poisson distribution that allows for the flexibility needed to model count data that are either over- or under-dispersed. The computationally limiting factor of the COM-Poisson distribution is that its likelihood function contains multiple intractable normalizing constants, so evaluating it is not always feasible when using Markov Chain Monte Carlo (MCMC) techniques. Thus, we develop a prior distribution of the parameters associated with the COM-Poisson that avoids the intractable normalizing constant. Also, allowing for spatial random effects induces additional variability that makes it unclear whether a spatially correlated Conway-Maxwell Poisson random variable is over- or under-dispersed. We propose a computationally efficient hierarchical Bayesian model that addresses these issues. In particular, in our model, the parameters associated with the COM-Poisson do not include spatial random effects (which would introduce additional variability that changes the dispersion properties of the data), and are instead spatially smoothed in subsequent levels of the Bayesian hierarchical model. Furthermore, the spatially smoothed parameters have a simple regression interpretation that facilitates computation. We demonstrate the applicability of our approach using simulated examples and a motivating application using 2016 US presidential election voting data in the state of Florida obtained from the Florida Division of Elections.
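To illustrate why the normalizing constant matters, here is a minimal sketch of the COM-Poisson probability mass function in its common parameterization, P(Y = y) proportional to lam^y / (y!)^nu. The infinite normalizing sum is approximated by brute-force truncation, which is a stand-in chosen for illustration and not the approach proposed in the paper. The names `lam`, `nu`, and `max_terms` are assumptions.

```python
import math


def com_poisson_pmf_table(lam, nu, max_terms=200):
    """Approximate COM-Poisson probabilities for y = 0, ..., max_terms - 1.

    The normalizing constant Z(lam, nu) = sum_j lam^j / (j!)^nu has no
    closed form in general, so it is truncated here and computed on the
    log scale for numerical stability. Requires lam > 0 and nu > 0.
    """
    log_terms = [j * math.log(lam) - nu * math.lgamma(j + 1)
                 for j in range(max_terms)]
    m = max(log_terms)
    log_z = m + math.log(sum(math.exp(t - m) for t in log_terms))
    return [math.exp(t - log_z) for t in log_terms]


def mean_and_variance(probs):
    """Mean and variance of a distribution on 0, 1, 2, ... given its pmf table."""
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var
```

With `nu = 1` the table reduces to the ordinary Poisson distribution, so the mean and variance agree. Values `nu > 1` give under-dispersion (variance below the mean) and `nu < 1` give over-dispersion. Every likelihood evaluation at a new parameter value needs its own sum of this kind, which is the cost the abstract describes.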