Janice L. Scealy scite author profile

Model Selection in Linear Mixed Models

Linear mixed effects models are highly flexible in handling a broad range of data types and are therefore widely used in applications. A key part in the analysis of data is model selection, which often aims to choose a parsimonious model with other desirable properties from a possibly very large set of candidate statistical models. Over the last 5-10 years the literature on model selection in linear mixed models has grown extremely rapidly. The problem is much more complicated than in linear regression because selection on the covariance structure is not straightforward due to computational issues and boundary problems arising from positive semidefinite constraints on covariance matrices. To obtain a better understanding of the available methods, their properties and the relationships between them, we review a large body of literature on linear mixed model selection. We arrange, implement, discuss and compare model selection methods based on four major approaches: information criteria such as AIC or BIC, shrinkage methods based on penalized loss functions such as LASSO, the Fence procedure and Bayesian techniques.

Comment: Published at http://dx.doi.org/10.1214/12-STS410 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).


Regression for Compositional Data by using Distributions Defined on the Hypersphere

Scealy, Welsh (2011)

Compositional data can be transformed to directional data by the square-root transformation and then modelled by using distributions defined on the hypersphere. One advantage of this approach is that zero components are catered for naturally in the models. The Kent distribution for directional data is a good candidate model because it has a sufficiently general covariance structure. We propose a new regression model which models the mean direction of the Kent distribution as a function of a vector of covariates. Our estimators can be regarded as asymptotic maximum likelihood estimators. We show that these estimators perform well and are suitable for typical compositional data sets, including those with some zero components.
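The square-root transformation mentioned in this abstract can be illustrated with a minimal sketch. It only demonstrates the basic fact that the componentwise square root of a composition (non-negative parts summing to one) has unit Euclidean length, so it lies on the hypersphere, and that zero parts pass through unchanged. The function names and the example composition are hypothetical and are not taken from the paper; the Kent regression model itself is not shown.

```python
import math


def sqrt_transform(composition):
    """Map a composition to a unit vector by taking componentwise square roots."""
    if any(p < 0 for p in composition):
        raise ValueError("parts must be non-negative")
    if not math.isclose(math.fsum(composition), 1.0, rel_tol=0.0, abs_tol=1e-9):
        raise ValueError("parts must sum to one")
    return [math.sqrt(p) for p in composition]


def squared_norm(vector):
    """Squared Euclidean length of a vector."""
    return math.fsum(x * x for x in vector)


# Hypothetical three-part composition with one zero component.
u = [0.5, 0.5, 0.0]
z = sqrt_transform(u)

# The squared length of z equals the sum of the parts of u, which is one,
# and the zero part stays at zero without any special handling.
print(z, squared_norm(z))
```

Because no logarithm is taken, a zero part needs no replacement or adjustment before the transformation, which is the advantage the abstract points to.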


Robust Principal Component Analysis for Power Transformed Compositional Data

Scealy, Caritat, Grunsky et al. (2015). Journal of the American Statistical Association


Colours and Cocktails: Compositional Data Analysis 2013 Lancaster Lecture

Scealy, Welsh (2014). Aus NZ J of Statistics

The different constituents of physical mixtures such as coloured paint, cocktails, geological and other samples can be represented by d-dimensional vectors called compositions with non-negative components that sum to one. Data in which the observations are compositions are called compositional data. There are a number of different ways of thinking about and consequently analysing compositional data. The log-ratio methods proposed by Aitchison in the 1980s have become the dominant methods in the field. One reason for this is the development of normative arguments converting the properties of log-ratio methods to ‘essential requirements’ or Principles for any method of analysis to satisfy. We discuss different ways of thinking about compositional data and interpret the development of the Principles in terms of these different viewpoints. We illustrate the properties on which the Principles are based, focussing particularly on the key subcompositional coherence property. We show that this Principle is based on implicit assumptions and beliefs that do not always hold. Moreover, it is applied selectively because it is not actually satisfied by the log-ratio methods it is intended to justify. This implies that a more open statistical approach to compositional data analysis should be adopted.
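As a small illustration of the vocabulary used in this abstract, the sketch below builds a composition by rescaling non-negative amounts to sum to one, then forms a subcomposition by keeping some parts and rescaling again. It shows only the elementary fact that ratios between retained parts are unchanged by this step, which is the kind of property that discussions of subcompositional coherence start from. The function names and the numbers are hypothetical, and the sketch does not reproduce the paper's arguments about when the Principle holds.

```python
import math


def closure(parts):
    """Rescale non-negative amounts so that they sum to one."""
    if any(p < 0 for p in parts):
        raise ValueError("parts must be non-negative")
    total = math.fsum(parts)
    if total <= 0:
        raise ValueError("at least one part must be positive")
    return [p / total for p in parts]


def subcomposition(composition, indices):
    """Keep the parts at the given indices and rescale them to sum to one."""
    return closure([composition[i] for i in indices])


# Hypothetical mixture of three ingredients, measured in arbitrary units.
full = closure([2.0, 3.0, 5.0])
sub = subcomposition(full, [0, 1])

# The ratio of the first to the second part is the same in both.
print(full[0] / full[1], sub[0] / sub[1])
```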


Scaled von Mises–Fisher Distributions and Regression Models for Paleomagnetic Directional Data

Scealy, Wood (2019). Journal of the American Statistical Association

We propose a new distribution for analysing paleomagnetic directional data that is a novel transformation of the von Mises–Fisher distribution. The new distribution has ellipse-like symmetry, as does the Kent distribution; however, unlike the Kent distribution the normalising constant in the new density is easy to compute and estimation of the shape parameters is straightforward. To accommodate outliers, the model also incorporates an additional shape parameter which controls the tail-weight of the distribution. We also develop a general regression model framework that allows both the mean direction and the shape parameters of the error distribution to depend on covariates. The proposed regression procedure is shown to be equivariant with respect to the choice of coordinate system for the directional response. To illustrate, we analyse paleomagnetic directional data from the GEOMAGIA50.v3 database (Brown et al. 2015). We predict the mean direction at various geological time points and show that there is significant heteroscedasticity present. It is envisaged that the regression structures and error distribution proposed here will also prove useful when covariate information is available with (i) other types of directional response data; and (ii) square-root transformed compositional data of general dimension.

