Paul C. D. Johnson scite author profile

The coefficient of determination R² quantifies the proportion of variance explained by a statistical model and is an important summary statistic of biological interest. However, estimating R² for generalized linear mixed models (GLMMs) remains challenging. We have previously introduced a version of R² that we called R²_GLMM for Poisson and binomial GLMMs, but not for other distributional families. Similarly, we earlier discussed how to estimate intra-class correlation coefficients (ICCs) using Poisson and binomial GLMMs. In this paper, we generalize our methods to all other non-Gaussian distributions, in particular to negative binomial and gamma distributions that are commonly used for modelling biological data. While expanding our approach, we highlight two useful concepts for biologists, Jensen's inequality and the delta method, both of which help us in understanding the properties of GLMMs. Jensen's inequality has important implications for biologically meaningful interpretation of GLMMs, whereas the delta method allows a general derivation of variance associated with non-Gaussian distributions. We also discuss some special considerations for binomial GLMMs with binary or proportion data. We illustrate the implementation of our extension by worked examples from the field of ecology and evolution in the R environment. However, our method can be used across disciplines and regardless of statistical environments.
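
The two concepts named in this abstract can be illustrated numerically. The following is a minimal Python sketch, not the paper's own R code. It assumes a log link with a normally distributed linear predictor, and all function names are hypothetical. The first function shows Jensen's inequality (the mean of exp(eta) exceeds exp of the mean of eta); the second applies the first-order delta method, Var[g(Y)] ≈ g'(E[Y])² Var(Y), to a Poisson response on the log scale.

```python
import math
import random


def jensen_gap(mu, sigma, n_draws=200_000, seed=1):
    """Compare exp(E[eta]) with E[exp(eta)] for eta ~ Normal(mu, sigma^2).

    Returns (exp_of_mean, monte_carlo_mean_of_exp, exact_mean_of_exp).
    The exact value is the lognormal mean exp(mu + sigma^2 / 2).
    """
    rng = random.Random(seed)
    total = sum(math.exp(rng.gauss(mu, sigma)) for _ in range(n_draws))
    mean_of_exp = total / n_draws
    exp_of_mean = math.exp(mu)
    exact = math.exp(mu + sigma ** 2 / 2)
    return exp_of_mean, mean_of_exp, exact


def delta_method_variance(g_prime, mean, variance):
    """First-order delta method: Var[g(Y)] is approximately g'(mean)^2 * Var(Y)."""
    return g_prime(mean) ** 2 * variance


def poisson_log_scale_variance(lam):
    """Approximate Var[ln Y] for Y ~ Poisson(lam): g = ln, g' = 1/y, Var(Y) = lam."""
    return delta_method_variance(lambda m: 1.0 / m, lam, lam)


if __name__ == "__main__":
    naive, simulated, exact = jensen_gap(mu=1.0, sigma=0.5)
    print(f"exp(mean of eta)      = {naive:.3f}")
    print(f"mean of exp(eta), MC  = {simulated:.3f}")
    print(f"mean of exp(eta), exact = {exact:.3f}")
    print(f"delta-method Var[ln Y], lambda = 4: {poisson_log_scale_variance(4.0):.3f}")
```

Back-transforming the mean of the linear predictor therefore understates the mean on the response scale whenever there is variance on the latent scale, which is the interpretive point the abstract attributes to Jensen's inequality.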

Extension of Nakagawa & Schielzeth's R²_GLMM to random slopes models

Johnson

2014

Methods Ecol Evol

832

637

Nakagawa & Schielzeth extended the widely used goodness-of-fit statistic R² to apply to generalized linear mixed models (GLMMs). However, their R²_GLMM method is restricted to models with the simplest random effects structure, known as random intercepts models. It is not applicable to another common random effects structure, random slopes models. I show that R²_GLMM can be extended to random slopes models using a simple formula that is straightforward to implement in statistical software. This extension substantially widens the potential application of R²_GLMM.
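
The abstract does not state the formula, so the following Python sketch is an illustration under stated assumptions rather than the paper's implementation. It assumes a single grouping factor with a random intercept and one random slope, so that each observation has random-effects design row z = (1, x) and the random effects have a 2 × 2 covariance matrix Σ. The random-effect variance is then averaged over observations as the mean of z'Σz. All names and the example numbers are hypothetical.

```python
def mean_random_effect_variance(x_values, cov):
    """Average z' * cov * z over observations, with z = (1, x).

    cov is [[var_intercept, cov_int_slope], [cov_int_slope, var_slope]].
    With var_slope = 0 and zero covariance this reduces to var_intercept,
    i.e. the random intercepts case.
    """
    (var_int, cov_01), (cov_10, var_slope) = cov
    total = 0.0
    for x in x_values:
        total += var_int + (cov_01 + cov_10) * x + var_slope * x * x
    return total / len(x_values)


def r2_marginal_conditional(var_fixed, var_random, var_residual):
    """Marginal and conditional R² from three variance components.

    For a non-Gaussian GLMM, var_residual would also need to include the
    distribution-specific variance; that part is not shown here.
    """
    total = var_fixed + var_random + var_residual
    return var_fixed / total, (var_fixed + var_random) / total


if __name__ == "__main__":
    # Hypothetical covariate values and variance components, for illustration only.
    x = [-1.0, 0.0, 1.0]
    sigma = [[2.0, 0.5], [0.5, 1.0]]
    var_random = mean_random_effect_variance(x, sigma)
    print(r2_marginal_conditional(var_fixed=3.0, var_random=var_random, var_residual=1.0))
```

Because the slope variance contributes through x², the averaged random-effect variance depends on how the covariate is centred and scaled.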

HMG-coenzyme A reductase inhibition, type 2 diabetes, and bodyweight: evidence from genetic analysis and randomised trials

et al. 2015

Background: Statins increase the risk of new-onset type 2 diabetes mellitus. We aimed to assess whether this increase in risk is a consequence of inhibition of 3-hydroxy-3-methylglutaryl-CoA reductase (HMGCR), the intended drug target.
Methods: We used single nucleotide polymorphisms in the HMGCR gene, rs17238484 (for the main analysis) and rs12916 (for a subsidiary analysis) as proxies for HMGCR inhibition by statins. We examined associations of these variants with plasma lipid, glucose, and insulin concentrations; bodyweight; waist circumference; and prevalent and incident type 2 diabetes. Study-specific effect estimates per copy of each LDL-lowering allele were pooled by meta-analysis. These findings were compared with a meta-analysis of new-onset type 2 diabetes and bodyweight change data from randomised trials of statin drugs. The effects of statins in each randomised trial were assessed using meta-analysis.
Findings: Data were available for up to 223 463 individuals from 43 genetic studies. Each additional rs17238484-G allele was associated with a mean 0·06 mmol/L (95% CI 0·05–0·07) lower LDL cholesterol and higher body weight (0·30 kg, 0·18–0·43), waist circumference (0·32 cm, 0·16–0·47), plasma insulin concentration (1·62%, 0·53–2·72), and plasma glucose concentration (0·23%, 0·02–0·44). The rs12916 SNP had similar effects on LDL cholesterol, bodyweight, and waist circumference. The rs17238484-G allele seemed to be associated with higher risk of type 2 diabetes (odds ratio [OR] per allele 1·02, 95% CI 1·00–1·05); the rs12916-T allele association was consistent (1·06, 1·03–1·09). In 129 170 individuals in randomised trials, statins lowered LDL cholesterol by 0·92 mmol/L (95% CI 0·18–1·67) at 1 year of follow-up, increased bodyweight by 0·24 kg (95% CI 0·10–0·38 in all trials; 0·33 kg, 95% CI 0·24–0·42 in placebo or standard care controlled trials and −0·15 kg, 95% CI −0·39 to 0·08 in intensive-dose vs moderate-dose trials) at a mean of 4·2 years (range 1·9–6·7) of follow-up, and increased the odds of new-onset type 2 diabetes (OR 1·12, 95% CI 1·06–1·18 in all trials; 1·11, 95% CI 1·03–1·20 in placebo or standard care controlled trials and 1·12, 95% CI 1·04–1·22 in intensive-dose vs moderate-dose trials).
Interpretation: The increased risk of type 2 diabetes noted with statins is at least partially explained by HMGCR inhibition.
Funding: The funding sources are cited at the end of the paper.

Virus–virus interactions impact the population dynamics of influenza and the common cold

Nickbakhsh

Mair

Matthews

et al. 2019

Proc. Natl. Acad. Sci. U.S.A.

367

302

The human respiratory tract hosts a diverse community of cocirculating viruses that are responsible for acute respiratory infections. This shared niche provides the opportunity for virus-virus interactions which have the potential to affect individual infection risks and in turn influence dynamics of infection at population scales. However, quantitative evidence for interactions has lacked suitable data and appropriate analytical tools. Here, we expose and quantify interactions among respiratory viruses using bespoke analyses of infection time series at the population scale and coinfections at the individual host scale. We analyzed diagnostic data from 44,230 cases of respiratory illness that were tested for 11 taxonomically broad groups of respiratory viruses over 9 years. Key to our analyses was accounting for alternative drivers of correlated infection frequency, such as age and seasonal dependencies in infection risk, allowing us to obtain strong support for the existence of negative interactions between influenza and noninfluenza viruses and positive interactions among noninfluenza viruses. In mathematical simulations that mimic 2-pathogen dynamics, we show that transient immune-mediated interference can cause a relatively ubiquitous common cold-like virus to diminish during peak activity of a seasonal virus, supporting the potential role of innate immunity in driving the asynchronous circulation of influenza A and rhinovirus. These findings have important implications for understanding the linked epidemiological dynamics of viral respiratory infections, an important step towards improved accuracy of disease forecasting models and evaluation of disease control interventions.
Keywords: epidemiology | virology | ecology

Power analysis for generalized linear mixed models in ecology and evolution

et al. 2014

‘Will my study answer my research question?’ is the most fundamental question a researcher can ask when designing a study, yet when phrased in statistical terms – ‘What is the power of my study?’ or ‘How precise will my parameter estimate be?’ – few researchers in ecology and evolution (EE) try to answer it, despite the detrimental consequences of performing under- or over-powered research. We suggest that this reluctance is due in large part to the unsuitability of simple methods of power analysis (broadly defined as any attempt to quantify prospectively the ‘informativeness’ of a study) for the complex models commonly used in EE research. With the aim of encouraging the use of power analysis, we present simulation from generalized linear mixed models (GLMMs) as a flexible and accessible approach to power analysis that can account for random effects, overdispersion and diverse response distributions. We illustrate the benefits of simulation-based power analysis in two research scenarios: estimating the precision of a survey to estimate tick burdens on grouse chicks and estimating the power of a trial to compare the efficacy of insecticide-treated nets in malaria mosquito control. We provide a freely available R function, sim.glmm, for simulating from GLMMs. Analysis of simulated data revealed that the effects of accounting for realistic levels of random effects and overdispersion on power and precision estimates were substantial, with correspondingly severe implications for study design in the form of up to fivefold increases in sampling effort. We also show the utility of simulations for identifying scenarios where GLMM-fitting methods can perform poorly. These results illustrate the inadequacy of standard analytical power analysis methods and the flexibility of simulation-based power analysis for GLMMs. The wider use of these methods should contribute to improving the quality of study design in EE.
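
The general shape of simulation-based power analysis can be sketched as follows. This is a rough Python illustration of the simulate, analyse, repeat loop, and it does not reproduce the paper's R function sim.glmm. It assumes Poisson counts with a normally distributed random intercept per group and two arms that differ by a fixed effect on the log scale. To stay dependency-free, the analysis step is swapped for a simple large-sample z-test on log group means instead of a fitted GLMM. All names and default parameter values are hypothetical.

```python
import math
import random
import statistics


def rpois(rng, lam):
    """Draw one Poisson variate with Knuth's multiplication method (suits small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_log_group_means(rng, n_groups, n_per_group, log_mean, group_sd):
    """Simulate counts with a random intercept per group; return log(group mean + 0.5)."""
    out = []
    for _ in range(n_groups):
        lam = math.exp(log_mean + rng.gauss(0.0, group_sd))
        counts = [rpois(rng, lam) for _ in range(n_per_group)]
        out.append(math.log(statistics.fmean(counts) + 0.5))
    return out


def estimate_power(effect, n_groups=10, n_per_group=5, log_mean=1.0,
                   group_sd=0.5, n_sims=500, seed=1):
    """Proportion of simulated two-arm studies in which |z| exceeds 1.96.

    The z-test on group-level summaries is a stand-in for GLMM fitting and is
    somewhat anticonservative when n_groups is small.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        a = simulate_log_group_means(rng, n_groups, n_per_group, log_mean, group_sd)
        b = simulate_log_group_means(rng, n_groups, n_per_group, log_mean + effect, group_sd)
        se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
        z = (statistics.fmean(b) - statistics.fmean(a)) / se
        hits += abs(z) > 1.96
    return hits / n_sims


if __name__ == "__main__":
    for groups in (5, 10, 20, 40):
        print(groups, estimate_power(effect=0.5, n_groups=groups))
```

Varying group_sd in a sketch like this shows how between-group variation, which a simple analytical power formula would ignore, changes the number of groups needed.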

Association Between Genetic Variants on Chromosome 15q25 Locus and Objective Measures of Tobacco Exposure

Timofeeva

et al. 2012

Background: Two single-nucleotide polymorphisms, rs1051730 and rs16969968, located within the nicotinic acetylcholine receptor gene cluster on chromosome 15q25 locus, are associated with heaviness of smoking, risk for lung cancer, and other smoking-related health outcomes. Previous studies have typically relied on self-reported smoking behavior, which may not fully capture interindividual variation in tobacco exposure.
Methods: We investigated the association of rs1051730 and rs16969968 genotype (referred to as rs1051730–rs16969968, because these are in perfect linkage disequilibrium and interchangeable) with both self-reported daily cigarette consumption and biochemically measured plasma or serum cotinine levels among cigarette smokers. Summary estimates and descriptive statistical data for 12 364 subjects were obtained from six independent studies, and 2932 smokers were included in the analyses. Linear regression was used to calculate the per-allele association of rs1051730–rs16969968 genotype with cigarette consumption and cotinine levels in current smokers for each study. Meta-analysis of per-allele associations was conducted using a random effects method. The likely resulting association between genotype and lung cancer risk was assessed using published data on the association between cotinine levels and lung cancer risk. All statistical tests were two-sided.
Results: Pooled per-allele associations showed that current smokers with one or two copies of the rs1051730–rs16969968 risk allele had increased self-reported cigarette consumption (mean increase in unadjusted number of cigarettes per day per allele = 1.0 cigarette, 95% confidence interval [CI] = 0.57 to 1.43 cigarettes, P = 5.22 × 10⁻⁶) and cotinine levels (mean increase in unadjusted cotinine levels per allele = 138.72 nmol/L, 95% CI = 97.91 to 179.53 nmol/L, P = 2.71 × 10⁻¹¹). The increase in cotinine levels indicated an increased risk of lung cancer with each additional copy of the rs1051730–rs16969968 risk allele (per-allele odds ratio = 1.31, 95% CI = 1.21 to 1.42).
Conclusions: Our data show a stronger association of rs1051730–rs16969968 genotype with objective measures of tobacco exposure compared with self-reported cigarette consumption. The association of these variants with lung cancer risk is likely to be mediated largely, if not wholly, via tobacco exposure.
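
The abstract says only that per-allele associations were pooled "using a random effects method" and does not name the estimator. As an illustration, the Python sketch below uses the DerSimonian-Laird estimator, which is one common choice and is an assumption here. The function names and the example inputs are hypothetical and are not the study's data.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Recover a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * z)


def random_effects_pool(estimates, std_errors):
    """DerSimonian-Laird random-effects pooling.

    Returns (pooled_estimate, pooled_standard_error, tau_squared), where
    tau_squared is the estimated between-study variance.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Re-weight each study by the inverse of its total variance.
    star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * y for w, y in zip(star, estimates)) / sum(star)
    return pooled, math.sqrt(1.0 / sum(star)), tau2


if __name__ == "__main__":
    # Hypothetical per-study estimates and 95% CIs, for illustration only.
    studies = [(0.8, 0.2, 1.4), (1.3, 0.9, 1.7), (0.9, 0.1, 1.7)]
    ys = [est for est, _, _ in studies]
    ses = [se_from_ci(lo, hi) for _, lo, hi in studies]
    print(random_effects_pool(ys, ses))
```

When the studies agree closely, tau_squared is truncated at zero and the result coincides with the fixed-effect inverse-variance estimate.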

Maximum-Likelihood Estimation of Allelic Dropout and False Allele Error Rates From Microsatellite Genotypes in the Absence of Reference Data

2007

The importance of quantifying and accounting for stochastic genotyping errors when analyzing microsatellite data is increasingly being recognized. This awareness is motivating the development of data analysis methods that not only take errors into consideration but also recognize the difference between two distinct classes of error, allelic dropout and false alleles. Currently, methods to estimate rates of allelic dropout and false alleles depend upon the availability of error-free reference genotypes or reliable pedigree data, which are often not available. We have developed a maximum-likelihood-based method for estimating these error rates from a single replication of a sample of genotypes. Simulations show it to be both accurate and robust to modest violations of its underlying assumptions. We have applied the method to estimating error rates in two microsatellite data sets. It is implemented in a computer program, Pedant, which estimates allelic dropout and false allele error rates with 95% confidence regions from microsatellite genotype data and performs power analysis. Pedant is freely available at http://www.stats.gla.ac.uk/~paulj/pedant.html.

Accelerated Telomere Attrition Is Associated with Relative Household Income, Diet and Inflammation in the pSoBid Cohort

et al. 2011

Background: It has previously been hypothesized that lower socio-economic status can accelerate biological ageing, and predispose to early onset of disease. This study investigated the association of socio-economic and lifestyle factors, as well as traditional and novel risk factors, with biological ageing, as measured by telomere length, in a Glasgow-based cohort that included individuals with extreme socio-economic differences.
Methods: A total of 382 blood samples from the pSoBid study were available for telomere analysis. For each participant, data were available for socio-economic status factors, biochemical parameters and dietary intake. Statistical analyses were undertaken to investigate the association between telomere lengths and these parameters.
Results: The rate of age-related telomere attrition was significantly associated with low relative income, housing tenure and poor diet. Notably, telomere length was positively associated with LDL and total cholesterol levels, but inversely correlated with circulating IL-6.
Conclusions: These data suggest lower socio-economic status and poor diet are relevant to accelerated biological ageing. They also suggest potential associations between biological ageing and elevated circulating IL-6, a measure known to predict cardiovascular disease and diabetes. These observations require further study to tease out potential mechanistic links.

The coefficient of determination R ² and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded
