James G. Booth scite author profile

SUMMARY We hypothesized that DNA methylation distributes into specific patterns in cancer cells, which reflect critical biological differences. We therefore examined the methylation profiles of 344 patients with acute myeloid leukemia (AML). Clustering of these patients by methylation data segregated patients into 16 groups. Five of these groups defined new AML subtypes that shared no other known feature. In addition, DNA methylation profiles segregated patients with CEBPA aberrations from other subtypes of leukemia, defined four epigenetically distinct forms of AML with NPM1 mutations, and showed that established AML1-ETO, CBFb-MYH11, and PML-RARA leukemia entities are associated with specific methylation profiles. We report a 15-gene methylation classifier predictive of overall survival in an independent patient cohort (p < 0.001, adjusted for known covariates).

Maximizing Generalized Linear Mixed Model Likelihoods With an Automated Monte Carlo EM Algorithm

Booth

Hobert

1999

464

409

Two new implementations of the EM algorithm are proposed for maximum likelihood fitting of generalized linear mixed models. Both methods use random (independent and identically distributed) sampling to construct Monte Carlo approximations at the E-step. One approach involves generating random samples from the exact conditional distribution of the random effects (given the data) by rejection sampling, using the marginal distribution as a candidate. The second method uses a multivariate t importance sampling approximation. In many applications the two methods are complementary. Rejection sampling is more efficient when sample sizes are small, whereas importance sampling is better with larger sample sizes. Monte Carlo approximation using random samples allows the Monte Carlo error at each iteration to be assessed by using standard central limit theory combined with Taylor series methods. Specifically, we construct a sandwich variance estimate for the maximizer at each approximate E-step. This suggests a rule for automatically increasing the Monte Carlo sample size after iterations in which the true EM step is swamped by Monte Carlo error. In contrast, techniques for assessing Monte Carlo error have not been developed for use with alternative implementations of Monte Carlo EM algorithms utilizing Markov chain Monte Carlo E-step approximations. Three different data sets, including the infamous salamander data of McCullagh and Nelder, are used to illustrate the techniques and to compare them with the alternatives. The results show that the methods proposed can be considerably more efficient than those based on Markov chain Monte Carlo algorithms. However, the methods proposed may break down when the intractable integrals in the likelihood function are of high dimension.
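The rejection-sampling E-step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy random-intercept logistic model (successes out of trials per cluster, one normal random effect per cluster), holds the fixed intercept `beta` constant for brevity, and omits the sandwich variance estimate and the automated sample-size rule. The names `log1pexp`, `sample_conditional`, and `mcem_sigma` are hypothetical.

```python
import math
import random


def log1pexp(x):
    """Numerically stable log(1 + exp(x))."""
    return x + math.log1p(math.exp(-x)) if x > 0 else math.log1p(math.exp(x))


def sample_conditional(successes, trials, beta, sigma, n_draws, rng):
    """Draw i.i.d. samples of the random effect u from f(u | y).

    The candidate is the marginal N(0, sigma^2). A candidate is accepted
    with probability f(y | u) / sup_u f(y | u).
    """
    # The binomial kernel p^s (1 - p)^(n - s) is largest at p = s / n.
    p_hat = successes / trials
    log_sup = 0.0
    if 0 < successes < trials:
        log_sup = (successes * math.log(p_hat)
                   + (trials - successes) * math.log(1.0 - p_hat))
    draws = []
    while len(draws) < n_draws:
        u = rng.gauss(0.0, sigma)
        eta = beta + u
        # log p = -log(1 + exp(-eta)) and log(1 - p) = -log(1 + exp(eta)).
        log_lik = (-successes * log1pexp(-eta)
                   - (trials - successes) * log1pexp(eta))
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) <= log_lik - log_sup:
            draws.append(u)
    return draws


def mcem_sigma(clusters, beta, sigma0, n_draws=500, n_iter=20, seed=0):
    """Monte Carlo EM for sigma only (beta is held fixed: an assumption)."""
    rng = random.Random(seed)
    sigma = sigma0
    for _ in range(n_iter):
        total = 0.0
        for successes, trials in clusters:
            draws = sample_conditional(successes, trials, beta, sigma,
                                       n_draws, rng)
            # Monte Carlo estimate of E[u^2 | y] for this cluster.
            total += sum(u * u for u in draws) / n_draws
        # M-step: the updated sigma^2 is the average of E[u_i^2 | y_i].
        sigma = math.sqrt(total / len(clusters))
    return sigma
```

Because the draws are independent, the Monte Carlo error of each E-step average could be assessed with ordinary central limit arguments, which is the property the abstract contrasts with Markov chain Monte Carlo E-steps.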

Open access publishing, article downloads, and citations: randomised controlled trial

Davis

Lewenstein

Simon

et al. 2008

BMJ

331

265

Objective: To measure the effect of free access to the scientific literature on article downloads and citations.
Design: Randomised controlled trial.
Setting: 11 journals published by the American Physiological Society.
Participants: 1619 research articles and reviews.
Main outcome measures: Article readership (measured as downloads of full text, PDFs, and abstracts) and number of unique visitors (internet protocol addresses). Citations to articles were gathered from the Institute for Scientific Information after one year.
Interventions: Random assignment on online publication of articles published in 11 scientific journals to open access (treatment) or subscription access (control).
Results: Articles assigned to open access were associated with 89% more full text downloads (95% confidence interval 76% to 103%), 42% more PDF downloads (32% to 52%), and 23% more unique visitors (16% to 30%), but 24% fewer abstract downloads (−29% to −19%) than subscription access articles in the first six months after publication. Open access articles were no more likely to be cited than subscription access articles in the first year after publication. Fifty-nine per cent of open access articles (146 of 247) were cited nine to 12 months after publication compared with 63% (859 of 1372) of subscription access articles. Logistic and negative binomial regression analysis of article citation counts confirmed no citation advantage for open access articles.
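The citation proportions quoted in the results can be checked with a few lines of arithmetic. This sketch uses only the counts given in the abstract. The unadjusted odds ratio it computes is an illustration, not the logistic or negative binomial regression the authors report.

```python
# Counts taken from the abstract: cited articles and group sizes.
oa_cited, oa_total = 146, 247        # open access (treatment)
sub_cited, sub_total = 859, 1372     # subscription access (control)

# The two groups should account for every article in the trial.
total_articles = oa_total + sub_total

# Proportion of articles cited in each group.
oa_rate = oa_cited / oa_total
sub_rate = sub_cited / sub_total

# Unadjusted odds ratio of being cited (open access vs subscription).
odds_ratio = (oa_cited / (oa_total - oa_cited)) / (
    sub_cited / (sub_total - sub_cited)
)

print(f"articles: {total_articles}")
print(f"open access cited: {oa_rate:.1%}")
print(f"subscription cited: {sub_rate:.1%}")
print(f"unadjusted odds ratio: {odds_ratio:.2f}")
```

The group sizes sum to the 1619 participants stated in the abstract, and the rounded proportions match the reported 59% and 63%.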

A 3D Morphable Model Learnt from 10,000 Faces

et al. 2016

We present Large Scale Facial Model (LSFM), a 3D Morphable Model (3DMM) automatically constructed from 9,663 distinct facial identities. To the best of our knowledge LSFM is the largest-scale Morphable Model ever constructed, containing statistical information from a huge variety of the human population. To build such a large model we introduce a novel fully automated and robust Morphable Model construction pipeline. The dataset that LSFM is trained on includes rich demographic information about each subject, allowing for the construction of not only a global 3DMM but also models tailored for specific age, gender or ethnicity groups. As an application example, we utilise the proposed model to perform age classification from 3D shape alone. Furthermore, we perform a systematic analysis of the constructed 3DMMs that showcases their quality and descriptive power. The presented extensive qualitative and quantitative evaluations reveal that the proposed 3DMM achieves state-of-the-art results, outperforming existing models by a large margin. Finally, for the benefit of the research community, we make publicly available the source code of the proposed automatic 3DMM construction pipeline. In addition, the constructed global 3DMM and a variety of bespoke models tailored by age, gender and ethnicity are available on application to researchers involved in medically oriented research.

MUS81 Generates a Subset of MLH1-MLH3–Independent Crossovers in Mammalian Meiosis

et al. 2008

Two eukaryotic pathways for processing double-strand breaks (DSBs) as crossovers have been described, one dependent on the MutL homologs Mlh1 and Mlh3, and the other on the structure-specific endonuclease Mus81. Mammalian MUS81 has been implicated in maintenance of genomic stability in somatic cells; however, little is known about its role during meiosis. Mus81-deficient mice were originally reported as being viable and fertile, with normal meiotic progression; however, a more detailed examination of meiotic progression in Mus81-null animals and WT controls reveals significant meiotic defects in the mutants. These include smaller testis size, a depletion of mature epididymal sperm, significantly upregulated accumulation of MLH1 on chromosomes from pachytene meiocytes in an interference-independent fashion, and a subset of meiotic DSBs that fail to be repaired. Interestingly, chiasmata numbers in spermatocytes from Mus81−/− animals are normal, suggesting additional integrated mechanisms controlling the two distinct crossover pathways. This study is the first in-depth analysis of meiotic progression in Mus81-nullizygous mice, and our results implicate the MUS81 pathway as a regulator of crossover frequency and placement in mammals.

Large Scale 3D Morphable Models

et al. 2017

We present large scale facial model (LSFM), a 3D Morphable Model (3DMM) automatically constructed from 9663 distinct facial identities. To the best of our knowledge LSFM is the largest-scale Morphable Model ever constructed, containing statistical information from a huge variety of the human population. To build such a large model we introduce a novel fully automated and robust Morphable Model construction pipeline, informed by an evaluation of state-of-the-art dense correspondence techniques. The dataset that LSFM is trained on includes rich demographic information about each subject, allowing for the construction of not only a global 3DMM but also models tailored for specific age, gender or ethnicity groups. We utilize the proposed model to perform age classification from 3D shape alone and to reconstruct noisy out-of-sample data in the low-dimensional model space. Furthermore, we perform a systematic analysis of the constructed 3DMMs that showcases their quality and descriptive power. The presented extensive qualitative and quantitative evaluations reveal that the proposed 3DMM achieves state-of-the-art results, outperforming existing models by a large margin. Finally, for the benefit of the research community, we make publicly available the source code of the proposed automatic 3DMM construction pipeline, as well as the constructed global 3DMM and a variety of bespoke models tailored by age, gender and ethnicity.

Standard Errors of Prediction in Generalized Linear Mixed Models

Booth

Hobert

1998

Journal of the American Statistical Association

125

122

The unconditional mean squared error of prediction (UMSEP) is widely used as a measure of prediction variance for inferences concerning linear combinations of fixed and random effects in the classical normal theory mixed model. But the UMSEP is inappropriate for generalized linear mixed models where the conditional variance of the random effects depends on the data. When the random effects describe variation between independent small domains and domain-specific prediction is of interest, we propose a conditional mean squared error of prediction (CMSEP) as a general measure of prediction variance. The CMSEP is shown to be the sum of the conditional variance and a positive correction that accounts for the sampling variability of parameter estimates. We derive a second-order-correct estimate of the CMSEP that consists of three components: (a) a plug-in estimate of the conditional variance, (b) a plug-in estimate of a Taylor series approximation to the correction term, and (c) a bootstrap estimate of the bias incurred in (a). In the normal case our formulas based on the CMSEP provide a conditional alternative to the unconditional expansions of Fuller and Harter, Kackar and Harville, and Prasad and Rao. In addition, we show that the prediction variance formula obtained by Wolfinger and O'Connell and suggested by Breslow and Clayton is in fact Laplace's approximation to the CMSEP based on the assumption that the variance components are known and ignoring the bias-correction term. Thus this formula has a conditional interpretation in the small-domain setting and should not be interpreted unconditionally. Finally, although use of the CMSEP is motivated using entirely frequentist arguments, our second-order approximation to the CMSEP closely resembles a corresponding expansion for the Bayesian posterior variance.
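The decomposition stated in this abstract (conditional variance plus a positive correction) can be written out as follows. The notation is an assumption made for illustration and is not taken from the paper: $\zeta_i$ is the quantity to be predicted for domain $i$, $y_i$ is that domain's data, $\hat{\zeta}_i$ is the predictor computed with estimated parameters, and $\tilde{\zeta}_i$ is the best predictor under the true parameters.

```latex
\[
\mathrm{CMSEP}_i
  = E\!\left[(\hat{\zeta}_i - \zeta_i)^2 \mid y_i\right]
  = \underbrace{\operatorname{Var}(\zeta_i \mid y_i)}_{\text{conditional variance}}
  + \underbrace{E\!\left[(\hat{\zeta}_i - \tilde{\zeta}_i)^2 \mid y_i\right]}_{\text{correction}\ \ge\ 0},
\qquad
\tilde{\zeta}_i = E(\zeta_i \mid y_i).
\]
```

Under this notation the cross term vanishes because, given $y_i$, the target $\zeta_i$ is independent of the data from the other domains, and $E[\tilde{\zeta}_i - \zeta_i \mid y_i] = 0$. The second term is nonnegative and reflects the sampling variability of the parameter estimates.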

2. Random-Effects Modeling of Categorical Response Data

Agresti

Booth

Hobert

et al. 2000

Sociological Methodology

142

116

In many applications observations have some type of clustering, with observations within clusters tending to be correlated. A common instance of this occurs when each subject in the sample undergoes repeated measurement, in which case a cluster consists of the set of observations for the subject. One approach to modeling clustered data introduces cluster-level random effects into the model. The use of random effects in linear models for normal responses is well established. By contrast, random effects have only recently seen much use in models for categorical data. This chapter surveys a variety of potential social science applications of random effects modeling of categorical data. Applications discussed include repeated measurement for binary or ordinal responses, shrinkage to improve multiparameter estimation of a set of proportions or rates, multivariate latent variable modeling, hierarchically structured modeling, and cluster sampling. The models discussed belong to the class of generalized linear mixed models (GLMMs), an extension of ordinary linear models that permits nonnormal response variables and both fixed and random effects in the predictor term. The models are GLMMs for either binomial or Poisson response variables, although we also present extensions to multicategory (nominal or ordinal) responses. We also summarize some of the technical issues of model-fitting that complicate the fitting of GLMMs even with existing software.

Acknowledgments: This work was partially supported by a grant from the National Science Foundation. The authors appreciate comments from Brent Coull, Russ Wolfinger, and two referees. They also thank Jonathan Hartzel for advice on computing and the use of his program for the nonparametric random-effects approach. Affiliation: University of Florida.
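The GLMM class described in this abstract is commonly written in the following textbook form. The symbols are assumptions chosen for illustration rather than the chapter's own notation: $y_{ij}$ is observation $j$ in cluster $i$, $x_{ij}$ and $z_{ij}$ are covariate vectors for the fixed effects $\beta$ and the cluster-level random effects $u_i$, and $g$ is the link function.

```latex
\[
g\!\left(E[\,y_{ij} \mid u_i\,]\right) = x_{ij}^{\top}\beta + z_{ij}^{\top}u_i,
\qquad u_i \sim N(0, \Sigma).
\]
```

For binomial responses $g$ is typically the logit link, and for Poisson responses the log link. The shared $u_i$ within a cluster is what induces correlation among that cluster's observations.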

James G. Booth

DNA Methylation Signatures Identify Biologically Distinct Subtypes in Acute Myeloid Leukemia