Shaunna L. Clark scite author profile

Sample Size Requirements for Structural Equation Models

Determining sample size requirements for structural equation modeling (SEM) is a challenge often faced by investigators, peer reviewers, and grant writers. Recent years have seen a large increase in SEMs in the behavioral science literature, but consideration of sample size requirements for applied SEMs often relies on outdated rules-of-thumb. This study used Monte Carlo data simulation techniques to evaluate sample size requirements for common applied SEMs. Across a series of simulations, we systematically varied key model properties, including number of indicators and factors, magnitude of factor loadings and path coefficients, and amount of missing data. We investigated how changes in these parameters affected sample size requirements with respect to statistical power, bias in the parameter estimates, and overall solution propriety. Results revealed a range of sample size requirements (i.e., from 30 to 460 cases) and meaningful patterns of association between parameters and sample size, and highlighted the limitations of commonly cited rules-of-thumb. The broad “lessons learned” for determining SEM sample size requirements are discussed.

Models and Strategies for Factor Mixture Analysis: An Example Concerning the Structure Underlying Psychological Disorders

Clark

Muthén

Kaprio

et al. 2013

Structural Equation Modeling: A Multidisciplinary Journal

155

234

The factor mixture model (FMM) uses a hybrid of both categorical and continuous latent variables. The FMM is a good model for the underlying structure of psychopathology because the use of both categorical and continuous latent variables allows the structure to be simultaneously categorical and dimensional. This is useful because both diagnostic class membership and the range of severity within and across diagnostic classes can be modeled concurrently. While the conceptualization of the FMM has been explained in the literature, the use of the FMM is still not prevalent. One reason is that there is little research about how such models should be applied in practice and, once a well-fitting model is obtained, how it should be interpreted. In this paper, the FMM will be explored by studying a real data example on conduct disorder. By exploring this example, this paper aims to explain the different formulations of the FMM, the various steps in building an FMM, as well as how to decide between an FMM and alternative models.

Family and social risk, and parental investments during the early childhood years as predictors of low-income children's school readiness outcomes

Mistry

Benner

Biesanz

et al. 2010

Early Childhood Research Quarterly

214

191

Epigenetic Aging in Major Depressive Disorder

Han

Aghajani

Clark

et al. 2018

AJP

174

137

High density methylation QTL analysis in human blood via next-generation sequencing of the methylated genomic DNA fraction

et al. 2015

Background: Genetic influence on DNA methylation is potentially an important mechanism affecting individual differences in humans. We use next-generation sequencing to assay blood DNA methylation at approximately 4.5 million loci, each comprising 2.9 CpGs on average, in 697 normal subjects. Methylation measures at each locus are tested for association with approximately 4.5 million single nucleotide polymorphisms (SNPs) to exhaustively screen for methylation quantitative trait loci (meQTLs). Results: Using stringent false discovery rate control, 15% of methylation sites show genetic influence. Most meQTLs are local, where the associated SNP and methylation site are in close genomic proximity. Distant meQTLs and those spanning different chromosomes are less common. Most local meQTLs encompass common SNPs that alter CpG sites (CpG-SNPs). Local meQTLs encompassing CpG-SNPs are enriched in regions of inactive chromatin in blood cells. In contrast, local meQTLs lacking CpG-SNPs are enriched in regions of active chromatin and transcription factor binding sites. Of 393 local meQTLs that overlap disease-associated regions from genome-wide studies, a high percentage encompass common CpG-SNPs. These meQTLs overlap active enhancers, differentiating them from CpG-SNP meQTLs in inactive chromatin. Conclusions: Genetic influence on the human blood methylome is common, involves several heterogeneous processes and is predominantly dependent on local sequence context at the meQTL site. Most meQTLs involve CpG-SNPs, while sequence-dependent effects on chromatin binding are also important in regions of active chromatin. An abundance of local meQTLs resulting from methylation of CpG-SNPs in inactive chromatin suggests that many meQTLs lack functional consequence. Integrating meQTL and Roadmap Epigenomics data could assist fine-mapping efforts.
Electronic supplementary material: The online version of this article (doi:10.1186/s13059-015-0842-7) contains supplementary material, which is available to authorized users.

Methylome-Wide Association Study of Schizophrenia

et al. 2014

A methylome-wide study of aging using massively parallel sequencing of the methyl-CpG-enriched genomic fraction from blood in over 700 subjects

McClay

Åberg

Clark

et al. 2013

147

The central importance of epigenetics to the aging process is increasingly being recognized. Here we perform a methylome-wide association study (MWAS) of aging in whole blood DNA from 718 individuals, aged 25-92 years (mean = 55). We sequenced the methyl-CpG-enriched genomic DNA fraction, averaging 67.3 million reads per subject, to obtain methylation measurements for the ∼27 million autosomal CpGs in the human genome. Following extensive quality control, we adaptively combined methylation measures for neighboring, highly-correlated CpGs into 4,344,016 CpG blocks with which we performed association testing. Eleven age-associated differentially methylated regions (DMRs) passed Bonferroni correction (P-value < 1.15 × 10^-8). Top findings replicated in an independent sample set of 558 subjects using pyrosequencing of bisulfite-converted DNA (min P-value < 10^-30). To examine biological themes, we selected 70 DMRs with false discovery rate of <0.1. Of these, 42 showed hypomethylation and 28 showed hypermethylation with age. Hypermethylated DMRs were more likely to overlap with CpG islands and shores. Hypomethylated DMRs were more likely to be in regions associated with polycomb/regulatory proteins (e.g. EZH2) or histone modifications H3K27ac, H3K4m1, H3K4m2, H3K4m3 and H3K9ac. Among genes implicated by the top DMRs were protocadherins, homeobox genes, MAPKs and ryanodine receptors. Several of our DMRs are at genes with potential relevance for age-related disease. This study successfully demonstrates the application of next-generation sequencing to MWAS, by interrogating a large proportion of the methylome and returning potentially novel age DMRs, in addition to replicating several loci implicated in previous studies using microarrays.

Large-Scale Gene-Centric Analysis Identifies Novel Variants for Coronary Artery Disease

Butterworth

Farrall

Hardwick

et al. 2011

PLoS Genet

195

Coronary artery disease (CAD) has a significant genetic contribution that is incompletely characterized. To complement genome-wide association (GWA) studies, we conducted a large and systematic candidate gene study of CAD susceptibility, including analysis of many uncommon and functional variants. We examined 49,094 genetic variants in ∼2,100 genes of cardiovascular relevance, using a customised gene array in 15,596 CAD cases and 34,992 controls (11,202 cases and 30,733 controls of European descent; 4,394 cases and 4,259 controls of South Asian origin). We attempted to replicate putative novel associations in an additional 17,121 CAD cases and 40,473 controls. Potential mechanisms through which the novel variants could affect CAD risk were explored through association tests with vascular risk factors and gene expression. We confirmed associations of several previously known CAD susceptibility loci (e.g., 9p21.3: p < 10^-33; LPA: p < 10^-19; 1p13.3: p < 10^-17) as well as three recently discovered loci (COL4A1/COL4A2, ZC3HC1, CYP17A1: p < 5 × 10^-7). However, we found essentially null results for most previously suggested CAD candidate genes. In our replication study of 24 promising common variants, we identified novel associations of variants in or near LIPA, IL5, TRIB1, and ABCG5/ABCG8, with per-allele odds ratios for CAD risk with each of the novel variants ranging from 1.06–1.09. Associations with variants at LIPA, TRIB1, and ABCG5/ABCG8 were supported by gene expression data or effects on lipid levels. Apart from the previously reported variants in LPA, none of the other ∼4,500 low frequency and functional variants showed a strong effect. Associations in South Asians did not differ appreciably from those in Europeans, except for 9p21.3 (per-allele odds ratio: 1.14 versus 1.27 respectively; P for heterogeneity = 0.003). This large-scale gene-centric analysis has identified several novel genes for CAD that relate to diverse biochemical and cellular functions and clarified the literature with regard to many previously suggested genes.
