Anthony D. Long scite author profile

A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes

We develop a Bayesian probabilistic framework for microarray data analysis. At the simplest level, we model log-expression values by independent normal distributions, parameterized by corresponding means and variances with hierarchical prior distributions. We derive point estimates for both parameters and hyperparameters, and regularized expressions for the variance of each gene by combining the empirical variance with a local background variance associated with neighboring genes. An additional hyperparameter, inversely related to the number of empirical observations, determines the strength of the background variance. Simulations show that these point estimates, combined with a t-test, provide a systematic inference approach that compares favorably with simple t-test or fold methods, and partly compensate for the lack of replication.
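
The idea of a regularized variance feeding a t-test can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the weighted-average formula, the function names `regularized_variance` and `regularized_t`, and the default `nu0` are hypothetical choices made for this example. The abstract states only that the empirical variance is combined with a background variance and that a hyperparameter sets the strength of the background.

```python
import math
from statistics import mean, variance


def regularized_variance(values, background_var, nu0=10):
    """Blend a gene's empirical variance with a background variance.

    Assumed form (hypothetical, may differ from the paper's estimator):
    a weighted average in which the background counts as nu0
    pseudo-observations and the empirical variance as n - 1.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates")
    s2 = variance(values)  # unbiased sample variance of the replicates
    return (nu0 * background_var + (n - 1) * s2) / (nu0 + n - 1)


def regularized_t(control, treatment, bg_control, bg_treatment, nu0=10):
    """Welch-style t statistic computed from the regularized variances."""
    v_c = regularized_variance(control, bg_control, nu0)
    v_t = regularized_variance(treatment, bg_treatment, nu0)
    se = math.sqrt(v_c / len(control) + v_t / len(treatment))
    return (mean(treatment) - mean(control)) / se
```

With few replicates the empirical variance is noisy, so a larger `nu0` pulls the estimate toward the background value, while `nu0 = 0` recovers the ordinary sample variance. How the background variance is estimated from neighboring genes is left out of this sketch.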

The Molecular Diversity of Adaptive Convergence

Tenaillon

Rodríguez‐Verdugo

Gaut

et al. 2012

Science

694

788

To estimate the number and diversity of beneficial mutations, we experimentally evolved 115 populations of Escherichia coli to 42.2°C for 2000 generations and sequenced one genome from each population. We identified 1331 total mutations, affecting more than 600 different sites. Few mutations were shared among replicates, but a strong pattern of convergence emerged at the level of genes, operons, and functional complexes. Our experiment uncovered a set of primary functional targets of high temperature, but we estimate that many other beneficial mutations could contribute to similar adaptive outcomes. We inferred the pervasive presence of epistasis among beneficial mutations, which shaped adaptive trajectories into at least two distinct pathways involving mutations either in the RNA polymerase complex or the termination factor rho.

Patterns of DNA sequence polymorphism along chromosome 1 of maize (Zea mays ssp. mays L.)

Tenaillon

Sawkins

Long

et al. 2001

Proc. Natl. Acad. Sci. U.S.A.

654

519

We measured sequence diversity in 21 loci distributed along chromosome 1 of maize (Zea mays ssp. mays L.). For each locus, we sequenced a common sample of 25 individuals representing 16 exotic landraces and nine U.S. inbred lines. The data indicated that maize has an average of one single nucleotide polymorphism (SNP) every 104 bp between two randomly sampled sequences, a level of diversity higher than that of either humans or Drosophila melanogaster. A comparison of genetic diversity between the landrace and inbred samples showed that inbreds retained 77% of the level of diversity of landraces, on average. In addition, Tajima's D values suggest that the frequency distribution of polymorphisms in inbreds was skewed toward fewer rare variants. Tests for selection were applied to all loci, and deviations from neutrality were detected in three loci. Sequence diversity was heterogeneous among loci, but there was no pattern of diversity along the genetic map of chromosome 1. Nonetheless, diversity was correlated (r = 0.65) with sequence-based estimates of the recombination rate. Recombination in our sample was sufficient to break down linkage disequilibrium among SNPs. Intragenic linkage disequilibrium declines within 100-200 bp on average, suggesting that genome-wide surveys for association analyses require SNPs every 100-200 bp.

Single nucleotide polymorphisms (SNPs) are valuable tools for mapping complex phenotypic traits. An SNP either can contribute directly to a phenotype or it can associate with a phenotype as a result of linkage disequilibrium (LD) (1). In either case, it is clear that successful utilization of SNPs requires detailed knowledge of patterns of genetic polymorphism throughout the genome, as well as an understanding of the evolutionary forces shaping those patterns. These forces include genomic factors, such as the distribution of recombination and mutation rates along chromosomes, and evolutionary factors, such as the history of natural selection and population demography (2).

Thus far, SNPs have been surveyed extensively for evolutionary purposes in relatively few systems. The surveys have yielded four important observations about DNA sequence diversity. First, diversity varies among species; for example, Drosophila melanogaster (drosophila) is ≈8- to 13-fold more diverse at the DNA sequence level than humans (3). Second, the effects of natural selection and demography vary among species. Half of the loci examined in drosophila do not fit the neutral equilibrium model of evolution (4), but only 1 of 16 loci analyzed in humans deviates from the neutral model (2). Third, SNPs provide insights into population history and demography. In humans, for example, African populations contain more genetic diversity than non-African populations, and non-

Genome-wide analysis of a long-term evolution experiment with Drosophila

Burke

Dunham

Shahrestani

et al. 2010

Nature

420

544

Experimental evolution systems allow the genomic study of adaptation, and so far this has been done primarily in asexual systems with small genomes, such as bacteria and yeast. Here we present whole-genome resequencing data from Drosophila melanogaster populations that have experienced over 600 generations of laboratory selection for accelerated development. Flies in these selected populations develop from egg to adult ∼20% faster than flies of ancestral control populations, and have evolved a number of other correlated phenotypes. On the basis of 688,520 intermediate-frequency, high-quality single nucleotide polymorphisms, we identify several dozen genomic regions that show strong allele frequency differentiation between a pooled sample of five replicate populations selected for accelerated development and pooled controls. On the basis of resequencing data from a single replicate population with accelerated development, as well as single nucleotide polymorphism data from individual flies from each replicate population, we infer little allele frequency differentiation between replicate populations within a selection treatment. Signatures of selection are qualitatively different than what has been observed in asexual species; in our sexual populations, adaptation is not associated with 'classic' sweeps whereby newly arising, unconditionally advantageous mutations become fixed. More parsimonious explanations include 'incomplete' sweep models, in which mutations have not had enough time to fix, and 'soft' sweep models, in which selection acts on pre-existing, common genetic variants. We conclude that, at least for life history characters such as development time, unconditionally advantageous alleles rarely arise, are associated with small net fitness gains or cannot fix because selection coefficients change over time.

Genetic dissection of a model complex trait using the Drosophila Synthetic Population Resource

King

Merkes

McNeil

et al. 2012

Genome Res.

200

400

Genetic dissection of complex, polygenic trait variation is a key goal of medical and evolutionary genetics. Attempts to identify genetic variants underlying complex traits have been plagued by low mapping resolution in traditional linkage studies, and an inability to identify variants that cumulatively explain the bulk of standing genetic variation in genome-wide association studies (GWAS). Thus, much of the heritability remains unexplained for most complex traits. Here we describe a novel, freely available resource for the Drosophila community consisting of two sets of recombinant inbred lines (RILs), each derived from an advanced generation cross between a different set of eight highly inbred, completely resequenced founders. The Drosophila Synthetic Population Resource (DSPR) has been designed to combine the high mapping resolution offered by multiple generations of recombination, with the high statistical power afforded by a linkage-based design. Here, we detail the properties of the mapping panel of >1600 genotyped RILs, and provide an empirical demonstration of the utility of the approach by genetically dissecting alcohol dehydrogenase (ADH) enzyme activity. We confirm that a large fraction of the variation in this classic quantitative trait is due to allelic variation at the Adh locus, and additionally identify several previously unknown modest-effect trans-acting QTL (quantitative trait loci). Using a unique property of multiparental linkage mapping designs, for each QTL we highlight a relatively small set of candidate causative variants for follow-up work. The DSPR represents an important step toward the ultimate goal of a complete understanding of the genetics of complex traits in the Drosophila model system.

Properties and Power of the Drosophila Synthetic Population Resource for the Routine Dissection of Complex Traits

2012

The Drosophila Synthetic Population Resource (DSPR) is a newly developed multifounder advanced intercross panel consisting of >1600 recombinant inbred lines (RILs) designed for the genetic dissection of complex traits. Here, we describe the inference of the underlying mosaic founder structure for the full set of RILs from a dense set of semicodominant restriction-site-associated DNA (RAD) markers and use simulations to explore how variation in marker density and sequencing coverage affects inference. For a given sequencing effort, marker density is more important than sequence coverage per marker in terms of the amount of genetic information we can infer. We also assessed the power of the DSPR by assigning genotypes at a hidden QTL to each RIL on the basis of the inferred founder state and simulating phenotypes for different experimental designs, different genetic architectures, different sample sizes, and QTL of varying effect sizes. We found the DSPR has both high power (e.g., 84% power to detect a 5% QTL) and high mapping resolution (e.g., ~1.5 cM for a 5% QTL).

The ultimate goal of modern genetics is to determine how molecular genetic variation is translated into organismal phenotypes. The vast majority of continuously varying phenotypes are influenced by many genetic variants that often interact with one another and with environmental factors (Falconer and Mackay 1996; Roff 1997; Lynch and Walsh 1998). This underlying complexity has made identifying causative genetic variants for most traits a steep challenge for which the scientific community has only had limited, albeit increasing, success (Mackay 2001; Chanock et al. 2007; Wellcome Trust Case Control Consortium 2007; McCarthy et al. 2008; Stranger et al. 2011). As a result, there is a large discrepancy between the known heritability of most traits and the fraction of that heritability that can be explained by known causative genetic variants (Manolio et al. 2009; Stranger et al. 2011). This discrepancy has spurred the development of new mapping panels designed to address the shortcomings of existing genome-wide association studies and QTL mapping panels derived from only two parents.

The Drosophila Synthetic Population Resource (DSPR) is one such panel (King et al. 2012) similar in concept to other available linkage-based resources: the mouse Collaborative Cross (Churchill et al. 2004; Aylor et al. 2011; Philip et al. 2011), the Arabidopsis multiparent recombinant inbred line population (AMPRIL) (Huang et al. 2011), the Arabidopsis multiparent advanced generation intercross lines (MAGIC) (Kover et al. 2009), and the maize nested association mapping population (NAM) (Yu et al. 2008; Buckler et al. 2009; McMullen et al. 2009; Li et al. 2011). The DSPR is a linkage-based panel that uses a synthetic population approach (Macdonald and Long 2007). To create the DSPR, two separate synthetic populations were created, each from a 50-generation intercross of 8 inbred founder lines, with one founder line shared between the two populations. From these two synthetic popula...

Diversification of complex butterfly wing patterns by repeated regulatory evolution of a Wnt ligand

Martin

Papa

Nadeau

et al. 2012

Proc. Natl. Acad. Sci. U.S.A.

239

271

Although animals display a rich variety of shapes and patterns, the genetic changes that explain how complex forms arise are still unclear. Here we take advantage of the extensive diversity of Heliconius butterflies to identify a gene that causes adaptive variation of black wing patterns within and between species. Linkage mapping in two species groups, gene-expression analysis in seven species, and pharmacological treatments all indicate that cis-regulatory evolution of the WntA ligand underpins discrete changes in color pattern features across the Heliconius genus. These results illustrate how the direct modulation of morphogen sources can generate a wide array of unique morphologies, thus providing a link between natural genetic variation, pattern formation, and adaptation.

Müllerian mimicry | Wnt pathway | Mendelian genetics | evolutionary-developmental biology

Improved Statistical Inference from DNA Microarray Data Using Analysis of Variance and A Bayesian Statistical Framework

Long

Mangalam

Chan

et al. 2001

Journal of Biological Chemistry

337

259

We describe statistical methods based on the t test that can be conveniently used on high density array data to test for statistically significant differences between treatments. These t tests employ either the observed variance among replicates within treatments or a Bayesian estimate of the variance among replicates within treatments based on a prior estimate obtained from a local estimate of the standard deviation. The Bayesian prior allows statistical inference to be made from microarray data even when experiments are only replicated at nominal levels. We apply these new statistical tests to a data set that examined differential gene expression patterns in IHF 275, 29672-29684). These analyses identify a more biologically reasonable set of candidate genes than those identified using statistical tests not incorporating a Bayesian prior. We also show that statistical tests based on analysis of variance and a Bayesian prior identify genes that are up- or down-regulated following an experimental manipulation more reliably than approaches based only on a t test or fold change. All the described tests are implemented in a simple-to-use web interface called Cyber-T that is located on the University of California at Irvine genomics web site.

