We describe statistical methods based on the t test that can be conveniently used on high density array data to test for statistically significant differences between treatments. These t tests employ either the observed variance among replicates within treatments or a Bayesian estimate of the variance among replicates within treatments based on a prior estimate obtained from a local estimate of the standard deviation. The Bayesian prior allows statistical inference to be made from microarray data even when experiments are only replicated at nominal levels. We apply these new statistical tests to a data set that examined differential gene expression patterns in IHF 275, 29672-29684). These analyses identify a more biologically reasonable set of candidate genes than those identified using statistical tests not incorporating a Bayesian prior. We also show that statistical tests based on analysis of variance and a Bayesian prior identify genes that are up- or down-regulated following an experimental manipulation more reliably than approaches based only on a t test or fold change. All the described tests are implemented in a simple-to-use web interface called Cyber-T that is located on the University of California at Irvine genomics web site.
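The idea of stabilizing a replicate variance with a prior can be sketched as follows. This is a minimal illustration, not the Cyber-T implementation: the abstract does not state the weighting formula, so the weighted-average form below, the function names, and the `prior_weight` parameter are assumptions made for the example. The prior variance would in practice come from a local estimate of the standard deviation, as the abstract describes; here it is simply passed in.

```python
import math
from statistics import mean, variance


def regularized_variance(values, prior_var, prior_weight):
    """Blend the observed replicate variance with a prior variance.

    Assumed form (not taken from the abstract):
        (prior_weight * prior_var + (n - 1) * s2) / (prior_weight + n - 2)
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates")
    if prior_weight + n - 2 <= 0:
        raise ValueError("prior_weight + n must exceed 2")
    s2 = variance(values)  # sample variance with an n - 1 denominator
    return (prior_weight * prior_var + (n - 1) * s2) / (prior_weight + n - 2)


def regularized_t(a, b, prior_var_a, prior_var_b, prior_weight):
    """Unequal-variance t statistic computed from the regularized variances."""
    var_a = regularized_variance(a, prior_var_a, prior_weight)
    var_b = regularized_variance(b, prior_var_b, prior_weight)
    std_err = math.sqrt(var_a / len(a) + var_b / len(b))
    return (mean(a) - mean(b)) / std_err


# Two replicates per treatment: the raw variances are tiny by chance,
# so an unregularized t statistic would be inflated.
control = [1.0, 1.1]
treated = [1.6, 1.7]
t_stat = regularized_t(control, treated, 0.5, 0.5, prior_weight=10)
```

With only two replicates the prior dominates the variance estimate, which is the behaviour that makes inference possible at nominal replication levels.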
Wnt regulation of gene expression requires binding of LEF/T-cell factor (LEF/TCF) transcription factors to Wnt response elements (WREs) and recruitment of the activator β-catenin. There are significant differences in the abilities of LEF/TCF family members to regulate Wnt target genes. For example, alternatively spliced isoforms of TCF-1 and TCF-4 with a C-terminal "E" tail are uniquely potent in their activation of LEF1 and CDX1. Here we report that the mechanism responsible for this unique activity is an auxiliary 30-amino-acid DNA interaction motif referred to here as the "cysteine clamp" (or C-clamp). The C-clamp contains invariant cysteine, aromatic, and basic residues, and surface plasmon resonance (SPR) studies with recombinant C-clamp protein showed that it binds double-stranded DNA but not single-stranded DNA or RNA (equilibrium dissociation constant = 16 nM). CASTing (Cyclic Amplification and Selection of Targets) experiments were used to test whether this motif influences WRE recognition. Full-length LEF-1, TCF-1E, and TCF-1E with a mutated C-clamp all bind nearly identical WREs (TYYCTTTGATSTT), showing that the C-clamp does not alter WRE specificity. However, a GC element downstream of the WRE (RCCG) is enriched in wild-type TCF-1E binding sites but not in mutant TCF-1E binding sites. We conclude that the C-clamp is a sequence-specific DNA binding motif. C-clamp mutations destroy the ability of β-catenin to regulate the LEF1 promoter, and they severely impair the ability of TCF-1 to regulate growth in colon cancer cells. Thus, E-tail isoforms of TCFs utilize two DNA binding activities to access a subset of Wnt targets important for cell growth.
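The two consensus sequences above are written in IUPAC nucleotide codes (Y = C or T, S = C or G, R = A or G). A minimal sketch of scanning a sequence for a WRE followed by a downstream GC element is shown below. The function names and the `max_gap` window are hypothetical, since the abstract does not state the spacing between the two elements, and only the forward strand is searched.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}


def iupac_to_regex(motif):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif))


WRE = iupac_to_regex("TYYCTTTGATSTT")  # consensus quoted in the abstract
GC_ELEMENT = iupac_to_regex("RCCG")    # downstream element from the abstract


def find_wre_with_gc(seq, max_gap=10):
    """Return (wre_start, wre_sequence, gc_offset) for each WRE hit.

    gc_offset is the distance from the end of the WRE to the first GC
    element within max_gap bases, or None if no GC element is found.
    max_gap is an assumed window, not a value from the source.
    """
    seq = seq.upper()
    hits = []
    for match in WRE.finditer(seq):
        window = seq[match.end(): match.end() + max_gap + 4]
        gc = GC_ELEMENT.search(window)
        hits.append((match.start(), match.group(), gc.start() if gc else None))
    return hits


example = "AA" + "TCTCTTTGATCTT" + "AC" + "GCCG" + "TT"
hits = find_wre_with_gc(example)
```

Here `hits` holds one entry for the WRE starting at index 2, with the GC element beginning two bases after the WRE ends.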
The anterior pituitary gland provides a model for investigating the molecular basis for the appearance of phenotypically distinct cell types within an organ, a central question in development. The rat prolactin and growth hormone genes are expressed selectively in distinct cell types (lactotrophs and somatotrophs, respectively) of the anterior pituitary gland, reflecting differential mechanisms of gene activation or restriction, as a result of the interactions of multiple factors binding to these genes. We find that when the pituitary-specific 33-kD transcription factor Pit-1, expressed normally in both lactotrophs and somatotrophs, is expressed in either the heterologous HeLa cell line or in bacteria, it binds to and activates transcription from both growth hormone and prolactin promoters in vitro at levels even 10-fold lower than those normally present in pituitary cells. This suggests that a single factor, Pit-1, may be capable of activating the expression of two genes that define different anterior pituitary cell phenotypes. Because a putative lactotroph cell line (235-1) that does not express the growth hormone gene, but only the prolactin gene, appears to contain high levels of functional Pit-1, a mechanism selectively preventing growth hormone gene expression may, in part, account for the lactotroph phenotype.
Short cis-active sequences of the rat prolactin or Moloney murine leukemia virus genes transfer transcriptional regulation by both epidermal growth factor and phorbol esters to fusion genes. These sequences act in a position- and orientation-independent manner. Competitive binding analyses with nuclear extracts from stimulated and unstimulated cells suggest that different trans-acting factors associate with the regulatory sequence of each gene. A model is proposed suggesting that both epidermal growth factor and phorbol esters stimulate the transcription of responsive genes via discrete classes of hormone-dependent, enhancer-like elements that bind different trans-acting factors, even in the absence of hormone stimulation.
Bioinformatics research is often difficult to do with commercial software. The Open Source BioPerl, BioPython and Biojava projects provide toolkits with multiple functionality that make it easier to create customised pipelines or analysis. This review briefly compares the quirks of the underlying languages and the functionality, documentation, utility and relative advantages of the Bio counterparts, particularly from the point of view of the beginning biologist programmer.
Recent studies show that human-specific LINE1s (L1HS) play a key role in the development of the central nervous system (CNS) and its disorders, and that their transpositions within the human genome are more common than previously thought. Many polymorphic L1HS, that is, present or absent across individuals, are not annotated in the current release of the genome and are customarily termed "non-reference L1s." We developed an analytical workflow to identify L1 polymorphic insertions with next-generation sequencing (NGS) using data from a family in which schizophrenia (SZ) segregates. Our workflow exploits two independent algorithms to detect non-reference L1 insertions, performs local de novo alignment of the regions harboring predicted L1 insertions and resolves the L1 subfamily designation from the de novo assembled sequence. We found 110 non-reference L1 polymorphic loci exhibiting Mendelian inheritance, the vast majority of which are already reported in dbRIP and/or euL1db, thus confirming their status as non-reference L1 polymorphic insertions. Four previously undetected L1 polymorphic loci were confirmed by PCR amplification and direct sequencing of the insert. A large fraction of our non-reference L1s is located within the open reading frame of protein-coding genes that belong to pathways already implicated in the pathogenesis of schizophrenia. The finding of these polymorphic variants among SZ offspring is intriguing and suggestive of a putative pathogenic role. Our data show the utility of NGS to uncover L1 polymorphic insertions, a neglected type of genetic variant with the potential, like SNVs and CNVs, to influence the risk of developing schizophrenia. © 2016 Wiley Periodicals, Inc.
Because gene expression profiles are highly sensitive to sample and processing conditions, it is crucial to accurately represent these conditions along with the numeric data in a way that allows the conditions to be part of a query. The GeneX™ project is intended to provide an Open Source database and integrated tool set that will allow researchers to store and evaluate their gene expression data and, moreover, such evaluation will be independent of the technology used to obtain the data.