The BioMart Community Portal (www.biomart.org) is a community-driven effort to provide a unified interface to biomedical databases that are distributed worldwide. The portal provides access to numerous database projects supported by 30 scientific organizations. It includes over 800 different biological datasets spanning genomics, proteomics, model organisms, cancer data, ontology information and more. All resources available through the portal are independently administered and funded by their host organizations. The BioMart data federation technology provides a unified interface to all the available data. The latest version of the portal comes with many new databases that have been created by our ever-growing community. It also comes with better support and extensibility for data analysis and visualization tools. A new addition to our toolbox, the enrichment analysis tool, is now accessible through graphical and web service interfaces. The BioMart Community Portal averages over one million requests per day. Building on this level of service and the wealth of information that has become available, the BioMart Community Portal has introduced a new, more scalable and cheaper alternative to the large data stores maintained by specialized organizations.
We designed a high-density mouse genotyping array containing 623,124 SNPs that capture the known genetic variation present in the laboratory mouse. The array also contains 916,269 invariant genomic probes that are targeted to functional elements and regions known to harbor segmental duplications. The array opens the door to the characterization of genetic diversity, copy number variation, allele specific gene expression and DNA methylation and will extend the successes of human genome-wide association studies to the mouse.
Gene expression and processing during mouse male germ cell maturation (spermatogenesis) is highly specialized. Previous reports have suggested that there is a high incidence of alternative 3′-processing in male germ cell mRNAs, including reduced usage of the canonical polyadenylation signal, AAUAAA. We used EST libraries generated from mouse testicular cells to identify 3′-processing sites used at various stages of spermatogenesis (spermatogonia, spermatocytes and round spermatids) and testicular somatic Sertoli cells. We assessed differences in 3′-processing characteristics in the testicular samples, compared to control sets of widely used 3′-processing sites. Using a new method for comparison of degenerate regulatory elements between sequence samples, we identified significant changes in the use of putative 3′-processing regulatory sequence elements in all spermatogenic cell types. In addition, we observed a trend towards truncated 3′-untranslated regions (3′-UTRs), with the most significant differences apparent in round spermatids. In contrast, Sertoli cells displayed a much smaller trend towards 3′-UTR truncation and no significant difference in 3′-processing regulatory sequences. Finally, we identified a number of genes encoding mRNAs that were specifically subject to alternative 3′-processing during meiosis and postmeiotic development. Our results highlight developmental differences in polyadenylation site choice and in the elements that likely control them during spermatogenesis.
Motivation: Cis-acting regulatory elements are frequently constrained by both sequence content and positioning relative to a functional site, such as a splice or polyadenylation site. We describe an approach to regulatory motif analysis based on non-negative matrix factorization (NMF). Whereas existing pattern recognition algorithms commonly focus primarily on sequence content, our method simultaneously characterizes both positioning and sequence content of putative motifs. Results: Tests on artificially generated sequences show that NMF can faithfully reproduce both positioning and content of test motifs. We show how the variation of the residual sum of squares can be used to give a robust estimate of the number of motifs or patterns in a sequence set. Our analysis distinguishes multiple motifs with significant overlap in sequence content and/or positioning. Finally, we demonstrate the use of the NMF approach through characterization of biologically interesting datasets. Specifically, an analysis of mRNA 3′-processing (cleavage and polyadenylation) sites from a broad range of higher eukaryotes reveals a conserved core pattern of three elements. Contact: joel.graber@jax.org. Supplementary information: Supplementary data are available at Bioinformatics online.
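The idea in the abstract above, factoring a non-negative matrix of sequence features by position and using the residual sum of squares (RSS) to judge how many patterns are present, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the standard Lee-Seung multiplicative updates for the squared-error objective, and the function names (`nmf`, `rss`), the 16-by-60 matrix size, and the two synthetic "motifs" are all hypothetical choices made for illustration.

```python
import numpy as np


def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (features x positions) as W @ H.

    Uses Lee-Seung multiplicative updates for the squared-error
    objective (an assumed algorithm, chosen here for brevity).
    """
    rng = np.random.default_rng(seed)
    n_features, n_positions = V.shape
    W = rng.random((n_features, rank)) + eps  # feature content per pattern
    H = rng.random((rank, n_positions)) + eps  # positioning per pattern
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def rss(V, W, H):
    """Residual sum of squares of the approximation W @ H."""
    return float(np.sum((V - W @ H) ** 2))


# Synthetic data (hypothetical): two patterns with different feature
# content, each peaking at a different position.
positions = np.arange(60)
profile_a = np.exp(-0.5 * ((positions - 15) / 3.0) ** 2)
profile_b = np.exp(-0.5 * ((positions - 45) / 3.0) ** 2)
H_true = np.vstack([profile_a, profile_b])  # 2 x 60 positional profiles

W_true = np.zeros((16, 2))  # 16 features, e.g. dinucleotide counts
W_true[0:4, 0] = [1.0, 0.8, 0.6, 0.4]
W_true[8:12, 1] = [0.5, 1.0, 0.7, 0.3]

noise_rng = np.random.default_rng(1)
V = W_true @ H_true + 0.01 * noise_rng.random((16, 60))

# Fit increasing ranks and record the RSS for each.
rss_by_rank = {}
for k in range(1, 5):
    W, H = nmf(V, k)
    rss_by_rank[k] = rss(V, W, H)
    print(f"rank {k}: RSS = {rss_by_rank[k]:.4f}")
```

In a setup like this, one would expect the RSS to fall sharply until the rank reaches the number of planted patterns and to change little afterwards, which is the kind of behaviour the abstract describes using to estimate the number of motifs. The columns of `W` then describe content and the rows of `H` describe positioning.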
We have created a high-density SNP resource encompassing 7.87 million polymorphic loci across 49 inbred strains of the laboratory mouse by combining data available from public databases and training a hidden Markov model to impute missing genotypes in the combined data. The strong linkage disequilibrium found in dense sets of SNP markers in the laboratory mouse provides the basis for accurate imputation. Using genotypes from eight independent SNP resources, we empirically validated the quality of the imputed genotypes and demonstrated that they are highly reliable for most inbred strains. The imputed SNP resource will be useful for studies of natural variation and complex traits. It will facilitate association study designs by providing high-density SNP genotypes for large numbers of mouse strains. We anticipate that this resource will continue to evolve as new genotype data become available for laboratory mouse strains. The data are available for bulk download or query at http://cgd.jax.org/.
The Gene Expression Database (GXD; www.informatics.jax.org/expression.shtml) is an extensive and well-curated community resource of mouse developmental expression information. Through curation of the scientific literature and by collaborations with large-scale expression projects, GXD collects and integrates data from RNA in situ hybridization, immunohistochemistry, RT-PCR, northern blot and western blot experiments. Expression data from both wild-type and mutant mice are included. The expression data are combined with genetic and phenotypic data in Mouse Genome Informatics (MGI) and made readily accessible to many types of database searches. At present, GXD includes over 1.5 million expression results and more than 300 000 images, all annotated with detailed and standardized metadata. Since our last report in 2014, we have added a large amount of data, we have enhanced data and database infrastructure, and we have implemented many new search and display features. Interface enhancements include: a new Mouse Developmental Anatomy Browser; interactive tissue-by-developmental stage and tissue-by-gene matrix views; capabilities to filter and sort expression data summaries; a batch search utility; gene-based expression overviews; and links to expression data from other species.
The zebrafish has recently emerged as a model system for investigating the developmental roles of glucocorticoid signaling and the mechanisms underlying glucocorticoid-induced developmental programming. To assess the role of the Glucocorticoid Receptor (GR) in such programming, we used CRISPR-Cas9 to produce a new frameshift mutation, GR 369-, which eliminates all potential in-frame initiation codons upstream of the DNA binding domain. Using RNA-seq to ask how this mutation affects the larval transcriptome under both normal conditions and with chronic cortisol treatment, we find that GR mediates most of the effects of the treatment, and paradoxically, that the transcriptome of cortisol-treated larvae is more like that of larvae lacking a GR than that of larvae with a GR, suggesting that the cortisol-treated larvae develop GR resistance. The one transcriptional regulator that was both underexpressed in GR 369- larvae and consistently overexpressed in cortisol-treated larvae was klf9. We therefore used CRISPR-Cas9-mediated mutation of klf9 and RNA-seq to assess Klf9-dependent gene expression in both normal and cortisol-treated larvae. Our results indicate that Klf9 contributes significantly to the transcriptomic response to chronic cortisol exposure, mediating the upregulation of proinflammatory genes that we reported previously.
The vertebrate hypothalamus-pituitary-adrenal (HPA) axis orchestrates physiological, behavioral, and metabolic adjustments required for homeostasis, by dynamically regulating production and secretion of adrenal steroids known as glucocorticoids. In humans the primary glucocorticoid is cortisol, the biological activity of which is mediated by two regulatory proteins in the nuclear receptor family, the ubiquitous glucocorticoid receptor (GR) and the more tissue-restricted mineralocorticoid receptor (MR).
The GR binds cortisol less avidly than the MR and is thus more dynamically regulated over the normal physiological range of cortisol fluctuations 1,2. The GR and MR function both as transcription factors and as non-nuclear signaling proteins, including in the central nervous system where both proteins are highly expressed 1-5. Given that the GR is more widely expressed and more dynamically regulated by cortisol, it is generally thought to be the principal mediator of cortisol-induced genomic responses to circadian rhythms and acute stress 5. An important question for understanding GR function is which downstream transcriptional regulatory genes it regulates, and to what end. Answering this question is important not only for understanding the physiological function and regulation of the GR, but also for deciphering the gene regulatory networks that orchestrate adaptive developmental programming in response to chronic glucocorticoid exposure, such as occurs with chronic early life stress 6. The zebrafish has recently emerged as a model system well-suited to investigating the developmental functions of glucocorticoid signaling and mechanisms underlying stress-induced developmental programming 7-...
Many mRNAs in Caenorhabditis elegans are generated through a trans-splicing reaction that adds one of two classes of spliced leader RNA to an independently transcribed pre-mRNA. SL1 leaders are spliced mostly to pre-mRNAs from genes with outrons, intron-like sequences at the 5′-ends of the pre-mRNAs. In contrast, SL2 leaders are nearly exclusively trans-spliced to genes that occur downstream in polycistronic pre-mRNAs produced from operons. Operon pre-mRNA processing requires separation into individual transcripts, which is accomplished by 3′-processing of upstream genes and spliced leader trans-splicing to the downstream genes. We used a novel computational analysis, based on nonnegative matrix factorization, to identify and characterize significant differences in the cis-acting sequence elements that differentiate various types of functional site, including internal versus terminal 3′-processing sites, and SL1 versus SL2 trans-splicing sites. We describe several key elements, including the U-rich (Ur) element that couples 3′-processing with SL2 trans-splicing, and a novel outron (Ou) element that occurs upstream of SL1 trans-splicing sites. Finally, we present models of the distinct classes of trans-splicing reaction, including SL1 trans-splicing at the outron, SL2 trans-splicing in standard operons, competitive SL1-SL2 trans-splicing in operons with large intergenic separation, and SL1 trans-splicing in SL1-type operons, which have no intergenic separation.