Microarray analysis has become a widely used tool for the generation of gene expression data on a genomic scale. Although many significant results have been derived from microarray studies, one limitation has been the lack of standards for presenting and exchanging such data. Here we present a proposal, the Minimum Information About a Microarray Experiment (MIAME), that describes the minimum information required to ensure that microarray data can be easily interpreted and that results derived from its analysis can be independently verified. The ultimate goal of this work is to establish a standard for recording and reporting microarray-based gene expression data, which will in turn facilitate the establishment of databases and public repositories and enable the development of data analysis tools. With respect to MIAME, we concentrate on defining the content and structure of the necessary information rather than the technical format for capturing it.
The results dramatically expand genomic sampling of the domain Archaea and clarify taxonomic designations within a major superphylum. This study, in combination with recently published work on bacterial phyla lacking cultivated representatives, reveals a fascinating phenomenon of major radiations of organisms with small genomes, novel proteome composition, and strong interdependence in both domains.
Toxoplasma gondii infects up to one third of the world's population. A key to the success of T. gondii as a parasite is its ability to persist for the life of its host as bradyzoites within tissue cysts. The glycosylated cyst wall is the key structural feature that facilitates persistence and oral transmission of this parasite. Because most of the antibodies and reagents that recognize the cyst wall recognize carbohydrates, identification of the components of the cyst wall has been technically challenging. We have identified CST1 (TGME49_064660) as a 250 kDa SRS (SAG1 related sequence) domain protein with a large mucin-like domain. CST1 is responsible for the Dolichos biflorus Agglutinin (DBA) lectin binding characteristic of T. gondii cysts. Deletion of CST1 results in reduced cyst number and a fragile brain cyst phenotype characterized by a thinning and disruption of the underlying region of the cyst wall. These defects are reversed by complementation of CST1. Additional complementation experiments demonstrate that the CST1-mucin domain is necessary for the formation of a normal cyst wall structure, the ability of the cyst to resist mechanical stress, and binding of DBA to the cyst wall. RNA-seq transcriptome analysis demonstrated dysregulation of bradyzoite genes within the various cst1 mutants. These results indicate that CST1 functions as a key structural component that confers essential sturdiness to the T. gondii tissue cyst critical for persistence of bradyzoite forms.
Background: Bioinformatics researchers are now confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. Description: An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employs Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date. Conclusions: Hadoop and the MapReduce programming paradigm already have a substantial base in the bioinformatics community, especially in the field of next-generation sequencing analysis, and such use is increasing. This is due to the cost-effectiveness of Hadoop-based analysis on commodity Linux clusters, and in the cloud via data upload to cloud vendors who have implemented Hadoop/HBase; and due to the effectiveness and ease-of-use of the MapReduce method in parallelization of many data analysis algorithms.
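To make the MapReduce programming style mentioned above concrete, here is a minimal single-process sketch in plain Python. It is not Hadoop code and does not come from the paper: the k-mer counting task and the function names (map_kmers, shuffle, reduce_counts, count_kmers) are illustrative assumptions, chosen because counting subsequences in sequencing reads is a simple fit for the map, shuffle, and reduce phases.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k=3):
    """Map step (hypothetical example): emit a (k-mer, 1) pair for each k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_kmers(reads, k=3):
    """Run map, shuffle, and reduce in sequence over a list of reads."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return reduce_counts(shuffle(mapped))
```

In a real cluster framework the map and reduce calls would run in parallel on different nodes and the shuffle would move data between them; here all three run in one process purely to show the data flow.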
Lignocellulosic biofuels are promising as sustainable alternative fuels, but lignin inhibits access of enzymes to cellulose, and by-products of lignin degradation can be toxic to cells. The fast growth, high efficiency and specificity of enzymes employed in the anaerobic litter deconstruction carried out by tropical soil bacteria make these organisms useful templates for improving biofuel production. The facultative anaerobe Enterobacter lignolyticus SCF1 was initially cultivated from Cloud Forest soils in the Luquillo Experimental Forest in Puerto Rico, based on anaerobic growth on lignin as the sole carbon source. The source of the isolate was tropical forest soils that decompose litter rapidly with low and fluctuating redox potentials, where bacteria using oxygen-independent enzymes likely play an important role in decomposition. We have used transcriptomics and proteomics to examine the observed increased growth of SCF1 grown on media amended with lignin compared to unamended growth. Proteomics suggested accelerated xylose uptake and metabolism under lignin-amended growth, with up-regulation of proteins involved in lignin degradation via the 4-hydroxyphenylacetate degradation pathway, catalase/peroxidase enzymes, and the glutathione biosynthesis and glutathione S-transferase (GST) proteins. We also observed increased production of NADH-quinone oxidoreductase, other electron transport chain proteins, and ATP synthase and ATP-binding cassette (ABC) transporters. This suggested the use of lignin as a terminal electron acceptor. We detected significant lignin degradation over time by absorbance, and also used metabolomics to demonstrate moderately significant decreases in xylose concentrations, as well as increases in the metabolic products acetate and formate, in stationary phase in lignin-amended compared to unamended growth conditions. Our data show the advantages of a multi-omics approach toward providing insights as to how lignin may be used in nature by microorganisms coping with poor carbon availability.
Recent advances in experimental methods have provided sufficient data to consider systems as large networks of interconnected components. High-throughput determination of protein-protein interaction networks has led to the observation that topological bottlenecks, proteins defined by high centrality in the network, are enriched in proteins with systems-level phenotypes such as essentiality. Global transcriptional profiling by microarray analysis has been used extensively to characterize systems, for example, examining cellular response to environmental conditions and effects of genetic mutations. These transcriptomic datasets have been used to infer regulatory and functional relationship networks based on co-regulation. We use the context likelihood of relatedness (CLR) method to infer networks from two datasets gathered from the pathogen Salmonella typhimurium: one under a range of environmental culture conditions and the other from deletions of 15 regulators found to be essential in virulence. Bottleneck and hub genes were identified from these inferred networks, and we show for the first time that these genes are significantly more likely to be essential for virulence than their non-bottleneck or non-hub counterparts. Networks generated using simple similarity metrics (correlation and mutual information) did not display this behavior. Overall, this study demonstrates that topology of networks inferred from global transcriptional profiles provides information about the systems-level roles of bottleneck genes. Analysis of the differences between the two CLR-derived networks suggests that the bottleneck nodes are either mediators of transitions between system states or sentinels that reflect the dynamics of these transitions.
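The context likelihood of relatedness (CLR) step mentioned above can be sketched as follows. This is a simplified illustration and not the authors' implementation: it assumes a precomputed symmetric similarity matrix (for example, pairwise mutual information), excludes each gene's self-similarity from its background, and uses the hypothetical function name clr_scores. Each pairwise value is converted to a z-score against the background of each of the two genes, negative z-scores are clipped to zero, and the two z-scores are combined.

```python
import math
from statistics import mean, pstdev


def clr_scores(sim):
    """Return a CLR-style score matrix from a symmetric similarity matrix.

    sim is a list of lists; sim[i][j] is the similarity between genes i and j.
    This is a simplified sketch; details such as the background distribution
    are assumptions made for illustration.
    """
    n = len(sim)
    mu, sd = [], []
    for i in range(n):
        # Background for gene i: its similarity to every other gene.
        background = [sim[i][j] for j in range(n) if j != i]
        mu.append(mean(background))
        sd.append(pstdev(background))

    def z(i, j):
        # z-score of sim[i][j] against gene i's background, clipped at zero.
        if sd[i] == 0:
            return 0.0
        return max(0.0, (sim[i][j] - mu[i]) / sd[i])

    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            combined = math.sqrt(z(i, j) ** 2 + z(j, i) ** 2)
            scores[i][j] = scores[j][i] = combined
    return scores
```

Edges could then be kept where the score exceeds a chosen threshold, and hub or bottleneck genes identified from the resulting network by degree or betweenness centrality; the threshold and the centrality cut-offs are analysis choices not specified here.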
The mraZ and mraW genes are highly conserved in bacteria, both in sequence and in their position at the head of the division and cell wall (dcw) gene cluster. Located directly upstream of the mraZ gene, the Pmra promoter drives the transcription of mraZ and mraW, as well as many essential cell division and cell wall genes, but no regulator of Pmra has been found to date. Although MraZ has structural similarity to the AbrB transition state regulator and the MazE antitoxin and MraW is known to methylate the 16S rRNA, mraZ and mraW null mutants have no detectable phenotypes. Here we show that overproduction of Escherichia coli MraZ inhibited cell division and was lethal in rich medium at high induction levels and in minimal medium at low induction levels. Co-overproduction of MraW suppressed MraZ toxicity, and loss of MraW enhanced MraZ toxicity, suggesting that MraZ and MraW have antagonistic functions. MraZ-green fluorescent protein localized to the nucleoid, suggesting that it binds DNA. Consistent with this idea, purified MraZ directly bound a region of DNA containing three direct repeats between Pmra and the mraZ gene. Excess MraZ reduced the expression of an mraZ-lacZ reporter, suggesting that MraZ acts as a repressor of Pmra, whereas a DNA-binding mutant form of MraZ failed to repress expression. Transcriptome sequencing (RNA-seq) analysis suggested that MraZ also regulates the expression of genes outside the dcw cluster. In support of this, purified MraZ could directly bind to a putative operator site upstream of mioC, one of the repressed genes identified by RNA-seq.
One purpose of the biomedical literature is to report results in sufficient detail so that the methods of data collection and analysis can be independently replicated and verified. Here we present for consideration a minimum information specification for gene expression localization experiments, called the "Minimum Information Specification For In Situ Hybridization and Immunohistochemistry Experiments (MISFISHIE)". It is modelled after the MIAME (Minimum Information About a Microarray Experiment) specification for microarray experiments. Data specifications like MIAME and MISFISHIE specify the information content without dictating a format for encoding that information. The MISFISHIE specification describes six types of information that should be provided for each experiment: Experimental Design, Biomaterials and Treatments, Reporters, Staining, Imaging Data, and Image Characterizations. This specification has benefited the consortium within which it was initially developed and is expected to benefit the wider research community. We welcome feedback from the scientific community to help improve our proposal.