Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.
Alcohol dependence is a heterogeneous psychiatric disorder characterized by high genetic heritability and neuroadaptations occurring from repeated drug exposure. Through an integrated systems approach we observed consistent differences in transcriptome organization within postmortem human brain tissue associated with the lifetime consumption of alcohol. Molecular networks, determined using high-throughput RNA sequencing, for drinking behavior were dominated by neurophysiological targets and signaling mechanisms of alcohol. The systematic structure of gene-sets demonstrates a novel alliance of multiple ion-channels, and related processes, underlying lifetime alcohol consumption. Coordinate expression of these transcripts was enriched for genome-wide association signals in alcohol dependence and a meta-analysis of alcohol self-administration in mice. Further dissection of genes within alcohol consumption networks revealed the potential interaction of alternatively spliced transcripts. For example, expression of a human-specific isoform of the voltage-gated sodium channel subunit SCN4B was significantly correlated to lifetime alcohol consumption. Overall, our work demonstrates novel convergent evidence for biological networks related to excessive alcohol consumption, which may prove fundamentally important in the development of pharmacotherapies for alcohol dependence.
Kaposi's sarcoma-associated herpesvirus (KSHV) is a human tumor virus that encodes 12 precursor microRNAs (pre-miRNAs) that give rise to 17 different known ~22-nucleotide (nt) effector miRNAs. Like all herpesviruses, KSHV has two modes of infection: (1) a latent mode whereby only a subset of viral genes are expressed and (2) a lytic mode during which the full remaining viral genes are expressed. To date, KSHV miRNAs have been mostly identified via analysis of cells that are undergoing latent infection. Here, we developed a method to profile small RNAs (~18-75 nt) from populations of cells undergoing predominantly lytic infection. Using two different next-generation sequencing platforms, we cloned and sequenced both pre-miRNAs and derivative miRNAs. Our analysis shows that the vast majority of viral and host 5p miRNAs are co-terminal with the 5′ end of the cloned pre-miRNAs, consistent with both being defined by microprocessor cleavage. We report the complete repertoire (25 total) of 5p and 3p derivative miRNAs from all 12 previously described KSHV pre-miRNAs. Two KSHV pre-miRNAs, pre-miR-K12-8 and pre-miR-K12-12, encode abundant derivative miRNAs from the previously unreported strands of the pre-miRNA. We identify several novel small RNAs of low abundance, including viral miRNA-offset-RNAs (moRNAs), and antisense viral miRNAs (miRNA-AS) that are encoded antisense to previously reported KSHV pre-miRNAs. Finally, we observe widespread antisense transcription relative to known coding sequences during lytic replication. Despite the enormous potential to form double-stranded RNA in KSHV-infected cells, we observe no evidence for the existence of abundant viral-derived small interfering RNAs (siRNAs).
Background: A number of publications have reported the use of microarray technology to identify gene expression signatures to infer mechanisms and pathways associated with systemic lupus erythematosus (SLE) in human peripheral blood mononuclear cells. However, meta-analysis approaches with microarray data have not been well-explored in SLE. Methods: In this study, a pathway-based meta-analysis was applied to four independent gene expression oligonucleotide microarray data sets to identify gene expression signatures for SLE, and these data sets were confirmed by a fifth independent data set. Results: Differentially expressed genes (DEGs) were identified in each data set by comparing expression microarray data from control samples and SLE samples. Using Ingenuity Pathway Analysis software, pathways associated with the DEGs were identified in each of the four data sets. Using the leave-one-data-set-out pathway-based meta-analysis approach, a 37-gene metasignature was identified. This SLE metasignature clearly distinguished SLE patients from controls as observed by unsupervised learning methods. The final confirmation of the metasignature was achieved by applying the metasignature to a fifth independent data set. Conclusions: The novel pathway-based meta-analysis approach proved to be a useful technique for grouping disparate microarray data sets. This technique allowed for validated conclusions to be drawn across four different data sets and confirmed by an independent fifth data set. The metasignature and pathways identified by using this approach may serve as a source for identifying therapeutic targets for SLE and may possibly be used for diagnostic and monitoring purposes. Moreover, the meta-analysis approach provides a simple, intuitive solution for combining disparate microarray data sets to identify a strong metasignature. Please see Research Highlight: http://genomemedicine.com/content/3/5/30
Date palm is a very important crop in western Asia and northern Africa, and it is the oldest domesticated fruit tree with archaeological records dating back 5000 years. The huge economic value of this crop has generated considerable interest in breeding programs to enhance production of dates. One of the major limitations of these efforts is the uncertainty regarding the number of date palm cultivars, which are currently based on fruit shape, size, color, and taste. Whole mitochondrial and plastid genome sequences were utilized to examine single nucleotide polymorphisms (SNPs) of date palms to evaluate the efficacy of this approach for molecular characterization of cultivars. Mitochondrial and plastid genomes of nine Saudi Arabian cultivars were sequenced. For each species about 60 million 100 bp paired-end reads were generated from total genomic DNA using the Illumina HiSeq 2000 platform. For each cultivar, sequences were aligned separately to the published date palm plastid and mitochondrial reference genomes, and SNPs were identified. The results identified cultivar-specific SNPs for eight of the nine cultivars. Two previous SNP analyses of mitochondrial and plastid genomes identified substantial intra-cultivar (= intra-varietal) polymorphisms in organellar genomes, but these studies did not properly take into account the fact that nearly half of the plastid genome has been integrated into the mitochondrial genome. Filtering all sequencing reads that mapped to both organellar genomes nearly eliminated mitochondrial heteroplasmy, but all plastid SNPs remained heteroplasmic. This investigation provides valuable insights into how to deal with interorganellar DNA transfer in performing SNP analyses from total genomic DNA. The results confirm recent suggestions that plastid heteroplasmy is much more common than previously thought. Finally, low levels of sequence variation in plastid and mitochondrial genomes argue for using nuclear SNPs for molecular characterization of date palm cultivars.
Alkaloid accumulation in plants is activated in response to stress, is limited in distribution and specific alkaloid repertoires are variable across taxa. Rauvolfioideae (Apocynaceae, Gentianales) represents a major center of structural expansion in the monoterpenoid indole alkaloids (MIAs) yielding thousands of unique molecules including highly valuable chemotherapeutics. The paucity of genome-level data for Apocynaceae precludes a deeper understanding of MIA pathway evolution hindering the elucidation of remaining pathway enzymes and the improvement of MIA availability in planta or in vitro. We sequenced the nuclear genome of Rhazya stricta (Apocynaceae, Rauvolfioideae) and present this high quality assembly in comparison with that of coffee (Rubiaceae, Coffea canephora, Gentianales) and others to investigate the evolution of genome-scale features. The annotated Rhazya genome was used to develop the community resource, RhaCyc, a metabolic pathway database. Gene family trees were constructed to identify homologs of MIA pathway genes and to examine their evolutionary history. We found that, unlike Coffea, the Rhazya lineage has experienced many structural rearrangements. Gene tree analyses suggest recent, lineage-specific expansion and diversification among homologs encoding MIA pathway genes in Gentianales and provide candidate sequences with the potential to close gaps in characterized pathways and support prospecting for new MIA production avenues.