We have generated a molecular taxonomy of lung carcinoma, the leading cause of cancer death in the United States and worldwide. Using oligonucleotide microarrays, we analyzed mRNA expression levels corresponding to 12,600 transcript sequences in 186 lung tumor samples, including 139 adenocarcinomas resected from the lung. Hierarchical and probabilistic clustering of expression data defined distinct subclasses of lung adenocarcinoma. Among these were tumors with high relative expression of neuroendocrine genes and of type II pneumocyte genes, respectively. Retrospective analysis revealed a less favorable outcome for the adenocarcinomas with neuroendocrine gene expression. The diagnostic potential of expression profiling is emphasized by its ability to discriminate primary lung adenocarcinomas from metastases of extra-pulmonary origin. These results suggest that integration of expression profile data with clinical parameters could aid in diagnosis of lung cancer patients.
Recent advances in cDNA and oligonucleotide DNA arrays have made it possible to measure the abundance of mRNA transcripts for many genes simultaneously. The analysis of such experiments is nontrivial because of large data size and many levels of variation introduced at different stages of the experiments. The analysis is further complicated by the large differences that may exist among different probes used to interrogate the same gene. However, an attractive feature of high-density oligonucleotide arrays such as those produced by photolithography and inkjet technology is the standardization of chip manufacturing and the hybridization process. As a result, probe-specific biases, although significant, are highly reproducible and predictable, and their adverse effect can be reduced by proper modeling and analysis methods. Here, we propose a statistical model for the probe-level data, and develop model-based estimates for gene expression indexes. We also present model-based methods for identifying and handling cross-hybridizing probes and contaminating array regions. Applications of these results will be presented elsewhere.
Oligonucleotide expression array technology (1) has recently been adopted in many areas of biomedical research. As reviewed in ref. 2, 14 to 20 probe pairs are used to interrogate each gene, each probe pair has a Perfect Match (PM) and Mismatch (MM) signal, and the average of the PM-MM differences for all probe pairs in a probe set (called "average difference") is used as an expression index for the target gene. Researchers rely on the average differences as the starting point for "high-level analysis" such as SOM analysis (3) or two-way clustering (4).
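The "average difference" index described above can be sketched in a few lines. This is only an illustration of the arithmetic (the mean of PM-MM over the probe pairs of one probe set), not the vendor's or the authors' software; the function name `average_difference`, the array layout (rows are arrays, columns are probe pairs), and the intensity values are all assumptions made up for the example.

```python
import numpy as np


def average_difference(pm, mm):
    """Return the mean PM - MM difference per array for one probe set.

    pm, mm: array-like of shape (n_arrays, n_probe_pairs) holding the
    Perfect Match and Mismatch intensities (hypothetical layout).
    """
    pm = np.asarray(pm, dtype=float)
    mm = np.asarray(mm, dtype=float)
    if pm.shape != mm.shape:
        raise ValueError("PM and MM must have the same shape")
    # Average the probe-pair differences along the probe axis.
    return (pm - mm).mean(axis=-1)


# Made-up intensities: 2 arrays, 3 probe pairs each.
pm = [[100.0, 200.0, 300.0], [10.0, 20.0, 30.0]]
mm = [[50.0, 100.0, 150.0], [5.0, 10.0, 15.0]]
index = average_difference(pm, mm)
```

Because every probe pair gets equal weight, a single probe with an unusually large PM-MM difference can dominate the index, which is the kind of probe-specific effect the text goes on to discuss.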
Besides the original publications by Affymetrix scientists (1, 5), there have been very few studies on important "low-level" analysis issues such as feature extraction, normalization, and computation of expression indexes (6). One of the most critical issues is the way probe-specific effects are handled. We have found that even after making use of the control information provided by the MM intensity, the information on expression level provided by the different probes for the same gene is still highly variable. We use a set of 21 HuGeneFL arrays to illustrate our discussion. This data set is typical, in terms of quality and sample size, of a data set from a single-laboratory experiment. We have applied the methodology to many sets of arrays from different laboratories and obtained similar results. Each of these 21 arrays contains more than 250,000 features and 7,129 probe sets. Figs. 1 and 2 show data for one probe set in the first six arrays. This probe set (no. 6,457) will be called probe set A hereafter. There are considerable differences in the expression levels of this gene in the samples being interrogated, as the between-array variation in PM-MM differences is substantial. More noteworthy is the dramatic variation among the PM-MM differences of the 20 probes that interrogate the transcript level. ANOVA of the PM-MM differences of this pro...
Many mammalian peripheral tissues have circadian clocks, endogenous oscillators that generate transcriptional rhythms thought to be important for the daily timing of physiological processes. The extent of circadian gene regulation in peripheral tissues is unclear, and to what degree circadian regulation in different tissues involves common or specialized pathways is unknown. Here we report a comparative analysis of circadian gene expression in vivo in mouse liver and heart using oligonucleotide arrays representing 12,488 genes. We find that peripheral circadian gene regulation is extensive (≥8-10% of the genes expressed in each tissue), that the distributions of circadian phases in the two tissues are markedly different, and that very few genes show circadian regulation in both tissues. This specificity of circadian regulation cannot be accounted for by tissue-specific gene expression. Despite this divergence, the clock-regulated genes in liver and heart participate in overlapping, extremely diverse processes. A core set of 37 genes with similar circadian regulation in both tissues includes candidates for new clock genes and output genes, and it contains genes responsive to circulating factors with circadian or diurnal rhythms.
Critical injury in humans induces a genomic storm with simultaneous changes in expression of innate and adaptive immunity genes.
Neuropathological and brain imaging studies suggest that schizophrenia may result from neurodevelopmental defects. Cytoarchitectural studies indicate cellular abnormalities suggestive of a disruption in neuronal connectivity in schizophrenia, particularly in the dorsolateral prefrontal cortex. Yet, the molecular mechanisms underlying these findings remain unclear. To identify molecular substrates associated with schizophrenia, DNA microarray analysis was used to assay gene expression levels in postmortem dorsolateral prefrontal cortex of schizophrenic and control patients. Genes determined to have altered expression levels in schizophrenics relative to controls are involved in a number of biological processes, including synaptic plasticity, neuronal development, neurotransmission, and signal transduction. Most notable was the differential expression of myelination-related genes suggesting a disruption in oligodendrocyte function in schizophrenia.
CisGenome is a software system for analyzing genome-wide chromatin immunoprecipitation (ChIP) data. It is designed to meet all basic needs of ChIP data analyses, including visualization, data normalization, peak detection, false discovery rate (FDR) computation, gene-peak association, and sequence and motif analysis. In addition to implementing previously published ChIP-chip analysis methods, the software contains new statistical methods designed specifically for ChIP-seq data. CisGenome has a modular design so that it supports interactive analyses through a graphic user interface as well as customized batch-mode computation for advanced data mining. A built-in browser allows visualization of array images, signals, gene structure, conservation, and DNA sequence and motif information. We illustrate the use of these tools by a comparative analysis of ChIP-chip and ChIP-seq data for the transcription factor NRSF/REST, a study of ChIP-seq analysis without negative control sample, and an analysis of a novel motif in Nanog- and Sox2-binding regions.
The vertebrate retina is comprised of seven major cell types that are generated in overlapping but well-defined intervals. To identify genes that might regulate retinal development, gene expression in the developing retina was profiled at multiple time points using serial analysis of gene expression (SAGE). The expression patterns of 1,051 genes that showed developmentally dynamic expression by SAGE were investigated using in situ hybridization. A molecular atlas of gene expression in the developing and mature retina was thereby constructed, along with a taxonomic classification of developmental gene expression patterns. Genes were identified that label both temporal and spatial subsets of mitotic progenitor cells. For each developing and mature major retinal cell type, genes selectively expressed in that cell type were identified. The gene expression profiles of retinal Müller glia and mitotic progenitor cells were found to be highly similar, suggesting that Müller glia might serve to produce multiple retinal cell types under the right conditions. In addition, multiple transcripts that were evolutionarily conserved that did not appear to encode open reading frames of more than 100 amino acids in length (“noncoding RNAs”) were found to be dynamically and specifically expressed in developing and mature retinal cell types. Finally, many photoreceptor-enriched genes that mapped to chromosomal intervals containing retinal disease genes were identified. These data serve as a starting point for functional investigations of the roles of these genes in retinal development and physiology.