Sean Davis scite author profile

NCBI GEO: archive for functional genomics data sets—update

The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community. The resource supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable. All data are freely available for download in a variety of formats. GEO also provides several web-based tools and strategies to assist users to query, analyse and visualize data. This article reports current status and recent database developments, including the release of GEO2R, an R-based web application that helps users analyse GEO data.

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project

Birney, Stamatoyannopoulos, Dutta et al. 2007, Nature

4,516

2,393

We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.

Orchestrating high-throughput genomic analysis with Bioconductor

Huber, Carey, Gentleman et al. 2015, Nat Methods

3,026

2,327

Bioconductor is an open-source, open-development software project for the analysis and comprehension of high-throughput data in genomics and molecular biology. The project aims to enable interdisciplinary research, collaboration and rapid development of scientific software. Based on the statistical programming language R, Bioconductor comprises 934 interoperable packages contributed by a large, diverse community of scientists. Packages cover a range of bioinformatic and statistical applications. They undergo formal initial review and continuous automated testing. We present an overview for prospective users and contributors.

BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis

Durinck, Moreau, Kasprzyk et al. 2005, Bioinformatics

1,781

1,438

biomaRt is a new Bioconductor package that integrates BioMart data resources with data analysis software in Bioconductor. It can annotate a wide range of gene or gene product identifiers (e.g. Entrez-Gene and Affymetrix probe identifiers) with information such as gene symbol, chromosomal coordinates, Gene Ontology and OMIM annotation. Furthermore biomaRt enables retrieval of genomic sequences and single nucleotide polymorphism information, which can be used in data analysis. Fast and up-to-date data retrieval is possible as the package executes direct SQL queries to the BioMart databases (e.g. Ensembl). The biomaRt package provides a tight integration of large, public or locally installed BioMart databases with data analysis in Bioconductor creating a powerful environment for biological data mining.

Rare Structural Variants Disrupt Multiple Genes in Neurodevelopmental Pathways in Schizophrenia

Walsh, McClellan, McCarthy et al. 2008, Science

1,636

1,315

Schizophrenia is a devastating neurodevelopmental disorder whose genetic influences remain elusive. We hypothesize that individually rare structural variants contribute to the illness. Microdeletions and microduplications >100 kilobases were identified by microarray comparative genomic hybridization of genomic DNA from 150 individuals with schizophrenia and 268 ancestry-matched controls. All variants were validated by high-resolution platforms. Novel deletions and duplications of genes were present in 5% of controls versus 15% of cases and 20% of young-onset cases, both highly significant differences. The association was independently replicated in patients with childhood-onset schizophrenia as compared with their parents. Mutations in cases disrupted genes disproportionately from signaling networks controlling neurodevelopment, including neuregulin and glutamate pathways. These results suggest that multiple, individually rare mutations altering genes in neurodevelopmental pathways contribute to schizophrenia.

High-Resolution Mapping and Characterization of Open Chromatin across the Genome

Boyle, Davis, Shulha et al. 2008, Cell

1,261

1,087

Mapping DNase I hypersensitive (HS) sites is an accurate method of identifying the location of genetic regulatory elements, including promoters, enhancers, silencers, insulators, and locus control regions. We employed high-throughput sequencing and whole-genome tiled array strategies to identify DNase I HS sites within human primary CD4+ T cells. Combining these two technologies, we have created a comprehensive and accurate genome-wide open chromatin map. Surprisingly, only 16%-21% of the identified 94,925 DNase I HS sites are found in promoters or first exons of known genes, but nearly half of the most open sites are in these regions. In conjunction with expression, motif, and chromatin immunoprecipitation data, we find evidence of cell-type-specific characteristics, including the ability to identify transcription start sites and locations of different chromatin marks utilized in these cells. In addition, and unexpectedly, our analyses have uncovered detailed features of nucleosome structure.

A Single IGF1 Allele Is a Major Determinant of Small Size in Dogs

Sutter, Bustamante, Chase et al. 2007, Science

570

494

The domestic dog exhibits greater diversity in body size than any other terrestrial vertebrate. We used a strategy that exploits the breed structure of dogs to investigate the genetic basis of size. First, through a genome-wide scan, we identified a major quantitative trait locus (QTL) on chromosome 15 influencing size variation within a single breed. Second, we examined genetic variation in the 15-megabase interval surrounding the QTL in small and giant breeds and found marked evidence for a selective sweep spanning a single gene (IGF1), encoding insulin-like growth factor 1. A single IGF1 single-nucleotide polymorphism haplotype is common to all small breeds and nearly absent from giant breeds, suggesting that the same causal sequence variant is a major contributor to body size in all small dogs.

Exome sequencing identifies GRIN2A as frequently mutated in melanoma

et al. 2011

The incidence of melanoma is increasing more than any other cancer, and knowledge of its genetic alterations is limited. To systematically analyze such alterations, we performed whole-exome sequencing of 14 matched normal and metastatic tumor DNAs. Using stringent criteria, we identified 68 genes that appeared to be somatically mutated at elevated frequency, many of which are not known to be genetically altered in tumors. Most importantly, we discovered that TRRAP harbored a recurrent mutation that clustered in one position (p.Ser722Phe) in 6 out of 67 affected individuals (~4%), as well as a previously unidentified gene, GRIN2A, which was mutated in 33% of melanoma samples. The nature, pattern and functional evaluation of the TRRAP recurrent mutation suggest that TRRAP functions as an oncogene. Our study provides, to our knowledge, the most comprehensive map of genetic alterations in melanoma to date and suggests that the glutamate signaling pathway is involved in this disease.


NCBI GEO: archive for functional genomics data sets—update
