The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http://www.ncbi.nlm.nih.gov/refseq/). The RefSeq project leverages the data submitted to the International Nucleotide Sequence Database Collaboration (INSDC) against a combination of computation, manual curation, and collaboration to produce a standard set of stable, non-redundant reference sequences. The RefSeq project augments these reference sequences with current knowledge including publications, functional features and informative nomenclature. The database currently represents sequences from more than 55 000 organisms (>4800 viruses, >40 000 prokaryotes and >10 000 eukaryotes; RefSeq release 71), ranging from a single record to complete genomes. This paper summarizes the current status of the viral, prokaryotic, and eukaryotic branches of the RefSeq project, reports on improvements to data access and details efforts to further expand the taxonomic representation of the collection. We also highlight diverse functional curation initiatives that support multiple uses of RefSeq data including taxonomic validation, genome annotation, comparative genomics, and clinical testing. We summarize our approach to utilizing available RNA-Seq and other data types in our manual curation process for vertebrate, plant, and other species, and describe a new direction for prokaryotic genomes and protein name management.
Adult cancers may derive from stem or early progenitor cells [1,2]. Epigenetic modulation of gene expression is essential for normal function of these early cells, but is highly abnormal in cancers, which often exhibit aberrant promoter CpG island hypermethylation and transcriptional silencing of tumor suppressor genes and pro-differentiation factors [3][4][5]. We find that, for such genes, both normal and malignant embryonic cells generally lack the gene DNA hypermethylation found in adult cancers. In embryonic stem (ES) cells, these genes are held in a "transcription ready" state mediated by a "bivalent" promoter chromatin pattern consisting of the repressive polycomb group (PcG) H3K27me mark plus the active mark, H3K4me. However, embryonic carcinoma (EC) cells add two key repressive marks, H3K9me2 and H3K9me3, both associated with DNA hypermethylated genes in adult cancers [6][7][8]. We hypothesize that cell chromatin patterns and transient silencing of these important growth regulatory genes in stem or progenitor cells of origin for cancer may leave these genes vulnerable to aberrant DNA hypermethylation and heritable gene silencing in adult tumors. Correspondence may be addressed to S.B.B. at sbaylin@jhmi.edu. Competing Interests Statement: The commercial rights to the MSP technique belong to Oncomethylome. S.B.B. and J.G.H. serve as consultants to Oncomethylome and are entitled to royalties from any commercial use of this procedure. Epigenetic gene silencing and associated promoter CpG island DNA hypermethylation are prevalent in all cancer types, and provide an alternative mechanism to mutations by which tumor suppressor genes may be inactivated within a cancer cell [3][4][5]. These epigenetic changes may precede genetic changes in pre-malignant cells and foster the accumulation of additional genetic and epigenetic hits [9].
Adult cancers may derive from stem or early progenitor cells [1,2], and epigenetic modulation of gene expression is essential for normal function of these early cells. We now explore whether DNA hypermethylation and heritable silencing of groups of genes in adult tumor initiation and progression might reflect chromatin properties for these genes associated with a stem or precursor cell of origin. We compared the epigenetic status of a group of genes frequently hypermethylated and silenced in adult cancers (Fig. 1). Among the genes studied, 13 of 29 (45%) are hypermethylated in a single line, HCT-116, of adult colon cancer, but none are hypermethylated in ES cells, and only 3% and 7% were completely methylated in the Tera-1 and Tera-2 EC lines, respectively. Thus, the key epigenetic parameter of promoter CpG island hypermethylation, which is common in a large group of genes in adult cancer cells, does not seem to be a common feature of EC cells. In murine ES cells, many developmental genes are maintained in a state of low transcriptional activity and are available for transcription increases or decreases when differentiation cues are received [11]. Our s...
The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database is a collection of annotated genomic, transcript and protein sequence records derived from data in public sequence archives and from computation, curation and collaboration (http://www.ncbi.nlm.nih.gov/refseq/). We report here on growth of the mammalian and human subsets, changes to NCBI’s eukaryotic annotation pipeline and modifications affecting transcript and protein records. Recent changes to NCBI’s eukaryotic genome annotation pipeline provide higher throughput, and the addition of RNAseq data to the pipeline results in a significant expansion of the number of transcripts and novel exons annotated on mammalian RefSeq genomes. Recent annotation changes include reporting supporting evidence for transcript records, modification of exon feature annotation and the addition of a structured report of gene and sequence attributes of biological interest. We also describe a revised protein annotation policy for alternatively spliced transcripts with more divergent predicted proteins and we summarize the current status of the RefSeqGene project.
The class III histone deacetylase (HDAC), SIRT1, has cancer relevance because it regulates lifespan in multiple organisms, down-regulates p53 function through deacetylation, and is linked to polycomb gene silencing in Drosophila. However, it has not been reported to mediate heterochromatin formation or heritable silencing for endogenous mammalian genes. Herein, we show that SIRT1 localizes to promoters of several aberrantly silenced tumor suppressor genes (TSGs) in which 5′ CpG islands are densely hypermethylated, but not to these same promoters in cell lines in which the promoters are not hypermethylated and the genes are expressed. Heretofore, only type I and II HDACs, through deacetylation of lysines 9 and 14 of histone H3 (H3-K9 and H3-K14, respectively), had been tied to the above TSG silencing. However, inhibition of these enzymes alone fails to re-activate the genes unless DNA methylation is first inhibited. In contrast, inhibition of SIRT1 by pharmacologic, dominant negative, and siRNA (small interfering RNA)–mediated inhibition in breast and colon cancer cells causes increased H4-K16 and H3-K9 acetylation at endogenous promoters and gene re-expression despite full retention of promoter DNA hypermethylation. Furthermore, SIRT1 inhibition affects key phenotypic aspects of cancer cells. We thus have identified a new component of epigenetic TSG silencing that may potentially link some epigenetic changes associated with aging with those found in cancer, and provide new directions for therapeutically targeting these important genes for re-expression.
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts published in life science journals. The Entrez system provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for the Entrez system. Custom implementations of the BLAST program provide sequence-based searching of many specialized datasets. New resources released in the past year include a new PubMed interface, a sequence database search and a gene orthologs page. Additional resources that were updated in the past year include PMC, Bookshelf, My Bibliography, Assembly, RefSeq, viral genomes, the prokaryotic genome annotation pipeline, Genome Workbench, dbSNP, BLAST, Primer-BLAST, IgBLAST and PubChem. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
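As a rough illustration of how the E-utilities act as the programming interface to Entrez, the sketch below builds an ESearch request URL without sending it. The base URL and the `db`, `term`, `retmax` and `retmode` parameters follow NCBI's public E-utilities documentation; the helper name `build_esearch_url` and the example query are assumptions introduced here for illustration only and do not come from the abstract.

```python
from urllib.parse import urlencode

# Base URL of the public E-utilities service, as documented by NCBI.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Return an ESearch URL for `term` in the Entrez database `db`.

    Hypothetical helper: it only assembles the query string and
    performs no network access.
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    # urlencode percent-escapes characters such as '[' and ']'.
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Example query (an assumption): PubMed records with "RefSeq" in the title.
    print(build_esearch_url("pubmed", "RefSeq[Title]", retmax=5))
```

Keeping URL construction separate from the HTTP call makes the request easy to inspect before it is sent with whatever HTTP client is preferred.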
It is increasingly apparent that cancer development depends not only on genetic alterations but also on an abnormal cellular memory, or epigenetic changes, which convey heritable gene expression patterns critical for neoplastic initiation and progression. These aberrant epigenetic mechanisms are manifest in both global changes in chromatin packaging and in localized gene promoter changes that influence the transcription of genes important to the cancer process. An exciting emerging theme is that an understanding of stem cell chromatin control of gene expression, including relationships between histone modifications and DNA methylation, may hold a key to understanding the origins of cancer epigenetic changes. This possibility, coupled with the reversible nature of epigenetics, has enormous significance for the prevention and control of cancer.
Histone H3 lysine 9 (H3K9) and lysine 27 (H3K27) trimethylation are properties of stably silenced heterochromatin whereas H3K9 dimethylation (H3K9me2) is important for euchromatic gene repression. In colorectal cancer cells, all of these marks, as well as the key enzymes which establish them, surround the hMLH1 promoter when it is DNA hypermethylated and aberrantly silenced, but are absent when the gene is unmethylated and fully expressed in a euchromatic state. When the aberrantly silenced gene is DNA demethylated and reexpressed following 5-aza-2′-deoxycytidine treatment, H3K9me1 and H3K9me2 are the only silencing marks that are lost. A series of other silenced and DNA hypermethylated gene promoters behave identically even when the genes are chronically DNA demethylated and reexpressed after genetic knockout of DNA methyltransferases. Our data indicate that when transcription of DNA hypermethylated genes is activated in cancer cells, their promoters remain in an environment with certain heterochromatic characteristics. This finding has important implications for the translational goal of reactivating aberrantly silenced cancer genes as a therapeutic maneuver. (Cancer Res 2006; 66(7): 3541-9)
Many DNA hypermethylated and epigenetically silenced genes in adult cancers are Polycomb group (PcG) marked in embryonic stem (ES) cells. We show that a large region upstream (∼30 kb) of and extending ∼60 kb around one such gene, GATA-4, is organized, in Tera-2 undifferentiated embryonic carcinoma (EC) cells, in a topologically complex multi-loop conformation that is formed by multiple internal long-range contact regions near areas enriched for EZH2, other PcG proteins, and the signature PcG histone mark, H3K27me3. Small interfering RNA (siRNA)–mediated depletion of EZH2 in undifferentiated Tera-2 cells leads to a significant reduction in the frequency of long-range associations at the GATA-4 locus, seemingly dependent on affecting the H3K27me3 enrichments around those chromatin regions, accompanied by a modest increase in GATA-4 transcription. The chromatin loops completely dissolve, accompanied by loss of PcG proteins and H3K27me3 marks, when Tera-2 cells receive differentiation signals which induce a ∼60-fold increase in GATA-4 expression. In colon cancer cells, however, the frequency of the long-range interactions is increased in a setting where GATA-4 has no basal transcription and the loops encompass multiple, abnormally DNA hypermethylated CpG islands, and the methyl-cytosine binding protein MBD2 is localized to these CpG islands, including ones near the gene promoter. Removing DNA methylation through genetic disruption of DNA methyltransferases (DKO cells) leads to loss of MBD2 occupancy and to a decrease in the frequency of long-range contacts, such that these now more resemble those in undifferentiated Tera-2 cells. Our findings reveal unexpected similarities in higher order chromatin conformation between stem/precursor cells and adult cancers. We also provide novel insight that PcG-occupied and H3K27me3-enriched regions can form chromatin loops and physically interact in cis around a single gene in mammalian cells.
The loops associate with a poised, low transcription state in EC cells and, with the addition of DNA methylation, completely repressed transcription in adult cancer cells.