The Gene Ontology (GO) project (http://www.geneontology.org/) provides structured, controlled vocabularies and classifications that cover several domains of molecular and cellular biology and are freely available for community use in the annotation of genes, gene products and sequences. Many model organism databases and genome annotation groups use the GO and contribute their annotation sets to the GO resource. The GO database integrates the vocabularies and contributed annotations and provides full access to this information in several formats. Members of the GO Consortium continually work collectively, involving outside experts as needed, to expand and update the GO vocabularies. The GO Web resource also provides access to extensive documentation about the GO project and links to applications that use GO data for functional analyses.
The Gene Ontology Consortium (GOC) provides the most comprehensive resource currently available for computable knowledge regarding the functions of genes and gene products. Here, we report the advances of the consortium over the past two years. The new GO-CAM annotation framework was notably improved, and we formalized the model with a computational schema to check and validate the rapidly increasing repository of 2,838 GO-CAMs. In addition, we describe the impacts of several collaborations to refine GO and report a 10% increase in the number of GO annotations, a 25% increase in annotated gene products, and over 9,400 new scientific articles annotated. As the project matures, we continue our efforts to review older annotations in light of newer findings and to maintain consistency with other ontologies. As a result, 20,000 annotations derived from experimental data were reviewed, corresponding to 2.5% of experimental GO annotations. The website (http://geneontology.org) was redesigned for quick access to documentation, downloads and tools. To maintain an accurate resource and support traceability and reproducibility, we have made available a historical archive covering the past 15 years of GO data with a consistent format and file structure for both the ontology and annotations.
The social amoebae are exceptional in their ability to alternate between unicellular and multicellular forms. Here we describe the genome of the best-studied member of this group, Dictyostelium discoideum. The gene-dense chromosomes encode ~12,500 predicted proteins, a high proportion of which have long repetitive amino acid tracts. There are many genes for polyketide synthases and ABC transporters, suggesting an extensive secondary metabolism for producing and exporting small molecules. The genome is rich in complex repeats, one class of which is clustered and may serve as centromeres. Partial copies of the extrachromosomal rDNA element are found at the ends of each chromosome, suggesting a novel telomere structure and the use of a common mechanism to maintain both the rDNA and chromosomal termini. A proteome-based phylogeny shows that the amoebozoa diverged from the animal/fungal lineage after the plant/animal split, but Dictyostelium appears to have retained more of the diversity of the ancestral genome than either of these two groups.
The amoebozoa are a richly diverse group of organisms whose genomes remain largely unexplored. The soil-dwelling social amoeba Dictyostelium discoideum has been actively studied for the past fifty years and has contributed greatly to our understanding of cellular motility, signalling and interaction [1]. For example, studies in Dictyostelium provided the first descriptions of a eukaryotic cell chemo-attractant and a cell-cell adhesion protein [2, 3].
Dictyostelium amoebae inhabit forest soil consuming bacteria and yeast, which they track by chemotaxis. Starvation, however, prompts the solitary cells to aggregate and to develop as a true multicellular organism, producing a fruiting body comprised of a cellular, cellulosic stalk supporting a bolus of spores.
Thus, Dictyostelium has evolved mechanisms that direct the differentiation of a homogeneous population of cells into distinct cell types, regulate the proportions between tissues and orchestrate the construction of an effective structure for the dispersal of spores [4]. Many of the genes necessary for these processes in Dictyostelium were also inherited by metazoa and fashioned through evolution for use within many different modes of development.
The amoebozoa are also noteworthy as representing one of the earliest branches from the last common ancestor of all eukaryotes. Each of the surviving branches of the crown group of eukaryotes provides an example of the ways in which the ancestral genome has been sculpted and adapted by lineage-specific gene duplication, divergence and deletion. Comparison between representatives of these branches promises to shed light not only on the nature and content of the ancestral eukaryotic genome, but on the diversity of ways in which its components have been adapted to meet the needs of complex organisms. The genome of Dictyosteliu...
Eichinger et al. Nature. Author manuscript; available in PMC 2006 January 27.
Candidate gene and genome-wide association studies (GWAS) have identified genetic variants that modulate risk for human disease; many of these associations require further study to replicate the results. Here we report the first large-scale application of the phenome-wide association study (PheWAS) paradigm within electronic medical records (EMRs), an unbiased approach to replication and discovery that interrogates relationships between targeted genotypes and multiple phenotypes. We scanned for associations between 3,144 single-nucleotide polymorphisms (previously implicated by GWAS as mediators of human traits) and 1,358 EMR-derived phenotypes in 13,835 individuals of European ancestry. This PheWAS replicated 66% (51/77) of sufficiently powered prior GWAS associations and revealed 63 potentially pleiotropic associations with P < 4.6 × 10⁻⁶ (false discovery rate < 0.1); the strongest of these novel associations were replicated in an independent cohort (n = 7,406). These findings validate PheWAS as a tool to allow unbiased interrogation across multiple phenotypes in EMR-based cohorts and to enhance analysis of the genomic basis of human disease.
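As a rough illustration of the two statistics quoted in the PheWAS abstract above, the following Python sketch recomputes the replication rate (51/77) and shows how a false discovery rate threshold can select associations from a list of P values. The abstract does not say which FDR procedure the authors used; the Benjamini-Hochberg step-up procedure here is an assumption, and the function names and the toy P values are hypothetical, not study data.

```python
def replication_rate(replicated, tested):
    """Percentage of prior associations that were replicated."""
    return 100.0 * replicated / tested


def benjamini_hochberg(p_values, fdr=0.1):
    """Return the indices of tests kept by the Benjamini-Hochberg procedure.

    Assumption: BH is used only as a common example of FDR control; the
    abstract does not name the procedure behind its FDR < 0.1 threshold.
    """
    m = len(p_values)
    # Rank the tests from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose P value is at or below fdr * rank / m.
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Figures taken from the abstract: 51 of 77 sufficiently powered associations.
rate = replication_rate(51, 77)

# Toy P values, invented purely for illustration.
toy_p = [0.001, 0.04, 0.03, 0.5, 0.0002]
kept = benjamini_hochberg(toy_p, fdr=0.1)
```

With the toy input, every test except the one with P = 0.5 is kept. In a scan on the scale described in the abstract (3,144 variants by 1,358 phenotypes), the per-test threshold becomes far stricter, which is in keeping with the small P value cutoff the abstract reports.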
The Electronic Medical Records and Genomics (eMERGE) Network is a National Human Genome Research Institute (NHGRI)-funded consortium engaged in the development of methods and best-practices for utilizing the Electronic Medical Record (EMR) as a tool for genomic research. Now in its sixth year, its second funding cycle and comprising nine research groups and a coordinating center, the network has played a major role in validating the concept that clinical data derived from EMRs can be used successfully for genomic research. Current work is advancing knowledge in multiple disciplines at the intersection of genomics and healthcare informatics, particularly electronic phenotyping, genome-wide association studies, genomic medicine implementation and the ethical and regulatory issues associated with genomics research and returning results to study participants. Here we describe the evolution, accomplishments, opportunities and challenges of the network since its inception as a five-group consortium focused on genotype-phenotype associations for genomic discovery to its current form as a nine-group consortium pivoting towards implementation of genomic medicine.
Introduction: The eMERGE (electronic MEdical Records and GEnomics) Network is an NHGRI-supported consortium of five institutions to explore the utility of DNA repositories coupled to Electronic Medical Record (EMR) systems for advancing discovery in genome science. eMERGE also includes a special emphasis on the ethical, legal and social issues related to these endeavors.
Organization: The five sites are supported by an Administrative Coordinating Center. Setting of network goals is initiated by working groups: (1) Genomics, (2) Informatics, and (3) Consent & Community Consultation, which also includes active participation by investigators outside the eMERGE funded sites, and (4) Return of Results Oversight Committee. The Steering Committee, comprised of site PIs and representatives and NHGRI staff, meets three times per year, once per year with the External Scientific Panel.
Current progress: The primary site-specific phenotypes for which samples have undergone genome-wide association study (GWAS) genotyping are cataract and HDL, dementia, electrocardiographic QRS duration, peripheral arterial disease, and type 2 diabetes. A GWAS is also being undertaken for resistant hypertension in ≈2,000 additional samples identified across the network sites, to be added to data available for samples already genotyped. Funded by ARRA supplements, secondary phenotypes have been added at all sites to leverage the genotyping data, and hypothyroidism is being analyzed as a cross-network phenotype. Results are being posted in dbGaP.
Other key eMERGE activities include evaluation of the issues associated with cross-site deployment of common algorithms to identify cases and controls in EMRs, data privacy of genomic and clinically-derived data, developing approaches for large-scale meta-analysis of GWAS data across five sites, and a community consultation and consent initiative at each site.
Future activities: Plans are underway to expand the network in diversity of populations and incorporation of GWAS findings into clinical care.
Summary: By combining advanced clinical informatics, genome science, and community consultation, eMERGE represents a first step in the development of data-driven approaches to incorporate genomic information into routine healthcare delivery.
Although the potential for genomics to contribute to clinical care has long been anticipated, the pace of defining the risks and benefits of incorporating genomic findings into medical practice has been relatively slow. Several institutions have recently begun genomic medicine programs, encountering many of the same obstacles and developing the same solutions, often independently. Recognizing that successful early experiences can inform subsequent efforts, the National Human Genome Research Institute brought together a number of these groups to describe their ongoing projects and challenges, identify common infrastructure and research needs, and outline an implementation framework for investigating and introducing similar programs elsewhere. Chief among the challenges were limited evidence and consensus on which genomic variants were medically relevant; lack of reimbursement for genomically driven interventions; and burden to patients and clinicians of assaying, reporting, intervening, and following up genomic findings. Key infrastructure needs included an openly accessible knowledge base capturing sequence variants and their phenotypic associations and a framework for defining and cataloging clinically actionable variants. Multiple institutions are actively engaged in using genomic information in clinical care. Much of this work is being done in isolation and would benefit from more structured collaboration and sharing of best practices.
Genet Med 2013:15(4):258–267
Many modern human genomes retain DNA inherited from interbreeding with archaic hominins, such as Neanderthals, yet the influence of this admixture on human traits is largely unknown. We analyzed the contribution of common Neanderthal variants to over 1,000 electronic health record (EHR)-derived phenotypes in ~28,000 adults of European ancestry. We discovered and replicated associations of Neanderthal alleles with neurological, psychiatric, immunological, and dermatological phenotypes. Neanderthal alleles together explain a significant fraction of the variation in risk for depression and skin lesions resulting from sun exposure (actinic keratosis), and individual Neanderthal alleles are significantly associated with specific human phenotypes, including hypercoagulation and tobacco use. Our results establish that archaic admixture influences disease risk in modern humans, provide hypotheses about the effects of hundreds of Neanderthal haplotypes and demonstrate the utility of EHR data in evolutionary analyses.