The current version of the Human Disease Ontology (DO) (http://www.disease-ontology.org) database expands the utility of the ontology for the examination and comparison of genetic variation, phenotype, protein, drug and epitope data through the lens of human disease. DO is a biomedical resource of standardized common and rare disease concepts with stable identifiers organized by disease etiology. The content of DO has had 192 revisions since 2012, including the addition of 760 terms. Thirty-two percent of all terms now include definitions. DO has expanded the number and diversity of research communities and community members by 50+ during the past two years. These community members actively submit term requests, coordinate biomedical resource disease representation and provide expert curation guidance. Since the DO 2012 NAR paper, there have been hundreds of term requests and a steady increase in the number of DO listserv members, twitter followers and DO website usage. DO is moving to a multi-editor model utilizing Protégé to curate DO in web ontology language. This will enable closer collaboration with the Human Phenotype Ontology, EBI's Ontology Working Group, Mouse Genome Informatics and the Monarch Initiative among others, and enhance DO's current asserted view and multiple inferred views through reasoning.
The Human Phenotype Ontology (HPO) is widely used in the rare disease community for differential diagnostics, phenotype-driven analysis of next-generation sequence-variation data, and translational research, but a comparable resource has not been available for common disease. Here, we have developed a concept-recognition procedure that analyzes the frequencies of HPO disease annotations as identified in over five million PubMed abstracts by employing an iterative procedure to optimize precision and recall of the identified terms. We derived disease models for 3,145 common human diseases comprising a total of 132,006 HPO annotations. The HPO now comprises over 250,000 phenotypic annotations for over 10,000 rare and common diseases and can be used for examining the phenotypic overlap among common diseases that share risk alleles, as well as between Mendelian diseases and common diseases linked by genomic location. The annotations, as well as the HPO itself, are freely available.
BackgroundThe Centre for Therapeutic Target Validation (CTTV - https://www.targetvalidation.org/) was established to generate therapeutic target evidence from genome-scale experiments and analyses. CTTV aims to support the validity of therapeutic targets by integrating existing and newly-generated data. Data integration has been achieved in some resources by mapping metadata such as disease and phenotypes to the Experimental Factor Ontology (EFO). Additionally, the relationship between ontology descriptions of rare and common diseases and their phenotypes can offer insights into shared biological mechanisms and potential drug targets. Ontologies are not ideal for representing the sometimes associated type relationship required. This work addresses two challenges; annotation of diverse big data, and representation of complex, sometimes associated relationships between concepts.MethodsSemantic mapping uses a combination of custom scripting, our annotation tool ‘Zooma’, and expert curation. Disease-phenotype associations were generated using literature mining on Europe PubMed Central abstracts, which were manually verified by experts for validity. Representation of the disease-phenotype association was achieved by the Ontology of Biomedical AssociatioN (OBAN), a generic association representation model. OBAN represents associations between a subject and object i.e., disease and its associated phenotypes and the source of evidence for that association. The indirect disease-to-disease associations are exposed through shared phenotypes. This was applied to the use case of linking rare to common diseases at the CTTV.ResultsEFO yields an average of over 80 % of mapping coverage in all data sources. A 42 % precision is obtained from the manual verification of the text-mined disease-phenotype associations. This results in 1452 and 2810 disease-phenotype pairs for IBD and autoimmune disease and contributes towards 11,338 rare diseases associations (merged with existing published work [Am J Hum Genet 97:111-24, 2015]). An OBAN result file is downloadable at http://sourceforge.net/p/efo/code/HEAD/tree/trunk/src/efoassociations/. Twenty common diseases are linked to 85 rare diseases by shared phenotypes. A generalizable OBAN model for association representation is presented in this study.ConclusionsHere we present solutions to large-scale annotation-ontology mapping in the CTTV knowledge base, a process for disease-phenotype mining, and propose a generic association model, ‘OBAN’, as a means to integrate disease using shared phenotypes.AvailabilityEFO is released monthly and available for download at http://www.ebi.ac.uk/efo/.Electronic supplementary materialThe online version of this article (doi:10.1186/s13326-016-0051-7) contains supplementary material, which is available to authorized users.
The BioSamples database at the EBI (http://www.ebi.ac.uk/biosamples) provides an integration point for BioSamples information between technology specific databases at the EBI, projects such as ENCODE and reference collections such as cell lines. The database delivers a unified query interface and API to query sample information across EBI’s databases and provides links back to assay databases. Sample groups are used to manage related samples, e.g. those from an experimental submission, or a single reference collection. Infrastructural improvements include a new user interface with ontological and key word queries, a new query API, a new data submission API, complete RDF data download and a supporting SPARQL endpoint, accessioning at the point of submission to the European Nucleotide Archive and European Genotype Phenotype Archives and improved query response times.
No abstract
No abstract
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.