2015
DOI: 10.1093/database/bav005

Automatic concept recognition using the Human Phenotype Ontology reference and test suite corpora

Abstract: Concept recognition tools rely on the availability of textual corpora to assess their performance and enable the identification of areas for improvement. Typically, corpora are developed for specific purposes, such as gene name recognition. Gene and protein name identification are longstanding goals of biomedical text mining, and therefore a number of different corpora exist. However, phenotypes only recently became an entity of interest for specialized concept recognition systems, and hardly any annotated tex…

Cited by 59 publications (53 citation statements)
References 24 publications
“…With recent advances in personalized medicine, it is becoming increasingly important to provide a computational foundation for phenotype-driven analysis of genomes and other translational research in other fields of medicine. Consequently, we have extended our work to common human disease phenotypes by means of a text-mining approach (9) toward analyzing the 2014 PubMed corpus, which allowed us to infer 132 620 HPO annotations for 3145 common diseases (10). These annotations were validated against a manually curated subset of disorders and experimental results showed an overall precision of 67%.…”
Section: HPO: New Terms Annotations and Ontology Integration
Citation type: mentioning
confidence: 99%
“…Negation detection for HPO terms is provided by the NegEx algorithm [28]. An evaluation of the system over a Pubmed corpus is described in [29]. Here, Biolark achieved an F1 score of 0.95 over a test set of 1 933 instances, corresponding to 460 unique HPO concepts.…”
Section: Biomedical Entity Extraction Bio-yodie and Biolark
Citation type: mentioning
confidence: 99%
“…For curation of ontology-based genotype–phenotype associations (including disease-phenotypic profiles), we are transitioning to the WebPhenote platform (http://create.monarchinitiative.org), which allows a variety of disease entities to be connected to phenotypic descriptors. We also make use of text mining to create seed disease-phenotype associations using the Bio-Lark toolkit (39), which are then manually curated. Most recently, we have performed a large-scale annotation of PubMed to extract common disease-phenotype associations (40).…”
Section: Introduction
Citation type: mentioning
confidence: 99%