2012
DOI: 10.1136/amiajnl-2011-000503

Usability-driven pruning of large ontologies: the case of SNOMED CT

Abstract: Graph-traversal strategies and frequency data from an authoritative source can prune large biomedical ontologies and produce useful subsets that still exhibit acceptable coverage. However, a clinical corpus closer to the specific use case is preferred when available.
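The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a toy is-a hierarchy stored as a child-to-parents mapping, a hypothetical table of concept frequencies from some corpus, and an arbitrary cutoff `min_count`. Concepts at or above the cutoff are kept, a graph traversal adds all their ancestors so the subset stays connected, and coverage is the share of corpus mentions that land on kept concepts.

```python
from collections import deque


def prune_ontology(parents, freq, min_count):
    """Return concepts with frequency >= min_count plus all their ancestors.

    parents: dict mapping each concept to a list of its direct parents (assumed toy format)
    freq: dict mapping concept to its number of mentions in a corpus (hypothetical data)
    """
    seeds = {c for c, n in freq.items() if n >= min_count and c in parents}
    kept = set(seeds)
    queue = deque(seeds)
    # Breadth-first walk up the is-a links so every kept concept keeps its path to the root.
    while queue:
        concept = queue.popleft()
        for parent in parents.get(concept, ()):
            if parent not in kept:
                kept.add(parent)
                queue.append(parent)
    return kept


def coverage(kept, freq):
    """Fraction of all corpus mentions that refer to a kept concept."""
    total = sum(freq.values())
    if total == 0:
        return 0.0
    return sum(n for c, n in freq.items() if c in kept) / total


# Toy hierarchy and counts, invented purely for illustration.
parents = {
    "root": [],
    "disorder": ["root"],
    "procedure": ["root"],
    "diabetes": ["disorder"],
    "type2_diabetes": ["diabetes"],
    "rare_syndrome": ["disorder"],
    "biopsy": ["procedure"],
}
freq = {"type2_diabetes": 120, "diabetes": 40, "biopsy": 30, "rare_syndrome": 1}

kept = prune_ontology(parents, freq, min_count=10)
print(sorted(kept))
print(f"subset size: {len(kept)}/{len(parents)}, coverage: {coverage(kept, freq):.3f}")
```

Raising `min_count` shrinks the subset at the cost of coverage; which corpus supplies `freq` determines how well that trade-off matches a given use case.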

citations
Cited by 10 publications
(17 citation statements)
references
References 11 publications
“…In considering networks, the common nomenclature (UMLS) is a highly connected network ideal for finding relatedness between professions. A different network, Systematized Nomenclature of Medicine-Clinical Terms (SNOMED-CT) [51,52], has relationships designed as an a-cyclical tree, which would enable easy comparison between two terms but would increase the distance between any two terms. The full network of UMLS is more representative of the complex nature of biomedicine.…”
Section: Discussion (mentioning)
confidence: 99%
“…However, this may go against government or AMA policies. An alternative straightforward approach could be to conduct double coding (ICD-9-CM and ICD-10-CM) for the entangled ICD-9-CM codes and compare motifs in ICD-9-CM and ICD-10-CM in the final reports of the medical system or clinics, such as graph-pruning strategies to subsets offering reasonable coverage 24. However, dual coding is cost-prohibitive as coding to ICD-10-CM codes may require additional patient information that is available in patient charts but unobtainable from the historical ICD-9-CM claims.…”
Section: Discussion (mentioning)
confidence: 99%
“…Ontology modularization techniques include graph-traversal [11–14] and logic-based techniques [15,16], whose extraction strategies depend on the ontology’s topology and its definitional axioms, respectively. The requirement of preserving the original ontology entailments, key to ontology modularization, however, adds a large number of terms to the module that are unlikely to be found in clinical documents, necessarily affecting precision and performance [17].…”
Section: Introduction (mentioning)
confidence: 99%
“…One of the most relevant examples of such a subset is the CORE problem list subset of SNOMED CT (CORE) [20], which is only 1.50% the size of SNOMED CT but covers over 90% of the diagnoses and problem lists found in existing reference datasets. Public authoritative medical corpora are very scarce (with the notable exception of the Multiparameter Intelligent Monitoring in Intensive Care II clinical database [21]), and using a generalist corpus (e.g., MEDLINE) might not provide good enough results because of potential mismatch between content and vocabulary used in scientific abstracts and clinical jargon [17]. To extract the CORE subset, seven large-scale health care institutions collaborated to analyze their datasets.…”
Section: Introduction (mentioning)
confidence: 99%