2021
DOI: 10.1101/2021.10.17.464747
Preprint

RTX-KG2: a system for building a semantically standardized knowledge graph for translational biomedicine

Abstract: Background: Biomedical translational science is increasingly leveraging computational reasoning on large repositories of structured knowledge (such as the Unified Medical Language System (UMLS), the Semantic Medline Database (SemMedDB), ChEMBL, DrugBank, and the Small Molecule Pathway Database (SMPDB)) and data in order to facilitate discovery of new therapeutic targets and modalities. Since 2016, the NCATS Biomedical Data Translator project has been working to federate autonomous reasoning agents and knowledg…

Citations: cited by 5 publications (9 citation statements)
References: 116 publications (240 reference statements)
“…The KPs themselves obtain information from various knowledge sources, giving ARAX access to a combined total of more than 100 knowledge sources (e.g., UniProt, DrugBank, DisGeNET, ChEMBL, MONDO, SemMedDB). Calculating the exact count of underlying knowledge sources is difficult due to the lack of such information in structured form for some KPs, but between RTX-KG2's 73 knowledge sources [19], SPOKE's approximately 40 knowledge sources (footnote 6), and the BioThings Explorer Service Provider's 32 integrated sources (footnote 7), ARAX has access to at least 100 distinct underlying knowledge sources (we note that there is some overlap in knowledge sources between KPs). The true count of underlying knowledge sources is likely notably higher since ARAX uses many other KPs that were not included in the above estimate.…”
Section: Score and Rank Results Graphs: Arax_ranker
Citation type: mentioning
confidence: 99%
“…To facilitate such computational reasoning, there have been numerous efforts to integrate knowledge from various biomedical databases using knowledge graph (KG) abstractions [9, 10, 11] consisting of a labeled multigraph in which each node represents a concept and each edge represents a concept-concept relationship, i.e., a “triple”. Several such biomedical KGs have been described [12, 13, 14, 15, 16, 17, 18], including RTX-KG2, which we developed and described previously [19] and which integrates all of the aforementioned primary databases with a semantic layer that is described by the open-standard Biolink model [20]. Given such a comprehensive KG with a unified semantic layer, a biomedical question can be transformed into a query graph [21] (which represents a search pattern, analogous to a statement in the SPARQL language [22]) and/or a graph analysis workflow in order to generate a list of “answers”, each corresponding to a subgraph of the KG.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
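
The statement above describes a knowledge graph as a labeled multigraph of triples and a query graph as a search pattern whose answers are subgraphs. A minimal sketch of that idea follows. It is not RTX-KG2's or ARAX's implementation: the node identifiers, predicate names, the `Edge` type, and the `match` function are all hypothetical, chosen only for illustration, and variables are marked with a leading `?` by analogy with SPARQL.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """One triple: subject node, edge label (predicate), object node."""
    subject: str
    predicate: str
    object: str


# Toy knowledge graph with made-up identifiers and predicates.
# It is a multigraph: DRUG:1 and DISEASE:1 are joined by two edges
# that differ only in their label.
KG = [
    Edge("DRUG:1", "treats", "DISEASE:1"),
    Edge("DRUG:1", "studied_for", "DISEASE:1"),
    Edge("DRUG:1", "interacts_with", "GENE:1"),
    Edge("DRUG:2", "treats", "DISEASE:1"),
    Edge("GENE:1", "associated_with", "DISEASE:1"),
]


def match(query, kg, binding=None):
    """Return every variable binding under which all query patterns
    match some edge in kg. Strings starting with '?' are variables."""
    binding = binding or {}
    if not query:
        return [binding]
    first, rest = query[0], query[1:]
    results = []
    for edge in kg:
        candidate = dict(binding)
        ok = True
        for pattern, value in zip(first, edge):
            if pattern.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(pattern, value) != value:
                    ok = False
                    break
            elif pattern != value:
                ok = False
                break
        if ok:
            results.extend(match(rest, kg, candidate))
    return results


# Query graph: which drug interacts with a gene associated with DISEASE:1?
two_hop = [
    ("?drug", "interacts_with", "?gene"),
    ("?gene", "associated_with", "DISEASE:1"),
]
answers = match(two_hop, KG)

# Query graph with a variable edge label between two fixed nodes.
parallel = match([("DRUG:1", "?predicate", "DISEASE:1")], KG)
```

Each binding in `answers` identifies one subgraph of the toy graph, which is the sense in which the quoted passage speaks of a list of "answers". The backtracking search here is the simplest possible matcher and would not scale to a graph of realistic size.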
“…The NLP program that produces SemMedDB is SemRep (Kilicoglu et al, 2020). The RTX-KG2 knowledge graph incorporates SemMedDB, which is in turn consumed by mediKanren (Wood et al, 2021). PMI's case analysis process inspired the development of RTX-KG2 after Dr. Steve Ramsey noted how including the research article that supports an edge in the biomedical knowledge graph, especially the specific sentence excerpt on which the edge is based, is critical to enable analysts to maximally leverage human judgment in the interpretation of mediKanren results.…”
Section: Experience With the Semantic Medline Database
Citation type: mentioning
confidence: 99%
“…Accordingly, RTX-KG2 provides literature provenance information wherever it is available, such as for edges from SemMedDB, Jensen Lab Diseases, and RepoDB. Feedback from PMI based on experiences using RTX-KG2 within mediKanren led to multiple improvements for RTX-KG2 including improvements to semantic type annotations, the documentation of constituent sources, and additions of knowledge sources, such as DGIdb (Wood et al, 2021).…”
Section: Experience With the Semantic Medline Database
Citation type: mentioning
confidence: 99%