2019
DOI: 10.1101/627513
Preprint

Recapitulation and Retrospective Prediction of Biomedical Associations Using Temporally-enabled Word Embeddings

Abstract: The recent explosion of biomedical knowledge presents both a major opportunity and challenge for scientists tackling complex problems in healthcare. Here we present an approach for synthesizing biomedical knowledge based on a combination of word-embeddings and select cooccurrences. We evaluated our ability to recapitulate and retrospectively predict disease-gene associations from the Online Mendelian Inheritance in Man (OMIM) resource. Our metrics achieved an area under the curve (AUC) value …

Cited by 9 publications (13 citation statements)
References 14 publications
“…In order to capture biomedical literature-based associations, the nferX platform defines two scores: a ‘local score’ and a ‘global score’, as described previously ( Park et al, 2020 ). Briefly, the local score is obtained from applying a traditional natural language processing technique which captures the strength of association between two concepts in a selected corpus of biomedical literature based on the frequency of their co-occurrence normalized by the frequency of each individual concept throughout the corpus.…”
Section: Methods (mentioning)
confidence: 99%
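
The "local score" in the statement above is described only in words: a co-occurrence frequency normalized by the frequency of each individual concept. As a rough illustration, the sketch below computes a pointwise-mutual-information-style score over a toy corpus. The function name `cooccurrence_scores`, the toy documents, and the choice of PMI itself are assumptions made for this example; the citing paper does not give the exact formula used by the nferX platform.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_scores(documents):
    """Return a PMI-style score for each concept pair that co-occurs at least once.

    Assumption: each document is a list of concept names, and two concepts
    "co-occur" when they appear in the same document.
    """
    n_docs = len(documents)
    concept_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        concepts = sorted(set(doc))  # count each concept once per document
        concept_counts.update(concepts)
        pair_counts.update(combinations(concepts, 2))

    scores = {}
    for (a, b), n_ab in pair_counts.items():
        p_ab = n_ab / n_docs                 # co-occurrence frequency
        p_a = concept_counts[a] / n_docs     # frequency of each concept alone
        p_b = concept_counts[b] / n_docs
        scores[(a, b)] = math.log(p_ab / (p_a * p_b))
    return scores


# Hypothetical toy corpus, for illustration only.
toy_corpus = [
    ["BRCA1", "breast cancer"],
    ["BRCA1", "breast cancer", "TP53"],
    ["TP53", "lung cancer"],
    ["lung cancer"],
]
scores = cooccurrence_scores(toy_corpus)
```

A pair that co-occurs more often than its individual frequencies would predict gets a positive score, and a pair that co-occurs exactly as often as chance predicts gets a score of zero.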
“…Note: One key drawback of the word2vec vector cosine similarity ( Park et al, 2020 ; Mikolov et al, 2013b ) method is its inability to get scores for logical queries as described above, because the method ( Mikolov et al, 2013b ) does not address the question of how to get vectors for queries that are logical combinations of tokens.…”
Section: Methods (mentioning)
confidence: 99%
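
To make the drawback in the statement above concrete, here is a minimal sketch of cosine similarity between word vectors. The vectors in `toy_vectors` and the helper `token_similarity` are hypothetical and hand-made for this example; they are not taken from any trained word2vec model. The point is that a score exists only when both sides of the comparison have their own vector, so a logical combination such as "BRCA1 AND TP53" has nothing to look up.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def token_similarity(a, b, vectors):
    """Return the cosine similarity of two tokens, or None if either has no vector."""
    if a not in vectors or b not in vectors:
        return None
    return cosine_similarity(vectors[a], vectors[b])


# Hypothetical 3-dimensional embeddings, for illustration only.
toy_vectors = {
    "BRCA1": [0.9, 0.1, 0.0],
    "TP53": [0.7, 0.3, 0.1],
    "breast cancer": [0.8, 0.2, 0.1],
}

single_token = token_similarity("BRCA1", "breast cancer", toy_vectors)
logical_query = token_similarity("BRCA1 AND TP53", "breast cancer", toy_vectors)  # None
```

Any scheme for scoring such queries has to add its own rule for building a query vector, which is the gap the citing authors point to.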
“…In order to capture biomedical literature based associations, the nferX platform defines two scores: a "local score" and a "global score", as described previously 51 . Briefly, the local score represents a traditional natural language processing technique which captures the strength of association between two concepts in a selected corpus of biomedical literature based on the frequency of their co-occurrence normalized by the frequency of each individual concept throughout the corpus.…”
Section: Unstructured Biomedical Knowledge Synthesis and Triangulatio… (mentioning)
confidence: 99%