2016
DOI: 10.1101/039008
Preprint

Two Similarity Metrics for Medical Subject Headings (MeSH): An Aid to Biomedical Text Mining and Author Name Disambiguation

Abstract: In the present paper, we have created and characterized several similarity metrics for relating any

Cited by 4 publications (4 citation statements)
References 15 publications
“…To characterize these articles and identify any differences, we extracted other MeSH terms associated with each of the two corpora. Smalheiser and Bonifield recently proposed a metric for quantifying the semantic relatedness of two MeSH terms through their tendency to co‐occur in the same article's MEDLINE entry 21. We used this metric and ranked associated MeSH terms according to their odds ratio and kept the top 40 associated MeSH terms (Table ).…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…Apart from author name disambiguation, what other types of text modeling tasks might benefit from using our implicit term and text similarity metrics? Word2vec-based metrics have been very actively explored for a variety of extrinsic tasks such as named entity recognition, part of speech tagging, ranking PubMed articles for semantic relatedness [18,26], and word sense disambiguation [37]. One should be cautioned that performance on one set of tasks does not necessarily correlate with performance on other tasks [38,39], yet we feel that our implicit term similarity metrics should be explored to see if they have any complementary value in modeling the latter applications.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…For example, recently we created journal similarity metrics that relate any two journals A and B according to a) how similar they are topically, b) how similar they are in terms of sharing the same authors, and c) how likely it is that an author who publishes in A will publish later in B [17]. We also created MeSH (Medical Subject Headings) similarity metrics that relate any two MeSH terms A and B according to a) how often the two terms co-occur in a single article, and b) how often the two terms are found within the body of articles published by a single author [18]. Such implicit metrics are valuable in text mining models such as are used in the Author-ity author name disambiguation project [19,20], where the goal is to consider two articles that share the same author [lastname, first initial] and estimate the probability that the two articles refer to the same author-individual.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…In order to identify drug mechanisms supported by colocalizations and to have a quantitative assessment of tissue-trait similarity, we used similarity in the MeSH vocabulary. Similarity is determined through odds ratio of co-occurrence in article MeSH terms in the PubMed corpus (accessed October 10, 2018), an approach based on [13]. For identifying similar drug mechanisms, we used an odds ratio cutoff of 20 as this corresponded on average to the similarity cutoff used in the previous work.…”
Section: Pubmed Odds Ratio Similarity
Citation type: mentioning
confidence: 99%
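
Several of the statements above describe MeSH similarity as an odds ratio of co-occurrence within article MeSH annotations, followed by either a top-40 ranking or a cutoff of 20. The following is a minimal Python sketch of that general idea, not the cited authors' implementation. The function names, the toy data, and the 0.5 smoothing constant (added to each cell of the 2x2 table to avoid division by zero) are all assumptions made for illustration; the preprint's exact formula may differ.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_odds_ratios(articles, smoothing=0.5):
    """Map each co-occurring term pair (a, b), with a < b, to an odds ratio.

    `articles` is an iterable of sets of MeSH terms, one set per article.
    `smoothing` is added to every cell of the 2x2 contingency table; this
    is an assumption of the sketch, not something stated in the source.
    """
    articles = [frozenset(a) for a in articles]
    n = len(articles)
    term_counts = Counter(t for a in articles for t in a)
    pair_counts = Counter(
        pair for a in articles for pair in combinations(sorted(a), 2)
    )
    ratios = {}
    for (a, b), both in pair_counts.items():
        only_a = term_counts[a] - both          # articles with a but not b
        only_b = term_counts[b] - both          # articles with b but not a
        neither = n - both - only_a - only_b    # articles with neither term
        ratios[(a, b)] = ((both + smoothing) * (neither + smoothing)) / (
            (only_a + smoothing) * (only_b + smoothing)
        )
    return ratios


def top_associated(ratios, term, k=40):
    """Return up to k (other_term, odds_ratio) pairs for `term`, highest first."""
    related = [
        (b if a == term else a, r)
        for (a, b), r in ratios.items()
        if term in (a, b)
    ]
    return sorted(related, key=lambda item: item[1], reverse=True)[:k]


def similar_pairs(ratios, cutoff=20.0):
    """Return the set of term pairs whose odds ratio meets the cutoff."""
    return {pair for pair, r in ratios.items() if r >= cutoff}


# Hypothetical toy corpus; the letters stand in for MeSH terms.
toy_articles = [{"A", "B"}, {"A", "B"}, {"A", "C"}, {"C"}, {"D"}, {"D"}]
ratios = cooccurrence_odds_ratios(toy_articles)
best_for_a = top_associated(ratios, "A", k=1)
above_cutoff = similar_pairs(ratios, cutoff=20.0)
```

With real data, `articles` would be built from the MeSH headings attached to each MEDLINE record. The defaults `k=40` and `cutoff=20.0` mirror the values quoted in the citation statements and carry no significance beyond that.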