2017
DOI: 10.1016/j.jbi.2017.09.014
Bridging the gap: Incorporating a semantic similarity measure for effectively mapping PubMed queries to documents

Abstract: The main approach of traditional information retrieval (IR) is to examine how many words from a query appear in a document. A drawback of this approach, however, is that it may fail to detect relevant documents where no or only few words from a query are found. The semantic analysis methods such as LSA (latent semantic analysis) and LDA (latent Dirichlet allocation) have been proposed to address the issue, but their performance is not superior compared to common IR approaches. Here we present a query-document …

Cited by 48 publications (40 citation statements); references 38 publications.
“…This approach for learning bilingual word embeddings is referred to as Bilingual word Embeddings Skip-Gram (BWESG). Azimi et al (2015), Zhai et al (2016a, b) 8.2 Learn Learn to predict Grbovic et al (2015a, b), Zhang et al (2016a) 9 Similar item retrieval 9.1.1 Related document search Aggregate Explicit Kim et al (2016), Kusner et al (2014) 9.1.2 Learn Learn to autoencode Salakhutdinov and Hinton (2009) Learn to predict Dai et al (2014), Djuric et al (2015), Le and Mikolov (2014) 9.2 Detecting text reuse Aggregate Explicit Zhang et al (2014) 9.3.1 Similar Question Retrieval…”
Section: Aggregate (mentioning)
confidence: 99%
“…Their model outperforms these baselines. Kim et al (2016) propose another version of WMD specific to query-document similarity. The high computational cost of WMD is tackled by mapping queries to documents using a word embedding model trained on a document set.…”
Section: Learn To Predict (mentioning)
confidence: 99%
“…• Query-document Similarity (QD): Responses are generally much longer than Key Points and WM-distance may assign unfairly low similarity scores to responses with extra information. To address this issue, we use metrics designed for information retrieval (Kim et al, 2016). For each word in the Key Point, the algorithm finds the word with the maximum similarity from a response n-gram, where the similarity score is the cosine similarity between two corresponding word embeddings.…”
Section: Models Based On Word-embedding Features (mentioning)
confidence: 99%
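The metric quoted above can be sketched in a few lines: for each query (Key Point) word, find the document (response) word with maximum cosine similarity between word embeddings, then aggregate the per-word maxima. This is a minimal illustration, not the authors' implementation; the toy 2-d embeddings and the averaging step are assumptions standing in for a trained word2vec model and the paper's exact aggregation.

```python
# Minimal sketch of max-cosine query-document similarity (in the spirit of
# Kim et al., 2016): each query word is matched to its most similar document
# word by embedding cosine similarity, and the maxima are averaged.
import math


def cosine(u, v):
    """Cosine similarity between two dense vectors; 0.0 for zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def query_doc_similarity(query_words, doc_words, embeddings):
    """Average, over in-vocabulary query words, of the best cosine match
    against any in-vocabulary document word."""
    scores = []
    for q in query_words:
        if q not in embeddings:
            continue  # skip out-of-vocabulary query words
        best = max(
            (cosine(embeddings[q], embeddings[d])
             for d in doc_words if d in embeddings),
            default=0.0,
        )
        scores.append(best)
    return sum(scores) / len(scores) if scores else 0.0


# Toy 2-d embeddings (hypothetical stand-ins for word2vec vectors)
emb = {
    "cancer": [1.0, 0.1],
    "tumor": [0.9, 0.2],
    "therapy": [0.1, 1.0],
    "treatment": [0.2, 0.9],
}
score = query_doc_similarity(["cancer", "therapy"], ["tumor", "treatment"], emb)
```

Because each query word is scored by its single best match, extra document words cannot drag the score down; this is the property the quoted passage relies on when responses are much longer than Key Points.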
“…Although these methods have been widely applied for measuring the degree of paraphrases between two given texts, just [17] evaluates its relevance for plagiarism detection. More recently, [5,12] discussed the use of semantic information without depending on any external knowledge resource. Particularly, they proposed using distributive representations, such as word2vec [15], in the task of plagiarism detection.…”
Section: Introduction (mentioning)
confidence: 99%