Proceedings of the BioNLP 2018 Workshop 2018
DOI: 10.18653/v1/w18-2313
Phrase2VecGLM: Neural generalized language model–based semantic tagging for complex query reformulation in medical IR

Abstract: In fact-based information retrieval, state-of-the-art performance is traditionally achieved by knowledge graphs driven by knowledge bases, as they can represent facts about and capture relationships between entities very well. However, in domains such as medical information retrieval, where addressing specific information needs of complex queries may require understanding query intent by capturing novel associations between potentially latent concepts, these systems can fall short. In this work, we develop a no…

Cited by 3 publications (7 citation statements)
References 31 publications (50 reference statements)
“…We develop a novel sequence-to-set end-to-end encoder-decoder-based neural framework for multi-label prediction, by training document representations using no external supervision labels, for pseudo-relevance feedback-based unsupervised semantic tagging of a large collection of documents. We find that in this unsupervised task setting of PRF-based semantic tagging for query expansion, a multi-term prediction training objective that jointly optimizes both prediction of the TF-IDF-based document pseudo-labels and the log likelihood of the labels given the document encoding surpasses previous methods such as Phrase2VecGLM (Das et al., 2018) that used neural generalized language models for the same. Our initial hypothesis that bidirectional or self-attentional models could learn the most efficient semantic representations of documents when coupled with a loss more effective than cross-entropy at reducing language model perplexity of document encodings is corroborated in all experimental setups.…”
Section: Discussion
confidence: 81%
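The TF-IDF-based document pseudo-labels mentioned in the quote above can be illustrated with a minimal, stdlib-only sketch: each document is tagged with its top-scoring TF-IDF terms, which then serve as unsupervised "labels". The function name, parameters, and toy scoring below are ours for illustration, not from either paper.

```python
import math
from collections import Counter

def tfidf_pseudo_labels(docs, k=3):
    """Assign each document its top-k TF-IDF terms as pseudo-labels.

    A simplified stand-in for the TF-IDF-based pseudo-labeling step
    the citing work describes (no stemming, stopwording, or phrases).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    labels = []
    for toks in tokenized:
        tf = Counter(toks)
        # Classic tf * idf score per term in this document.
        scores = {t: (c / len(toks)) * math.log(n / df[t]) for t, c in tf.items()}
        labels.append(sorted(scores, key=scores.get, reverse=True)[:k])
    return labels
```

Terms that are frequent in one document but rare across the collection rank highest, so the resulting pseudo-labels act as cheap topical tags without any external supervision.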
“…We ran several sets of experiments with various document encoders, employing pre-trained and fine-tuned embedding schemes like skip-gram (Mikolov et al., 2013a) and Probabilistic FastText (Athiwaratkun et al., 2018); see Appendix B. The experimental setup is the same as that of Phrase2VecGLM (Das et al., 2018), the only other known system for this dataset, which performs "unsupervised semantic tagging of documents by PRF" for downstream query expansion. We therefore take this system as the current state-of-the-art baseline, while our non-attention-based document encoding models constitute our standard baselines.…”
Section: Unsupervised Task Experiments
confidence: 99%
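The "query expansion by PRF" that these citing works benchmark against can be sketched in a few lines: rank documents by overlap with the query, assume the top-ranked ones are relevant (the pseudo-relevance assumption), and append their most frequent non-query terms to the query. This toy version uses raw term overlap and counts; the function and parameter names are ours, and the actual systems use learned document representations instead.

```python
from collections import Counter

def prf_expand(query, docs, top_docs=2, top_terms=2):
    """Pseudo-relevance feedback query expansion (simplified sketch).

    Scores documents by term overlap with the query, then expands the
    query with the most frequent unseen terms from the top documents.
    """
    q = set(query.lower().split())
    tokenized = [d.lower().split() for d in docs]
    # Rank documents by how many query terms they contain.
    ranked = sorted(range(len(docs)),
                    key=lambda i: len(q & set(tokenized[i])),
                    reverse=True)
    # Pool candidate expansion terms from the assumed-relevant docs.
    feedback = Counter()
    for i in ranked[:top_docs]:
        feedback.update(t for t in tokenized[i] if t not in q)
    expansion = [t for t, _ in feedback.most_common(top_terms)]
    return query.split() + expansion
```

In the systems discussed above, the expansion terms come from semantic tags predicted for the feedback documents rather than raw term counts, but the PRF control flow is the same.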
“…In a related medical IR challenge (Roberts et al. 2017), the authors specifically mention that with only six partially annotated queries for system development, it is likely that systems were either under- or over-tuned on these queries. Since the setup of the seq2set framework is an attempt to model the PRF-based query expansion method of its closest related work (Das et al. 2018), where the effort is also to train a neural generalized language model for unsupervised semantic tagging, we choose this system as the benchmark against which to compare our end-to-end approach for the same task.…”
Section: Related Work
confidence: 99%
“…Experiments, Unsupervised Task Setting: We ran several sets of experiments with various document encoders, employing word embedding schemes like skip-gram (Mikolov et al. 2013) and Probabilistic FastText (Athiwaratkun, Wilson, and Anandkumar 2018). The experimental setup is the same as that of Phrase2VecGLM (Das et al. 2018), the only other known system for this dataset, which performs "unsupervised semantic tagging of documents by PRF" for downstream query expansion. We therefore take this system as the current state-of-the-art baseline, while our non-attention-based document encoding models constitute our standard baselines.…”
Section: Task Settings: Semantic Tagging for Query Expansion
confidence: 99%