Proceedings of the 29th ACM International Conference on Information & Knowledge Management (CIKM 2020)
DOI: 10.1145/3340531.3412079

Learning to Re-Rank with Contextualized Stopwords

Abstract: The use of stopwords has been thoroughly studied in traditional Information Retrieval systems, but remains unexplored in the context of neural models. Neural re-ranking models take the full text of both the query and the document into account. Removing tokens that carry no relevance information therefore offers an opportunity to improve effectiveness by reducing noise, and to lower the storage requirements of cached document representations. In this work we propose a novel contextualized stopword detection…
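The abstract describes a learned, context-dependent removal gate over document tokens. A minimal sketch of such a gate, assuming a linear scoring layer followed by a ReLU (all names and parameters here are hypothetical, not taken from the paper's implementation):

```python
import numpy as np

def stopword_gate(token_vecs, w, b):
    """Hypothetical contextualized stopword gate: score each token's
    contextual vector, then apply a ReLU so that low-scoring
    (stopword-like) tokens get an exact zero gate and can be dropped
    from the cached document representation."""
    scores = token_vecs @ w + b           # one scalar score per token
    gates = np.maximum(scores, 0.0)       # ReLU: gate == 0 means "removed"
    return token_vecs * gates[:, None], gates

# toy usage: 3 tokens with 4-dim contextual vectors
vecs = np.array([[1., 0., 0., 0.],
                 [0., 1., 0., 0.],
                 [0., 0., 1., 0.]])
w = np.array([1., -1., 1., 0.])           # hypothetical learned weights
gated, gates = stopword_gate(vecs, w, b=0.0)
kept = int((gates > 0).sum())             # tokens that survive the gate
```

Because the ReLU produces exact zeros rather than small values, removed tokens need not be stored at all, which is what yields the caching-storage savings the abstract mentions.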

Cited by 4 publications (7 citation statements)
References 13 publications (12 reference statements)
“…To further reduce the number of passage tokens to store, we adopt a simplified version of Hofstätter et al. [17]'s contextualized stopwords (CS), which was first introduced for the TK-Sparse model. CS learns a removal gate for tokens based solely on their context-dependent vector representations.…”
Section: Simplified Contextualized Stopwords
confidence: 99%
“…The original implementation [17] masks scores after TK's kernel-activation, meaning the non-zero gates have to be saved as well, which increases the system's complexity. In contrast, we directly apply the gate to the representation vectors.…”
Section: Simplified Contextualized Stopwords
confidence: 99%
“…However, allowing each query embedding the same chance to contribute to the candidate set may be sub-optimal. Indeed, consider a query embedding representing a stopword appearing in the query: retrieving many nearest neighbours to that query embedding is unlikely to retrieve as many relevant documents as a more discriminative query embedding [4,21].…”
Section: Rankings From the Approximate First Stage
confidence: 99%
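The idea in this last statement, skipping stopword-like query embeddings during candidate generation so that only discriminative embeddings issue nearest-neighbour lookups, can be sketched as a simple salience filter (threshold and names hypothetical):

```python
import numpy as np

def prune_query_embeddings(query_vecs, salience, threshold=0.1):
    """Hypothetical sketch: drop query embeddings whose salience score
    falls below a threshold, so stopword-like embeddings do not spend
    nearest-neighbour lookups that rarely surface relevant documents."""
    keep = salience >= threshold          # boolean mask over query tokens
    return query_vecs[keep]

# toy usage: 5 query embeddings, 2 of which are stopword-like
q = np.random.rand(5, 8)
sal = np.array([0.9, 0.05, 0.4, 0.02, 0.7])
pruned = prune_query_embeddings(q, sal)   # 3 embeddings survive
```

Only the surviving embeddings would then be sent to the approximate nearest-neighbour index in the first retrieval stage.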