2020
DOI: 10.48550/arxiv.2008.09093
Preprint

PARADE: Passage Representation Aggregation for Document Reranking

Abstract: We present PARADE, an end-to-end Transformer-based model that considers document-level context for document reranking. PARADE leverages passage-level relevance representations to predict a document relevance score, overcoming the limitations of previous approaches that perform inference on passages independently. Experiments on two ad-hoc retrieval benchmarks demonstrate PARADE's effectiveness over such methods. We conduct extensive analyses on PARADE's efficiency, highlighting several strategies for improving…
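
To make the passage aggregation idea concrete, here is a minimal PyTorch sketch of a PARADE-style reranker, assuming the Hugging Face transformers library: each passage of a document is encoded together with the query, and the passage [CLS] vectors are aggregated by a small Transformer encoder into a single document relevance score. The class name PassageAggregator, the mean-pooling step, and the hyperparameters are illustrative assumptions, not the authors' implementation.

# Minimal sketch of PARADE-style passage representation aggregation (illustrative only).
# Assumes the Hugging Face `transformers` library; PassageAggregator is a hypothetical name.
import torch.nn as nn
from transformers import AutoModel

class PassageAggregator(nn.Module):
    def __init__(self, encoder_name="bert-base-uncased", num_agg_layers=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True)
        self.aggregator = nn.TransformerEncoder(layer, num_layers=num_agg_layers)
        self.score = nn.Linear(hidden, 1)

    def forward(self, input_ids, attention_mask):
        # input_ids / attention_mask: (num_passages, seq_len) for one query-document pair;
        # each row is a "[CLS] query [SEP] passage [SEP]" sequence.
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls_vecs = out.last_hidden_state[:, 0, :]        # (num_passages, hidden) passage representations
        doc = self.aggregator(cls_vecs.unsqueeze(0))     # attention across all passages of the document
        return self.score(doc.mean(dim=1)).squeeze(-1)   # single document relevance score

# Usage (shapes only): scores = model(input_ids, attention_mask), where the tensors come
# from tokenizing every passage of one document together with the query.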

Cited by 25 publications (50 citation statements) | References 25 publications

“…To test the effectiveness of our proposed dense PRF approach, we compare with four families of baseline models, for which we vary the use of a BERT-based reranker (namely BERT or ColBERT). For the BERT reranker, we use OpenNIR [21] and capreolus/bert-base-msmarco fine-tuned model from [19]. For the ColBERT reranker, unless otherwise noted, we use the existing pre-indexed ColBERT representation of documents for efficient reranking.…”
Section: Baselines
confidence: 99%
“…The PARADE model divides a document into a number of segments and encodes each of them with a Transformer. The encoded segments are then fed into another Transformer to obtain the final document-level score [9]. QDS-Transformer encodes long texts with fixed attention patterns that allow local attention among neighboring content tokens and combines them with long-distance attention [5].…”
Section: Related Work (arXiv:2109.04611v1 [cs.IR], 10 Sep 2021)
confidence: 99%
“…The first addresses the problem by dividing the document tokens into segments of similar length and applying the self-attention mechanism only within those segments (local self-attention), then aggregating vectors from the segments to get the final score [1,5,9,10]. A second approach also starts by dividing the document tokens into segments and treating each of them as if it were a document in its own right.…”
Section: Introduction
confidence: 99%
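
As a rough sketch of the local self-attention pattern described in the excerpt above, the snippet below builds a block-diagonal attention mask so that each token can only attend to tokens in its own fixed-length segment; the function name and the segment length of 4 are assumptions made purely for illustration.

# Illustrative block-diagonal (local) self-attention mask: tokens attend only within
# their own fixed-length segment. Names and sizes are hypothetical.
import torch

def local_attention_mask(seq_len: int, segment_len: int) -> torch.Tensor:
    """Boolean (seq_len, seq_len) mask; True marks attention that is allowed."""
    segment_ids = torch.arange(seq_len) // segment_len  # segment index of each token
    return segment_ids.unsqueeze(0) == segment_ids.unsqueeze(1)

mask = local_attention_mask(seq_len=8, segment_len=4)
# mask[i, j] is True only when tokens i and j fall in the same 4-token segment,
# i.e. attention is confined to blocks along the diagonal.
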
“…We use the keyword version of queries, corresponding to the title fields of TREC topics [14,31]. We experimented with vanilla BERT [15] as the neural ranking model, as it is the core of recent state-of-the-art IR methods [14,29,26]. To the best of our knowledge, most text-based IR neural models are trained with a pointwise or pairwise loss [26,29].…”
Section: Experiments on Text-based IR (RQ4)
confidence: 99%
“…We experimented with vanilla BERT [15] as the neural ranking model, as it is the core of recent state-of-the-art IR methods [14,29,26]. To the best of our knowledge, most text-based IR neural models are trained with a pointwise or pairwise loss [26,29]. A challenge in this experiment was then to use a listwise loss on a BERT model.…”
Section: Experiments on Text-based IR (RQ4)
confidence: 99%
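
To illustrate the distinction drawn in these excerpts between the usual pointwise/pairwise objectives and a listwise one, the sketch below contrasts a pairwise hinge loss with a listwise softmax cross-entropy over a candidate list; the tensor shapes and function names are illustrative assumptions rather than any paper's actual training code.

# Illustrative pairwise vs. listwise ranking losses over model relevance scores
# (shapes and names are assumptions, not tied to a particular implementation).
import torch.nn.functional as F

def pairwise_hinge_loss(pos_scores, neg_scores, margin=1.0):
    # pos_scores / neg_scores: (batch,) scores for a relevant and a non-relevant
    # document of the same query; push the two apart by at least `margin`.
    return F.relu(margin - pos_scores + neg_scores).mean()

def listwise_softmax_loss(scores, rel_index):
    # scores: (batch, list_size) scores over a list of candidate documents per query;
    # rel_index: (batch,) position of the relevant document in each list.
    # Softmax cross-entropy over the whole list is a simple listwise objective.
    return F.cross_entropy(scores, rel_index)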