2020
DOI: 10.1002/asi.24431
Term position‐based language model for information retrieval

Abstract: The term position feature is widely and successfully used in IR and Web search engines to enhance retrieval effectiveness. This feature is essentially used for two purposes: to capture query term proximity, or to boost the weight of terms appearing in some parts of a document. In this paper, we are interested in this second category. We propose two novel query‐independent techniques based on absolute term positions in a document, whose goal is to boost the weight of terms appearing at the beginning of a docum…
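The abstract describes boosting the weight of terms that occur near the beginning of a document. As a rough illustration only (the function name, the exponential decay form, and the `alpha` parameter are assumptions for this sketch, not the paper's actual model), position-dependent weighting can look like this:

```python
import math

def position_boosted_weight(positions, doc_len, alpha=1.0):
    """Hypothetical sketch: each occurrence of a term contributes a weight
    that decays with its absolute position, so occurrences near the start
    of the document count more. `alpha` controls the decay speed and is an
    assumed parameter, not taken from the paper."""
    # an occurrence at position p (0-based) contributes exp(-alpha * p / doc_len)
    return sum(math.exp(-alpha * p / doc_len) for p in positions)

# the same term frequency (2), but at different absolute positions
# in a 100-term document:
w_early = position_boosted_weight([0, 10], 100)   # near the beginning
w_late = position_boosted_weight([80, 90], 100)   # near the end
assert w_early > w_late  # earlier occurrences receive a higher weight
```

Under any monotonically decreasing position function, two documents with identical term frequencies can thus receive different scores depending on where the query terms appear.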

Cited by 7 publications (5 citation statements)
References 31 publications (63 reference statements)
“…Therefore, it is necessary to evaluate the effect of going beyond such assumptions by experiments. Hence, possible topics of future study include combining PBR models with term association methods, such as cross terms (Zhao et al, 2014) or MRF techniques (Metzler & Croft, 2005) that incorporate various degrees of term dependencies, and the use of term position features (Hammache & Boughanem, 2021). Furthermore, we have ignored query-independent features (by Assumption 3).…”
Section: Discussion
confidence: 99%
“…Apart from these models, some approaches that go beyond bag-of-words have been shown to be effective. These include the cross term approach that models term association (Zhao et al, 2014), term proximity matching methods such as Markov random field (MRF) (Metzler & Croft, 2005), and also the use of term position features (Hammache & Boughanem, 2021). As the objective of this paper is to demonstrate the effectiveness of the new bag-of-words PBR models in a pilot study, we restrict to bag-of-words baselines for a fair comparison.…”
Section: Other Methods
confidence: 99%
“…As the documents in the MLIA corpus are long, we only consider the first N sentences for inference, where N denotes the average number of sentences in documents of the corpus. Moreover, previous research [ 29 , 30 ] shows that any relevant document is likely to contain relevant sentences at the beginning of the document. The document-level relevance score is determined by aggregating the top k scoring sentences in the document, where BiScore(d) is the document-level relevance score for document d using the bi-encoder model.…”
Section: Multistage Bicross Encoder
confidence: 99%
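The aggregation described in the quotation above (truncate to the first N sentences, then combine the top-k sentence scores) can be sketched as follows. The function name and the use of a simple sum over the top-k scores are assumptions for illustration, not the cited authors' exact formulation:

```python
def doc_score_topk(sentence_scores, k=3, n=None):
    """Illustrative sketch (names are hypothetical, not the authors' API):
    keep only the first n sentence-level relevance scores, then aggregate
    the k highest of them into a document-level score. Summing the top-k
    is one plausible aggregation; the original paper may use another."""
    if n is not None:
        # only the first n sentences of the document are considered,
        # since relevant content tends to appear early
        sentence_scores = sentence_scores[:n]
    # aggregate the top-k scoring sentences
    return sum(sorted(sentence_scores, reverse=True)[:k])

# example: six sentence scores from a (hypothetical) bi-encoder
scores = [0.9, 0.1, 0.8, 0.3, 0.7, 0.05]
doc_score_topk(scores, k=2)  # sums the two highest sentence scores
```

Restricting to the first N sentences keeps inference cost bounded on long documents, while the top-k aggregation makes the document score robust to many low-scoring sentences.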
“…For example, the challenge of vocabulary mismatch, and hence the importance of semantic matching, may be amplified when retrieving shorter text [95][96][97]. Similarly, when matching the query against longer text, it is informative to consider the positions of the matches [98][99][100], but may be less so in the case of short text matching. When specifically dealing with long text, the compute and memory requirements may be significantly higher for machine learned systems (e.g., [101]) and require careful design choices for mitigation.…”
Section: Robustness To Variable Length Text
confidence: 99%