A Discriminative Semantic Ranker for Question Retrieval

Cai, Yinqiong; Fan, Yixing; Guo, Jiafeng; Zhang, Ruqing; Lan, Yanyan; Cheng, Xueqi

doi:10.1145/3471158.3472227

Cited by 4 publications

(6 citation statements)

References 46 publications

“…Following this, the paper [115] propose a new method for training dense retrieval models by using pseudo-relevance feedback and multiple representations, which allows the model to learn more robust representations of queries. To further improve performance, [116] proposed a new method for training a discriminative semantic ranker for question retrieval. This approach focuses on training the model to differentiate between relevant and non-relevant documents, which is important for accurate retrieval.…”

Section: Dense Retrieval Methods (mentioning)

confidence: 99%

Information Retrieval: Recent Advances and Beyond

Hambarde; Proença

2023

Preprint

In this paper, we provide a detailed overview of the models used for information retrieval in the first and second stages of the typical processing chain. We discuss the current state-of-the-art models, including term-based, semantic retrieval, and neural methods. We also examine key topics related to the learning process of these models. The survey thus offers a comprehensive understanding of the field and is of interest to researchers and practitioners entering or working in the information retrieval domain.

“…As researchers strive to improve information retrieval, they have developed methods to train models that can effectively distinguish between relevant and non-relevant documents in dense retrieval settings. Cai et al [109] suggested a method for training a discriminative semantic ranker for question retrieval, focusing on this crucial aspect of accurate retrieval. To further refine the understanding of the relationship between passages and queries, Wu et al [110] proposed a representation decoupling method that improves open-domain passage retrieval by separating the encoding of passages and queries.…”

Section: Discriminative Semantic Ranking Approaches in Dense Retrieval (mentioning)

confidence: 99%

Information Retrieval: Recent Advances and Beyond

Hambarde; Proença

2023

IEEE Access

This paper provides an extensive and thorough overview of the models and techniques utilized in the first and second stages of the typical information retrieval processing chain. Our discussion encompasses the current state-of-the-art models, covering a wide range of methods and approaches in the field of information retrieval. We delve into the historical development of these models, analyze the key advancements and breakthroughs, and address the challenges and limitations faced by researchers and practitioners in the domain. By offering a comprehensive understanding of the field, this survey is a valuable resource for researchers, practitioners, and newcomers to the information retrieval domain, fostering knowledge growth, innovation, and the development of novel ideas and techniques.

Index terms: First-stage retrieval, information retrieval, second-stage retrieval.

“…There are multiple architectures for BERT-based ranking models, including cross-encoder [12], dual encoders [2] and ColBERT [8]. Comparing with other architectures, cross-encoder can achieve the best performance, while its inference latency is the highest comparing with the other ones.…”

Section: Related Work (mentioning)

confidence: 99%
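The architectures named in the statement above differ mainly in where the query and document interact. The following is a minimal sketch, assuming toy hand-written vectors in place of BERT embeddings, of how a dual encoder and a ColBERT-style late-interaction model turn embeddings into a relevance score. The function names are hypothetical and do not come from any of the cited papers. A cross-encoder is not shown because it scores the concatenated query and document in a single model forward pass, which is also why its latency is highest.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def dual_encoder_score(query_vec: Vector, doc_vec: Vector) -> float:
    """Dual encoder: one vector per query and one per document.

    The document vector can be precomputed offline, so scoring at query
    time is a single inner product.
    """
    return dot(query_vec, doc_vec)


def late_interaction_score(
    query_tokens: List[Vector], doc_tokens: List[Vector]
) -> float:
    """ColBERT-style late interaction (MaxSim).

    Each query token embedding is matched to its most similar document
    token embedding, and the per-token maxima are summed.
    """
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy embeddings (assumed values, for illustration only).
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_tokens = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]

print(dual_encoder_score([1.0, 2.0], [3.0, 4.0]))
print(late_interaction_score(query_tokens, doc_tokens))
```

In both functions the document side can be encoded ahead of time, which is the property that makes these architectures common choices as students in cross-architecture distillation.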

“…Most of existing approaches [7,[15][16][17]22] for language model distillation aim at improving the performance of student model in general natural language understanding (NLU) tasks [21]. Specially for document retrieval and ranking tasks, some latest approaches are focusing on cross-architecture distillation approaches [5,10] using cross-encoder [12] teachers and ColBERT [8] or dual-encoder [2] based students. For those cross-architecture approaches, both teachers and students are using pre-trained models in the same scale (i.e.…”

mentioning

confidence: 99%

An Empirical Study of Uniform-Architecture Knowledge Distillation in Document Ranking

Qin; Liu; Zheng; et al.

2023

Preprint

Although BERT-based ranking models are commonly used in commercial search engines, they are usually too slow for online ranking tasks. Knowledge distillation, which aims to learn a smaller model with performance comparable to a larger one, is a common strategy for reducing online inference latency. In this paper, we investigate the effect of different loss functions on uniform-architecture distillation of BERT-based ranking models. Here "uniform-architecture" denotes that both the teacher and the student models use the cross-encoder architecture, while the student models are small-scale pre-trained language models. Our experimental results reveal that the optimal distillation configuration for ranking tasks differs considerably from that for general natural language processing tasks. Specifically, when the student models use the cross-encoder architecture, a pairwise loss on hard labels is critical for training them, whereas distillation objectives on intermediate Transformer layers may hurt performance. These findings emphasize the necessity of carefully designing a distillation strategy (for cross-encoder student models) tailored to document ranking with pairwise training samples.

Introduction

Recent years have witnessed great progress in applying deep learning methods to information retrieval tasks [19]. In particular, on document ranking, pre-trained language models (PLMs), such as BERT [4], have achieved state-of-the-art performance. However, because these pre-trained models often have a large number of parameters, they incur an inevitable computational cost and latency during the inference stage [6]. This problem becomes even more severe when deploying pre-trained models in latency-sensitive online ranking tasks. To tackle this problem, numerous PLM-based knowledge distillation (KD) methods [14] have been widely studied. The principle of knowledge distillation can be summarized as…
