This work proposes the use of a pretrained sequence-to-sequence model for document ranking. Our approach is fundamentally different from a commonly adopted classificationbased formulation based on encoder-only pretrained transformer architectures such as BERT. We show how a sequence-to-sequence model can be trained to generate relevance labels as "target tokens", and how the underlying logits of these target tokens can be interpreted as relevance probabilities for ranking. Experimental results on the MS MARCO passage ranking task show that our ranking approach is superior to strong encoderonly models. On three other document retrieval test collections, we demonstrate a zeroshot transfer-based approach that outperforms previous state-of-the-art models requiring indomain cross-validation. Furthermore, we find that our approach significantly outperforms an encoder-only architecture in a data-poor setting. We investigate this observation in more detail by varying target tokens to probe the model's use of latent knowledge. Surprisingly, we find that the choice of target tokens impacts effectiveness, even for words that are closely related semantically. This finding sheds some light on why our sequence-to-sequence formulation for document ranking is effective. Code and models are available at pygaggle.ai.
Single-image dehazing is an important low-level vision task with many applications. Early studies have investigated different kinds of visual priors to address this problem. However, they may fail when their assumptions are not valid on specific images. Recent deep networks also achieve a relatively good performance in this task. But unfortunately, due to the disappreciation of rich physical rules in hazes, a large amount of data are required for their training. More importantly, they may still fail when there exist completely different haze distributions in testing images. By considering the collaborations of these two perspectives, this paper designs a novel residual architecture to aggregate both prior (i.e., domain knowledge) and data (i.e., haze distribution) information to propagate transmissions for scene radiance estimation. We further present a variational energy-based perspective to investigate the intrinsic propagation behavior of our aggregated deep model. In this way, we actually bridge the gap between prior-driven models and data-driven networks and leverage advantages but avoid limitations of previous dehazing approaches. A lightweight learning framework is proposed to train our propagation network. Finally, by introducing a task-aware image separation formulation with a flexible optimization scheme, we extend the proposed model for more challenging vision tasks, such as underwater image enhancement and single-image rain removal. Experiments on both synthetic and real-world images demonstrate the effectiveness and efficiency of the proposed framework.
A k-extended Langford sequence of defect d and length m is a sequence s 1 , s 2 , ..., s 2m+1 in which s k ==, where = is the null symbol, and each other member of the sequence comes from the set [d, d+1, ..., d+m&1]. Each j # [d, d+1, ..., d+m&1] occurs exactly twice in the sequence, and the two occurrences are separated by exactly j&1 symbols. In this paper we prove that when d=2, 3, the necessary conditions for the existence of such a sequence are sufficient. Academic Press(1) m 2d&1 and(2) m#0, 1 (mod 4) for d odd and m#0, 3 (mod 4) for d even.
This work proposes a novel adaptation of a pretrained sequence-to-sequence model to the task of document ranking. Our approach is fundamentally different from a commonly-adopted classification-based formulation of ranking, based on encoder-only pretrained transformer architectures such as BERT. We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words", and how the underlying logits of these target words can be interpreted as relevance probabilities for ranking. On the popular MS MARCO passage ranking task, experimental results show that our approach is at least on par with previous classification-based models and can surpass them with larger, more-recent models. On the test collection from the TREC 2004 Robust Track, we demonstrate a zero-shot transfer-based approach that outperforms previous state-of-the-art models requiring in-dataset cross-validation. Furthermore, we find that our approach significantly outperforms an encoder-only model in a data-poor regime (i.e., with few training examples). We investigate this observation further by varying target words to probe the model's use of latent knowledge.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.