Proceedings of the 13th International Conference on Web Search and Data Mining 2020
DOI: 10.1145/3336191.3371768
Extreme Regression for Dynamic Search Advertising

Abstract: This paper introduces a new learning paradigm called eXtreme Regression (XR) whose objective is to accurately predict the numerical degrees of relevance of an extremely large number of labels to a data point. XR can provide elegant solutions to many large-scale ranking and recommendation applications including Dynamic Search Advertising (DSA). XR can learn more accurate models than the recently popular extreme classifiers which incorrectly assume strictly binary-valued label relevances. Traditional regression …
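The abstract's core contrast is that extreme classification predicts only binary label membership, while extreme regression scores every label with a real-valued relevance. The toy sketch below illustrates that idea only; it is not the paper's XReg algorithm, and all names (`W`, `predict_relevance`, the toy sizes) are illustrative assumptions. A naive dense one-regressor-per-label model is used for clarity, where a real extreme-scale system would use label trees or sparse models.

```python
# Toy illustration (NOT the paper's XReg implementation): extreme regression
# assigns a real-valued relevance to every label, rather than a binary
# relevant/irrelevant decision as in extreme classification.
import numpy as np

rng = np.random.default_rng(0)

n_features, n_labels = 20, 1000  # toy sizes; real DSA label spaces hold millions of queries
W = rng.normal(size=(n_labels, n_features)) * 0.1  # one linear regressor per label (assumed)

def predict_relevance(x, top_k=5):
    """Score every label for input x and return the top-k (label, relevance) pairs."""
    scores = W @ x                         # dense scoring for clarity only;
    top = np.argsort(scores)[::-1][:top_k] # extreme-scale systems prune via label trees
    return [(int(label), float(scores[label])) for label in top]

x = rng.normal(size=n_features)
pairs = predict_relevance(x)
```

At DSA scale this dense loop is infeasible; the point of the sketch is only the output type: a ranked list of labels with graded relevance scores rather than a binary label set.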


Cited by 17 publications (15 citation statements)
References 38 publications
“…In meta-learning domains, the characteristics of each problem instance are considered, and the output is an ordered list of algorithms according to their suitability to the given problem [6], [7]. Lastly, in text classification, a label ranking algorithm can be employed to output a ranked list of topics, tags or advertisements for a document or web page (the instance) [8], [9]. Due to this wide applicability, label ranking has recently attracted a lot of attention from the machine learning community [10]–[20].…”
Section: Introduction
Citation type: mentioning, confidence: 99%
“…(1) State-of-the-art extreme classifiers such as AttentionXML [66], Astec [11], DiSMEC [2], Parabel [45] and Bonsai [26]; (2) extreme classifiers which improve performance on few-shot labels, such as DECAF [40], XReg [46] and PFastreXML [20]; (3) dense retrieval methods based on state-of-the-art natural language modelling architectures, such as the Sentence-BERT bi-encoder [48], Fasttext [24] and WarpLDA (topic model) [10] — these algorithms provide strong, scalable baselines against which to compare ZestXML's performance on zero-shot and few-shot labels; (4) leading zero-shot multi-label learners such as 0-BIGRU-LWAN, 0-CNN-LWAN [50] and CoNSE [43] — these baselines do not scale to extreme datasets, hence ZestXML's comparison against them is reported only for EURLex-4.3K in Table ??. The implementations of all the aforementioned algorithms were provided by their authors.…”
Section: Experiments, 5.1 Experiment Settings
Citation type: mentioning, confidence: 99%
“…Additionally, they tend to perform poorly on few-shot labels due to classifier over-fitting issues (see Section 5). Recently, several approaches have been proposed which aim to model the few-shot labels more accurately [40, 46]; these, however, do not address the zero-shot prediction problem.…”
Section: Introduction
Citation type: mentioning, confidence: 99%
“…For instance, the deep learning XMC model X-Transformer [7, 52] achieves state-of-the-art performance on public academic benchmarks [3]. Partition-based methods such as Parabel [33] and XReg [34], as another example, find successful application in dynamic search advertising in Bing. In particular, tree-based partitioning XMC models are a staple of modern search engines and recommender systems because their inference time is sub-linear (i.e., logarithmic) in the enormous output space (e.g., 100 million or more labels).…”
Section: Introduction
Citation type: mentioning, confidence: 99%
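The last excerpt attributes the scalability of partition-based XMC models to label-tree inference that is logarithmic in the label space. The sketch below is a hedged, generic illustration of that mechanism (greedy beam search down a balanced label tree), not the actual Parabel or XReg algorithm; the random per-node score stands in for a learned node classifier, and all names (`Node`, `beam_search`, the fan-out and beam width) are illustrative assumptions.

```python
# Generic sketch of sub-linear label-tree inference, in the spirit of
# partition-based extreme classifiers (Parabel/XReg); details differ from
# the real algorithms. Labels sit at the leaves; beam search scores only a
# small frontier of nodes per level instead of all labels.
import heapq
import math
import random

random.seed(0)

class Node:
    def __init__(self, labels, fanout=2, max_leaf=4):
        self.labels = labels
        self.children = []
        if len(labels) > max_leaf:
            chunk = math.ceil(len(labels) / fanout)
            for i in range(0, len(labels), chunk):
                self.children.append(Node(labels[i:i + chunk], fanout, max_leaf))
        # Toy node score: a random number standing in for a learned classifier.
        self.score = random.random()

def beam_search(root, beam_width=2):
    """Descend the tree, keeping only the top `beam_width` nodes per level."""
    frontier, visited = [root], 0
    while any(n.children for n in frontier):
        # Expand internal nodes; carry leaves along so they stay eligible.
        candidates = [c for n in frontier for c in n.children]
        candidates += [n for n in frontier if not n.children]
        visited += len(candidates)
        frontier = heapq.nlargest(beam_width, candidates, key=lambda n: n.score)
    labels = [label for n in frontier for label in n.labels]
    return labels, visited

tree = Node(list(range(1024)))
labels, visited = beam_search(tree)
# With beam width b, fan-out f and depth d, roughly b*f*d nodes are scored,
# far fewer than the 1024 labels in this toy tree.
```

The logarithmic cost the excerpt mentions comes from the tree depth: doubling the label count adds only one level, so inference over 100 million labels remains tractable.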