For better user experience and business effectiveness, Click-Through Rate (CTR) prediction has been one of the most important tasks in E-commerce. Although extensive CTR prediction models have been proposed, learning good representation of items from multimodal features is still less investigated, considering an item in E-commerce usually contains multiple heterogeneous modalities. Previous works either concatenate the multiple modality features, that is equivalent to giving a fixed importance weight to each modality; or learn dynamic weights of different modalities for different items through technique like attention mechanism. However, a problem is that there usually exists common redundant information across multiple modalities. The dynamic weights of different modalities computed by using the redundant information may not correctly reflect the different importance of each modality. To address this, we explore the complementarity and redundancy of modalities by considering modality-specific and modality-invariant features differently. We propose a novel Multimodal Adversarial Representation Network (MARN) for the CTR prediction task. A multimodal attention network first calculates the weights of multiple modalities for each item according to its modality-specific features. Then a multimodal adversarial network learns modalityinvariant representations where a double-discriminators strategy is introduced. Finally, we achieve the multimodal item representations by combining both modality-specific and modality-invariant representations. We conduct extensive experiments on both public and industrial datasets, and the proposed method consistently achieves remarkable improvements to the state-of-the-art methods. Moreover, the approach has been deployed in an operational E-commerce system and online A/B testing further demonstrates the effectiveness.
In real-world search, recommendation, and advertising systems, the multi-stage ranking architecture is commonly adopted. Such architecture usually consists of matching, pre-ranking, ranking, and re-ranking stages. In the pre-ranking stage, vector-product based models with representation-focused architecture are commonly adopted to account for system efficiency. However, it brings a significant loss to the effectiveness of the system. In this paper, a novel pre-ranking approach is proposed which supports complicated models with interaction-focused architecture. It achieves a better tradeoff between effectiveness and efficiency by utilizing the proposed learnable Feature Selection method based on feature Complexity and variational Dropout (FSCD). Evaluations in a realworld e-commerce sponsored search system for a search engine demonstrate that utilizing the proposed pre-ranking, the effectiveness of the system is significantly improved. Moreover, compared to the systems with conventional pre-ranking models, an identical amount of computational resource is consumed. CCS CONCEPTS• Information systems → Learning to rank.
Cross features play an important role in click-through rate (CTR) prediction. Most of the existing methods adopt a DNN-based model to capture the cross features in an implicit manner. These implicit methods may lead to a sub-optimized performance due to the limitation in explicit semantic modeling. Although traditional statistical explicit semantic cross features can address the problem in these implicit methods, such features still suffer from some challenges, including lack of generalization and expensive memory cost. Few works focus on tackling these challenges. In this paper, we take the first step in learning the explicit semantic cross features and propose Pre-trained Cross Feature learning Graph Neural Networks (PCF-GNN), a GNN based pre-trained model aiming at generating cross features in an explicit fashion. Extensive experiments are conducted on both public and industrial datasets, where PCF-GNN shows competence in both performance and memory-efficiency in various tasks. CCS CONCEPTS• Information systems → Recommender systems.
Advertisers play an essential role in many e-commerce platforms like Taobao and Amazon. Fulfilling their marketing needs and supporting their business growth is critical to the long-term prosperity of platform economies. However, compared with extensive studies on user modeling such as click-through rate predictions, much less attention has been drawn to advertisers, especially in terms of understanding their diverse demands and performance. Different from user modeling, advertiser modeling generally involves many kinds of tasks (e.g. predictions of advertisers' expenditure, active-rate, or total impressions of promoted products). In addition, major e-commerce platforms often provide multiple marketing scenarios (e.g. Sponsored Search, Display Ads, Live Streaming Ads) while advertisers' behavior tend to be dispersed among many of them. This raises the necessity of multi-task and multi-scenario consideration in comprehensive advertiser modeling, which faces the following challenges: First, one model per scenario or per task simply doesn't scale; Second, it is particularly hard to model new or minor scenarios with limited data samples; Third, inter-scenario correlations are complicated, and may vary given different tasks.To tackle these challenges, we propose a multi-scenario multitask meta learning approach (M2M) which simultaneously predicts multiple tasks in multiple advertising scenarios. Specifically, we introduce a novel meta unit that incorporates rich scenario knowledge to learn explicit inter-scenario correlations and can easily scale to new scenarios. Furthermore, we present a meta attention module to capture diverse inter-scenario correlations given different tasks, and a meta tower module to enhance the capability of capturing the representation of scenario-specific features. Compelling results from both offline evaluation and online A/B tests demonstrate the superiority of M2M over state-of-the-art methods.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.