Click-through rate prediction is an essential task in industrial applications such as online advertising. Recently, deep learning based models have been proposed that follow a similar Embedding&MLP paradigm. In these methods, large-scale sparse input features are first mapped into low-dimensional embedding vectors, then transformed into fixed-length vectors in a group-wise manner, and finally concatenated and fed into a multilayer perceptron (MLP) to learn the nonlinear relations among features. In this way, user features are compressed into a fixed-length representation vector regardless of what the candidate ads are. The fixed-length vector becomes a bottleneck, making it difficult for Embedding&MLP methods to capture a user's diverse interests effectively from rich historical behaviors. In this paper, we propose a novel model, Deep Interest Network (DIN), which tackles this challenge by designing a local activation unit to adaptively learn the representation of user interests from historical behaviors with respect to a certain ad. This representation vector varies across ads, greatly improving the expressive ability of the model. In addition, we develop two techniques, mini-batch aware regularization and a data adaptive activation function, which help in training industrial deep networks with hundreds of millions of parameters. Experiments on two public datasets as well as an Alibaba real production dataset with over 2 billion samples demonstrate the effectiveness of the proposed approaches, which achieve superior performance compared with state-of-the-art methods. DIN has now been successfully deployed in the online display advertising system at Alibaba, serving the main traffic.
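The local activation idea described above can be illustrated with a minimal sketch. The abstract does not give the unit's exact form, so everything here is an assumption for illustration: the function names, the tiny one-hidden-layer scorer, the element-wise product used as an interaction feature, and the hand-set (untrained) weights are all hypothetical. The sketch only shows the key property that the pooled user vector has a fixed length yet changes with the candidate ad.

```python
def local_activation_weight(behavior, ad, w_hidden, w_out):
    """Score one behavior embedding against the candidate ad.

    Hypothetical scorer: a one-hidden-layer ReLU network over the
    concatenation [behavior, ad, behavior * ad].
    """
    x = behavior + ad + [b * a for b, a in zip(behavior, ad)]
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))


def user_interest_vector(behaviors, ad, w_hidden, w_out):
    """Weighted sum pooling of behavior embeddings, weighted per candidate ad."""
    pooled = [0.0] * len(ad)
    for behavior in behaviors:
        weight = local_activation_weight(behavior, ad, w_hidden, w_out)
        for i, value in enumerate(behavior):
            pooled[i] += weight * value
    return pooled


# Illustrative, hand-set weights (not trained): the single hidden unit
# reads only the interaction features, so the weight is relu(behavior . ad).
W_HIDDEN = [[0.0, 0.0, 0.0, 0.0, 1.0, 1.0]]
W_OUT = [1.0]

# Two made-up behavior embeddings and two made-up candidate ads.
BEHAVIORS = [[1.0, 0.0], [0.0, 1.0]]
AD_A = [1.0, 0.0]
AD_B = [0.0, 1.0]

print(user_interest_vector(BEHAVIORS, AD_A, W_HIDDEN, W_OUT))
print(user_interest_vector(BEHAVIORS, AD_B, W_HIDDEN, W_OUT))
```

With the same behavior history, the two candidate ads yield different user vectors, in contrast to ad-independent sum or average pooling, which would return one vector for both.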
Circular RNAs (circRNAs) are an abundant class of endogenous non-coding RNAs and are associated with numerous diseases, including cancer, cardiovascular diseases, and type 2 diabetes mellitus (T2DM). However, the association between circRNAs and inflammation or inflammatory cytokines in patients with T2DM remains to be fully elucidated. The purpose of the present study was to investigate the expression profiles of circRNAs in peripheral leucocytes of patients with T2DM and their association with inflammatory cytokines. Peripheral blood from patients with T2DM (n=43) and healthy individuals (n=45) was collected for RNA sequencing and later verification. Reverse transcription-polymerase chain reaction (RT-PCR) and reverse transcription-quantitative polymerase chain reaction (RT-qPCR) analyses were used to detect the expression levels of circRNAs. The expression of inflammatory factors, including interleukin (IL)-1, IL-6, and tumor necrosis factor (TNF)-α, was measured via enzyme-linked immunosorbent assay. Furthermore, the mRNA expression level of ankyrin repeat domain 36 (ANKRD36), a protein located at 2q11.2 that interacts with the GAPDH gene, was measured using RT-qPCR analysis. The circRNA/microRNA (miRNA) interaction was predicted using RegRNA and mirPath software. In total, 220 circRNAs were found to be differentially expressed between patients with T2DM and healthy individuals, of which 107 were upregulated and 113 were downregulated. Among the nine selected circRNAs, circANKRD36 was significantly upregulated in patients with T2DM compared with control subjects (P=0.02). The expression level of circANKRD36 was positively correlated with glucose and glycosylated hemoglobin (r=0.3250, P=0.0047 and r=0.3171, P=0.0056, respectively). The expression level of IL-6 was significantly different between the T2DM group and control group (P=0.028) and was positively correlated with circANKRD36. The difference in circANKRD36 host gene expression between patients with T2DM and healthy controls was significant (P=0.04). Taken together, circANKRD36 may be involved in T2DM and inflammation-associated pathways via interaction with miRNAs, including hsa-miR-3614-3p, hsa-miR-498, and hsa-miR-501-5p. The expression of circANKRD36 was upregulated in peripheral blood leucocytes and was correlated with chronic inflammation in T2DM. Therefore, circANKRD36 can be used as a potential biomarker for screening chronic inflammation in patients with T2DM.
Taobao, as the largest online retail platform in the world, provides billions of online display advertising impressions for millions of advertisers every day. For commercial purposes, the advertisers bid for specific spots and target crowds to compete for business traffic. The platform chooses the most suitable ads to display within tens of milliseconds. Common pricing methods include cost per mille (CPM) and cost per click (CPC). Traditional advertising systems target certain traits of users and ad placements with fixed bids, which amounts to a coarse-grained matching of bid and traffic quality. However, fixed bids set by advertisers competing for requests of differing quality cannot fully optimize the advertisers' key requirements. Moreover, the platform is responsible for both business revenue and user experience. Thus, we propose a bid optimizing strategy called optimized cost per click (OCPC), which automatically adjusts the bid to achieve finer matching of bid and traffic quality at page view (PV) request granularity. Our approach jointly optimizes advertisers' demands, platform business revenue, and user experience, and as a whole improves traffic allocation efficiency. We have validated our approach in the Taobao display advertising system in production. The online A/B test shows that our algorithm yields substantially better results than the previous fixed-bid approach.
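To make the contrast with a fixed bid concrete, the following is a minimal sketch of per-request bid adjustment. The abstract does not state the adjustment rule, so the rule here is an assumption chosen for illustration: the bid is scaled by the ratio of a predicted conversion rate for the current request to a baseline rate, and clipped to a band around the advertiser's original bid. The function name `adjust_bid`, its parameters, and the default band of 40% are hypothetical and are not taken from the OCPC system.

```python
def adjust_bid(base_bid, predicted_cvr, baseline_cvr, max_ratio=0.4):
    """Scale a fixed bid by relative traffic quality, within a bounded band.

    Assumed rule (illustrative only): bid = base_bid * predicted / baseline,
    clipped to [base_bid * (1 - max_ratio), base_bid * (1 + max_ratio)].
    """
    if baseline_cvr <= 0:
        raise ValueError("baseline_cvr must be positive")
    lower = base_bid * (1.0 - max_ratio)
    upper = base_bid * (1.0 + max_ratio)
    scaled = base_bid * (predicted_cvr / baseline_cvr)
    return min(upper, max(lower, scaled))


# Made-up requests of differing quality competing under the same base bid.
for cvr in (0.001, 0.010, 0.020):
    print(cvr, adjust_bid(base_bid=1.0, predicted_cvr=cvr, baseline_cvr=0.010))
```

Under this assumed rule, a request predicted to convert better than average receives a higher bid and a poorer one a lower bid, while the clipping band keeps the adjusted bid close to what the advertiser originally set.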
Recent progress on automatic generation of image captions has shown that it is possible to describe the most salient information conveyed by images with accurate and meaningful sentences. In this paper, we propose an image captioning system that exploits the parallel structures between images and sentences. In our model, the process of generating the next word, given the previously generated ones, is aligned with the visual perception experience in which attention shifts among the visual regions; such transitions impose a thread of ordering in visual perception. This alignment characterizes the flow of latent meaning, which encodes what is semantically shared by both the visual scene and the text description. Our system also makes another novel modeling contribution by introducing scene-specific contexts that capture higher-level semantic information encoded in an image. The contexts adapt language models for word generation to specific scene types. We benchmark our system and compare it with published results on several popular datasets, using both automatic evaluation metrics and human evaluation. We show that either region-based attention or scene-specific contexts improves over systems without those components. Furthermore, combining these two modeling ingredients attains state-of-the-art performance.