Besides ID-type features, advertisement click logs contain many other significant features, which makes click-through rate prediction more difficult. In this study, we convert the original features into meaningful numerical ones, which reduces sparsity and redundancy. To address class imbalance, we propose a downsampling algorithm that uses a K-means model to cluster the large sample set and then applies heuristic methods to divide it into sensible, rational features. To further improve the feature representation, we select and combine features with a Gradient Boosting Decision Tree (GBDT) model and process the resulting high-dimensional features with logistic regression. Experiments on the Tencent SOSO dataset show that our approach outperforms state-of-the-art baseline methods by 0.05% on average in terms of R² and by 50.5% on average in terms of RMSE.
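The abstract does not spell out the downsampling procedure, so the following is only a minimal sketch of one common way to downsample with K-means. It assumes the majority-class samples are clustered and each cluster is then sampled in proportion to its size. The helper names (`kmeans`, `kmeans_downsample`), the toy 2-D data, and the choice of `k=3` are illustrative assumptions, not the authors' implementation.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain K-means on tuples of floats; returns a list of k clusters."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest center (squared Euclidean distance)
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return clusters

def kmeans_downsample(majority, n_keep, k=3, seed=0):
    """Keep roughly n_keep majority samples, drawn proportionally per cluster.

    Assumption: proportional per-cluster sampling; the paper may differ.
    """
    rng = random.Random(seed)
    kept = []
    for members in kmeans(majority, k, seed=seed):
        quota = round(n_keep * len(members) / len(majority))
        kept.extend(rng.sample(members, min(quota, len(members))))
    return kept

# Toy majority class: 300 points around three hypothetical centers.
data_rng = random.Random(42)
majority = [(data_rng.gauss(cx, 0.5), data_rng.gauss(cy, 0.5))
            for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(100)]
kept = kmeans_downsample(majority, n_keep=30)
```

Sampling per cluster rather than uniformly is meant to preserve the rough shape of the majority class after downsampling. Because each cluster's quota is rounded, the number of kept samples can differ from `n_keep` by a few.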
Search engines and recommendation systems are essential means of addressing information overload, and recommendation algorithms are the core of recommendation systems. Recently, graph neural network recommendation algorithms based on social networks have greatly improved recommendation quality. However, these methods pay far too little attention to the heterogeneity of social networks. Indeed, ignoring the heterogeneity of connections between users and of interactions between users and items may seriously degrade user representations. In this paper, we propose a hierarchical attention recommendation system (HA-RS) based on a mask social network, which combines social network information with user behavior information and improves both the accuracy of recommendation and the flexibility of the network. First, node representations in the item domain are learned through the proposed Context-NE model; then the feature information of neighbor nodes in the social domain is aggregated through a hierarchical attention network. Together, these two steps fuse the information in the heterogeneous network (social domain and item domain). We also propose a mask mechanism that addresses cold-start issues for users and items by randomly masking some nodes in the item domain and in the social domain during training. Comprehensive experiments on four real-world datasets show the effectiveness of the proposed method.
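The abstract gives no details of the mask mechanism, so the sketch below shows only the general idea of randomly masking nodes during training. It assumes that a masked node's embedding is replaced by a shared placeholder vector, so the model must rely on neighbors, as it would for a cold-start user or item. The function `mask_nodes`, the zero placeholder, and the toy embeddings are hypothetical and not taken from the paper.

```python
import random

def mask_nodes(embeddings, mask_rate, rng, mask_vector=None):
    """Return a copy of `embeddings` with a random subset of nodes masked.

    embeddings: dict mapping node id -> list of floats
    mask_rate:  fraction of nodes to mask in this training step
    mask_vector: shared placeholder (assumed to be all zeros by default)
    """
    dim = len(next(iter(embeddings.values())))
    if mask_vector is None:
        mask_vector = [0.0] * dim
    n_mask = int(len(embeddings) * mask_rate)
    # sort ids so the draw is reproducible for a given rng state
    masked_ids = set(rng.sample(sorted(embeddings), n_mask))
    masked = {node: list(mask_vector) if node in masked_ids else list(vec)
              for node, vec in embeddings.items()}
    return masked, masked_ids

# Toy embeddings for ten hypothetical user nodes.
embeddings = {f"u{i}": [float(i), 1.0] for i in range(10)}
masked, masked_ids = mask_nodes(embeddings, mask_rate=0.3, rng=random.Random(0))
```

In a training loop, a fresh mask would typically be drawn at every step, separately for the item domain and the social domain.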