Over the last decade, the renaissance of Web technologies has transformed the online world into an application (App) driven society. While the abundance of Apps provides great convenience, their sheer number also leads to severe information overload, making it difficult for users to identify the Apps they want. To alleviate this information overload, recommender systems have been proposed and deployed in the App domain. However, existing work on App recommendation has largely focused on a single platform (e.g., smartphones), ignoring the rich data available on other relevant platforms (e.g., tablets and computers).
In this article, we tackle the problem of cross-platform App recommendation, aiming to leverage users' and Apps' data on multiple platforms to enhance recommendation accuracy. The key advantage of our proposal is that, by leveraging multi-platform data, the long-standing issues in personalized recommender systems, namely data sparsity and cold start, can be largely alleviated. To this end, we propose a hybrid solution, STAR (short for "croSs-plaTform App Recommendation"), that integrates both numerical ratings and textual content from multiple platforms. In STAR, we represent an App as an aggregation of common features shared across platforms (e.g., the App's functionalities) and specific features that depend on the platform on which it resides. In light of this, STAR can discriminate a user's preference toward an App by separating the user's interest into two parts: interest in the App's inherent factors and interest in its platform-aware features. To evaluate our proposal, we construct two real-world datasets crawled from the App stores of iPhone, iPad, and iMac. Through extensive experiments, we show that STAR consistently outperforms highly competitive recommendation methods, justifying the rationality of the cross-platform App recommendation task and the effectiveness of our solution.
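To make the common/specific decomposition concrete, below is a minimal latent-factor sketch in the spirit of the abstract, not the paper's actual model or notation; the arrays `U`, `V_common`, and `V_spec` and the `predict` helper are illustrative assumptions only.

```python
# Illustrative sketch (not STAR's exact formulation): a user's predicted score
# for an App on a given platform combines a platform-shared App factor with a
# platform-specific App factor, mirroring the common/specific feature split.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_apps, n_platforms, k = 100, 50, 3, 8      # hypothetical sizes

U = rng.normal(size=(n_users, k))                     # user preference factors
V_common = rng.normal(size=(n_apps, k))               # App factors shared across platforms
V_spec = rng.normal(size=(n_platforms, n_apps, k))    # App factors tied to each platform

def predict(user, app, platform):
    """Score = interest in the App's inherent (common) factors
    plus interest in its platform-aware (specific) factors."""
    return U[user] @ V_common[app] + U[user] @ V_spec[platform, app]

print(predict(user=0, app=10, platform=1))
```

In such a formulation, the shared factors let ratings observed on one platform inform predictions on another, which is how multi-platform data could ease sparsity and cold start.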
Image-text matching is a vital yet challenging task in the field of multimedia analysis. Over the past decades, great efforts have been made to bridge the semantic gap between the visual and textual modalities. Despite the significance and value of these efforts, most prior work is still confronted with the multi-view description challenge, i.e., how to align an image to multiple textual descriptions with semantic diversity. Toward this end, we present a novel context-aware multi-view summarization network that summarizes context-enhanced visual region information from multiple views. More specifically, we design an adaptive gating self-attention module to extract representations of visual regions and words; by controlling the internal information flow, the module adaptively captures context information. Afterwards, we introduce a summarization module with a diversity regularization to aggregate region-level features into image-level ones from different perspectives. Finally, we devise a multi-view matching scheme to match multi-view image features with the corresponding text features. To justify our work, we conduct extensive experiments on two benchmark datasets, i.e., Flickr30K and MS-COCO, which demonstrate the superiority of our model compared with several state-of-the-art baselines.
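As a rough illustration of the summarization idea, the sketch below aggregates region features into several image-level views and applies a common diversity penalty of the form ||A A^T - I||_F^2; the paper's actual modules and regularizer may differ, and all names here (`regions`, `W`, `n_views`) are hypothetical.

```python
# Illustrative sketch of multi-view summarization with a diversity penalty.
# Each view attends over region-level features to produce one image-level
# vector; the penalty pushes the views to focus on different regions.
import numpy as np

rng = np.random.default_rng(0)
n_regions, d, n_views = 36, 256, 4                 # hypothetical sizes

regions = rng.normal(size=(n_regions, d))          # region-level visual features
W = rng.normal(size=(n_views, d)) * 0.01           # one attention query per view

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

A = softmax(W @ regions.T)                         # (n_views, n_regions) attention weights
image_views = A @ regions                          # (n_views, d) image-level features

# Diversity regularization: penalize overlap between the views' attention maps.
diversity_loss = np.linalg.norm(A @ A.T - np.eye(n_views), ord="fro") ** 2
print(image_views.shape, round(float(diversity_loss), 4))
```

Each of the resulting image-level views could then be matched against a different textual description, which is the intuition behind the multi-view matching scheme described above.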