Entity disambiguation is an important step in many information retrieval applications. This paper proposes a new approach to entity disambiguation, focusing on name disambiguation in digital libraries. In particular, pairwise similarity is first learned for publications that share the same author name string (ANS), and then a novel Hierarchical Agglomerative Clustering approach with Adaptive Stopping Criterion (HACASC) is proposed to adaptively partition a set of publications that share the same ANS into individual clusters, each corresponding to a different author identity. The HACASC approach uses a mixture of kernel ridge regressions to determine the clustering threshold. This yields more appropriate clustering granularity than a non-adaptive stopping criterion. We conduct a large-scale empirical study with a dataset of more than 2 million publication record pairs to demonstrate the advantage of the proposed HACASC approach.
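As a rough illustration of the clustering step described above, the following Python sketch runs average-linkage agglomerative clustering over a pairwise similarity matrix and stops once no pair of clusters reaches a threshold. It is a minimal sketch under stated assumptions, not the HACASC method itself: the function names, the choice of average linkage, and the toy similarity values are all hypothetical, and the threshold is passed in as a fixed number, whereas the abstract says it is determined by a mixture of kernel ridge regressions.

```python
from itertools import combinations


def average_linkage(sim, a, b):
    """Mean pairwise similarity between two clusters of item indices."""
    return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))


def hac_with_threshold(sim, threshold):
    """Merge the most similar pair of clusters until none reaches `threshold`.

    `sim` is a symmetric matrix of pairwise similarities (assumed to be
    learned elsewhere). `threshold` is a fixed stand-in for the adaptively
    predicted stopping criterion described in the abstract.
    """
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > 1:
        # Find the pair of clusters with the highest average similarity.
        (x, y), best = max(
            (
                ((x, y), average_linkage(sim, clusters[x], clusters[y]))
                for x, y in combinations(range(len(clusters)), 2)
            ),
            key=lambda pair: pair[1],
        )
        if best < threshold:
            break  # stopping criterion: no remaining pair is similar enough
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe because x < y
    return sorted(sorted(c) for c in clusters)


# Toy similarities for four publications sharing one author name string.
toy_sim = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.15, 0.1],
    [0.1, 0.15, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
author_clusters = hac_with_threshold(toy_sim, threshold=0.5)
```

With these toy values, a higher threshold leaves more publications in separate clusters and a lower one merges them, which is the granularity effect that an adaptive threshold is meant to control.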
Recent years have witnessed a rapid adoption of mobile devices and a dramatic proliferation of mobile applications (Apps for brevity). However, the large number of mobile Apps makes it difficult for users to locate relevant Apps. Therefore, recommending Apps becomes an urgent task. Traditional recommendation approaches focus on learning the interest of a user and the functionality of an item (e.g., an App) from a set of user-item ratings, and they recommend an item to a user if the item's functionality matches the user's interest well. However, Apps may have privileges to access a user's sensitive resources (e.g., contacts, messages, and location). As a result, a user chooses an App not only because of its functionality, but also because it respects the user's privacy preference. To the best of our knowledge, this paper presents the first systematic study on incorporating both interest-functionality interactions and users' privacy preferences to perform personalized App recommendations. Specifically, we first construct a new model to capture the trade-off between functionality and user privacy preference. We then crawl a real-world dataset (16,344 users, 6,157 Apps, and 263,054 ratings) from Google Play and use it to comprehensively evaluate our model and previous methods. We find that our method consistently and substantially outperforms the state-of-the-art approaches, which implies the importance of user privacy preference in personalized App recommendation. Moreover, we explore the impact of different levels of privacy information on the performance of our method, which gives us insights into which resources are more likely to be treated as private by users and to influence users' behavior when selecting Apps.
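The trade-off between functionality and privacy preference mentioned above can be pictured with a small Python sketch. This is an illustrative assumption, not the model proposed in the paper: the linear score (interest-functionality match minus a weighted privacy cost), the vector layouts, the `weight` parameter, and the toy Apps are all hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def score(user_interest, app_functionality, user_privacy, app_permissions, weight=1.0):
    """Hypothetical score: functionality match minus a weighted privacy cost.

    `user_privacy[k]` is how sensitive the user is about resource k, and
    `app_permissions[k]` is 1 if the App requests access to resource k.
    """
    match = dot(user_interest, app_functionality)
    privacy_cost = dot(user_privacy, app_permissions)
    return match - weight * privacy_cost


def recommend(user_interest, user_privacy, apps, weight=1.0):
    """Rank App names by the hypothetical score, best first."""
    return sorted(
        apps,
        key=lambda name: score(
            user_interest, apps[name][0], user_privacy, apps[name][1], weight
        ),
        reverse=True,
    )


# Toy data: each App maps to (functionality vector, permissions for [contacts, location]).
toy_apps = {
    "A": ([1.0, 0.0], [0, 1]),  # best functional match, but requests location
    "B": ([0.5, 0.0], [0, 0]),  # weaker match, requests nothing sensitive
}
interest = [1.0, 0.0]
privacy = [0.0, 2.0]  # this user is sensitive about location only

ranking_with_privacy = recommend(interest, privacy, toy_apps)
ranking_without_privacy = recommend(interest, privacy, toy_apps, weight=0.0)
```

In this toy setup the functionally stronger App drops below the other one once the privacy term is switched on, which is the kind of effect the abstract attributes to users' privacy preferences.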
Along with the exponential growth of markets for mobile applications (apps) comes serious public concern about security and privacy issues. Automatic app risk assessment therefore becomes increasingly important for giving users useful evidence to support their decisions. User comments provide a unique perspective drawn from actual user experience, and should be considered a valuable information source for the risk assessment of mobile apps. In this paper, we provide a novel perspective that views the risk assessment of an app from its user comments as a crowdsourcing problem, and we adopt a ranking model as the evaluation method. We develop a co-training scheme to combine feature learning and learning-to-rank models. Experiments conducted on two different real-world datasets show substantial performance improvements (i.e., 6%-7%) over the state-of-the-art methods.