Instance matching, which aims at discovering the correspondences of instances between knowledge bases, is a fundamental issue for the ontological data sharing and integration in Semantic Web. Although considerable instance matching approaches have already been proposed, how to ensure both high accuracy and efficiency is still a big challenge when dealing with large-scale knowledge bases. This paper proposes an iterative framework, RiMOM-IM (RiMOM-Instance Matching). The key idea behind this framework is to fully utilize the distinctive and available matching information to improve the efficiency and control the error propagation. We participated in the 2013 and 2014 competition of Ontology Alignment Evaluation Initiative (OAEI), and our system was ranked the first. Furthermore, the experiments on previous OAEI datasets also show that our system performs the best.
With the information explosion of news articles, personalized news recommendation has become important for users to quickly find news that they are interested in. Existing methods on news recommendation mainly include collaborative filtering methods which rely on direct user-item interactions and content based methods which characterize the content of user reading history. Although these methods have achieved good performances, they still suffer from data sparse problem, since most of them fail to extensively exploit high-order structure information in news recommendation systems. In this paper, we propose to build a heterogeneous graph to explicitly model the interactions among users, news and latent topics. The incorporated topic information would help indicate a user's interest and alleviate the sparsity of user-item interactions. Then we take advantage of graph neural networks to learn user and news representations that encode high-order structure information by propagating embeddings over the graph. The learned user embeddings with complete historic user clicks capture the users' long-term interests. We also consider a user's short-term interest using the recent reading history with an attention based LSTM model. Experimental results on real-world datasets show that our proposed model significantly outperforms state-of-the-art methods on news recommendation.
Dirichlet process mixture model (DPM-M) has great potential for detecting the underlying structure of data. Extensive studies have applied it for text clustering in terms of topics. However, due to the unsupervised nature, the topic clusters are always less satisfactory. Considering that people often have some prior knowledge about which potential topics should exist in given data, we aim to incorporate such knowledge into the DPMM to improve text clustering. We propose a novel model TSDPMM based on a new seeded Pólya urn scheme. Experimental results on document clustering across three datasets demonstrate our proposed TSDPMM significantly outperforms stateof-the-art DPMM model and can be applied in a lifelong learning framework.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.