Embedding learning on knowledge graphs (KGs) aims to encode all entities and relationships into a continuous vector space, providing an effective and flexible way to implement downstream knowledge-driven artificial intelligence (AI) and natural language processing (NLP) tasks. Because KG construction usually relies on automatic mechanisms with little human supervision, it inevitably introduces considerable noise into KGs. However, most conventional KG embedding approaches assume that all facts in existing KGs are correct and ignore this noise, which can lead to serious errors. To address this issue, we propose a novel approach to learning KG embeddings with triple trustiness, which takes possible noise into account. Specifically, we calculate the trustiness value of each triple from the rich and relatively reliable information provided by the large number of entity type instances and entity descriptions in KGs. In addition, we present a cross-entropy-based loss function for model optimization. We evaluate our models on KG noise detection, KG completion, and classification. Extensive experiments on three datasets demonstrate that our proposed model learns better embeddings than all baselines on noisy KGs.
Keyphrases provide accurate information about document content in a highly compact, concise, and meaningful form, and they are widely used for discourse comprehension, organization, and text retrieval. Although previous studies have made substantial efforts on automated keyphrase extraction and generation, surprisingly few have addressed keyphrase completion (KPC). KPC aims to generate additional keyphrases for a document (e.g., a scientific publication) by using the document content together with a very limited number of known keyphrases, and it can be applied to improve, for example, text indexing systems. In this paper, we propose a novel KPC method with an encoder-decoder framework. We name it deep keyphrase completion (DKPC) because it attempts to capture the deep semantic meaning of the document content together with the known keyphrases via a deep learning framework. Specifically, the encoder and the decoder in DKPC play different roles in making full use of the known keyphrases. The former considers the keyphrase-guiding factor, which aggregates information from the known keyphrases into the context. In contrast, the latter considers the keyphrase-inhibited factor, which inhibits the generation of semantically repeated keyphrases. Extensive experiments on benchmark datasets demonstrate the efficacy of our proposed model.