Scaled acquisition of high-quality structured knowledge has been a longstanding goal of Artificial Intelligence research. Recent advances in crowdsourcing, the sheer number of Internet and mobile users, and the commercial availability of supporting platforms offer new tools for knowledge acquisition. This article applies context-aware knowledge acquisition that simultaneously satisfies users’ immediate information needs while extending its own knowledge using crowdsourcing. The focus is on knowledge acquisition on a mobile device, which makes the approach practical and scalable; in this context, we propose and implement a new KA approach that exploits an existing knowledge base to drive the KA process, communicate with the right people, and check for consistency of the user-provided answers. We tested the viability of the approach in experiments using our platform with real users around the world, and an existing large source of common-sense background knowledge. These experiments show that the approach is promising: the knowledge is estimated to be true and useful for users 95% of the time. Using context to proactively drive knowledge acquisition increased engagement and effectiveness (the number of new assertions/day/user increased for 175%). Using pre-existing and newly acquired knowledge also proved beneficial.
Natural Language Inference is an important task for Natural Language Understanding. It is concerned with classifying the logical relation between two sentences. In this paper, we propose several text generative neural networks for generating text hypothesis, which allows construction of new Natural Language Inference datasets. To evaluate the models, we propose a new metric -the accuracy of the classifier trained on the generated dataset. The accuracy obtained by our best generative model is only 2.7% lower than the accuracy of the classifier trained on the original, human crafted dataset. Furthermore, the best generated dataset combined with the original dataset achieves the highest accuracy. The best model learns a mapping embedding for each training example. By comparing various metrics we show that datasets that obtain higher ROUGE or METEOR scores do not necessarily yield higher classification accuracies. We also provide analysis of what are the characteristics of a good dataset including the distinguishability of the generated datasets from the original one.
Abstract. Semantic parsing methods are used for capturing and representing semantic meaning of text. Meaning representation capturing all the concepts in the text may not always be available or may not be sufficiently complete. Ontologies provide a structured and reasoning-capable way to model the content of a collection of texts. In this work, we present a novel approach to joint learning of ontology and semantic parser from text. The method is based on semi-automatic induction of a context-free grammar from semantically annotated text. The grammar parses the text into semantic trees. Both, the grammar and the semantic trees are used to learn the ontology on several levels -classes, instances, taxonomic and non-taxonomic relations. The approach was evaluated on the first sentences of Wikipedia pages describing people.
This paper shows the implementation and evaluation of the Entity Linking or Named Entity Disambiguation system used and developed at Bloomberg. In particular, we present and evaluate a methodology and a system that do not require the use of Wikipedia as a knowledge base or training corpus. We present how we built features for disambiguation algorithms from the Bloomberg News corpus, and how we employed them for both single-entity and joint-entity disambiguation into a Bloomberg proprietary knowledge base of people and companies. Experimental results show high quality in the disambiguation of the available annotated corpus.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.