Abstract. The Semantic Web should enhance the current World Wide Web with reasoning capabilities for enabling automated processing of possibly distributed information. In this paper we describe an architecture for Semantic Web reasoning and query answering in a very general setting involving several heterogeneous information sources, as well as domain ontologies needed for offering a uniform and source-independent view on the data. Since querying a Web source is very costly in terms of response time, we focus mainly on the query planner of such a system, as it may allow avoiding access to query-irrelevant sources or combinations of sources based on knowledge about the domain and the sources.
Taking advantage of the huge amount of knowledge implicit and distributed on the Web is a significant challenge. The main obstacle is that most Web pages were designed for human-centred browsing rather than being machine-processable. In addition to static HTML pages, the Web currently offers online access to a large number of information resources, such as databases with a Web interface. But real-life applications frequently require combining the information from several such resources, which may not have been developed with this interoperability requirement in mind. Thus, a large amount of knowledge is implicit, heterogeneously distributed among various resources, and therefore hard to process automatically.
The recent developments towards a "Semantic Web" should help address these problems. Being able to explicitly represent domain-specific knowledge in the form of ontologies should allow reasoning about such machine-processable Web pages.
The emergence of standards for data markup and interchange (such as XML) and for representing information about resources and their semantics (such as RDF and RDF Schema) can be seen as a first step in the transition towards a Semantic Web.
However, the vast majority of Web pages still conform to the HTML standard, which only controls their visual aspects rather than their informational content. Extracting the informational content from such pages, which essentially contain free text, is a difficult practical problem. The Resource Description Framework (RDF) has been designed to complement such human-oriented text with machine-processable annotations. A large number of prototype systems able to read and reason about such annotations have been developed (TRIPLE [7], Metalog [20], SiLRI [8], Ontobroker [9]). However, currently only a very small minority of Web pages have RDF annotations. Moreover, existing annotations tend to refer to basic features such as document author, creation date, etc., but do not duplicate the information content of the page.