This study investigates the accuracy of search engine hit counts for Google, Yahoo and Microsoft Live Search, for both single and multiple term queries. In addition, we investigate the consistency of hit count estimates over 15 days. The results show that all three search engines provide estimates for the number of matching documents and that the estimation patterns of their counting algorithms differ greatly. The accuracy of hit counts for multiple word queries has not been studied before. The results of our study show that the number of words in a query affects the accuracy of the estimates significantly. The percentage of accurate hit count estimates is reduced by almost half when going from single word to two word query tests in all three search engines. As the number of query words increases, the estimation error increases and the number of accurate estimates decreases.
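The kind of comparison described in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how accuracy was measured, so the signed relative error, the 10% tolerance, and the names `relative_error` and `accuracy_by_query_length` are all hypothetical, and the sample values are placeholders rather than data from the study.

```python
from collections import defaultdict


def relative_error(estimated, actual):
    """Signed relative error of a hit-count estimate against the true count."""
    if actual <= 0:
        raise ValueError("actual count must be positive")
    return (estimated - actual) / actual


def accuracy_by_query_length(samples, tolerance=0.10):
    """Share of estimates within `tolerance` of the true count, per word count.

    `samples` holds (query, estimated_hits, actual_hits) tuples.
    The tolerance is an assumed threshold, not one taken from the study.
    """
    totals = defaultdict(int)
    accurate = defaultdict(int)
    for query, estimated, actual in samples:
        n_words = len(query.split())  # query length in words
        totals[n_words] += 1
        if abs(relative_error(estimated, actual)) <= tolerance:
            accurate[n_words] += 1
    return {n: accurate[n] / totals[n] for n in sorted(totals)}


# Placeholder values for illustration only; not data from the study.
samples = [
    ("zebra", 105, 100),
    ("quokka", 300, 200),
    ("zebra stripes", 90, 60),
    ("quokka island", 52, 50),
]
print(accuracy_by_query_length(samples))
```

Grouping by word count makes it possible to compare single word and multiple word queries directly, which is the contrast the abstract reports.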
We describe an information system architecture for the ACES (Asia-Pacific Cooperation for Earthquake Simulation) community. It addresses several key features of the field: simulations at multiple scales that need to be coupled together; real-time and archival observational data, which needs to be analyzed for patterns and linked to the simulations; a variety of important algorithms including partial differential equation solvers, particle dynamics, signal processing and data analysis; a natural three dimensional space (plus time) setting for both visualization and observations; and the linkage of the field to real-time events, both as an aid to crisis management and to scientific discovery. We also address the need to support education and research for a field whose computational sophistication is increasing rapidly and spans a broad range. The information system assumes that all significant data is defined by an XML layer, which could be virtual but whose existence ensures that all data is object-based and can be accessed and searched in this form. The various capabilities needed by ACES are defined as Grid Services, which are conformant with emerging standards and implemented with different levels of fidelity and performance appropriate for the application. Grid Services can be composed in a hierarchical fashion to address complex problems. The real-time needs of the field are addressed by high performance implementation of data transfer and simulation services; further, the environment is linked to real-time collaboration to support interactions between scientists in geographically distant locations.
ACES Grid and .opennet Grid Architecture
We consider an ACES [1] computational environment (ACESCE) built in terms of web-based user interfaces accessing services, which are built in a broker-based fashion [2]. The client machine contacts a server that acts as an intermediary to back-end resources and also as a conduit for clients to access services.
One can also view the brokers as middleware wrappers that allow a heterogeneous collection of resources to be accessed in a relatively uniform fashion. In the simplest technology, these brokers or wrappers would be implemented as a Perl CGI program running on a web server. As discussed later, there are more sophisticated approaches but the basic model is correct;
Purpose – The purpose of this paper is to better understand three main aspects of the semantic web search engines Google Knowledge Graph and Bing Satori. The authors investigated: coverage of entity types, the extent of their support for list search services and the capabilities of their natural language query interfaces. Design/methodology/approach – The authors manually submitted selected queries to these two semantic web search engines and evaluated the returned results. To test the coverage of entity types, the authors selected the entity types from the Freebase database. To test the capabilities of natural language query interfaces, the authors used a manually developed query data set about US geography. Findings – The results indicate that both semantic search engines cover only the very common entity types. In addition, the list search service is provided for a small percentage of entity types. Moreover, both search engines support only queries of very limited complexity and with a limited set of recognised terms. Research limitations/implications – Both companies are continually working to improve their semantic web search engines. Therefore, the findings show their capabilities at the time of conducting this research. Practical implications – The results show that in the near future the authors can expect both semantic search engines to expand their entity databases and improve their natural language interfaces. Originality/value – As far as the authors know, this is the first study evaluating any aspect of newly developing semantic web search engines. It shows the current capabilities and limitations of these semantic web search engines. It provides directions to researchers by pointing out the main problems for semantic web search engines.