Enterprise Search (ES) differs from traditional IR for a number of reasons. The most important are the high level of ambiguity of terms in queries and documents, and the existence of graph-structured enterprise data (ontologies) that describe the concepts of interest and their relationships to each other. Our method identifies concepts from the enterprise ontology in the query and corpus. We propose a ranking scheme for ontology sub-graphs on top of approximately matched token q-grams. The ranking leverages the graph structure of the ontology to incorporate concepts that are not explicitly mentioned. It improves on previous solutions by using a fine-grained ranking function that is specifically designed to cope with high levels of ambiguity. This method captures much more of the semantics of queries and documents than previous techniques. We support this claim with an evaluation of our method in three real-life scenarios from two different domains, in which it was consistently superior in terms of both precision and recall.
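The approximate matching step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' ranking scheme: it assumes character q-grams over single tokens, Jaccard similarity as the match score, and a hypothetical dictionary of concept labels. The function names, the padding character, and the threshold are all illustrative choices, and the ontology sub-graph ranking itself is not shown.

```python
def qgrams(token, q=3):
    """Return the set of character q-grams of a token, padded with '#'."""
    padded = "#" * (q - 1) + token.lower() + "#" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_concepts(query, concept_labels, q=3, threshold=0.5):
    """Approximately match query tokens against concept labels.

    concept_labels is a hypothetical mapping {concept_id: label}.
    Returns (token, concept_id, score) tuples, best score first.
    """
    matches = []
    for token in query.split():
        grams = qgrams(token, q)
        for concept, label in concept_labels.items():
            score = jaccard(grams, qgrams(label, q))
            if score >= threshold:
                matches.append((token, concept, score))
    return sorted(matches, key=lambda m: m[2], reverse=True)


# Usage: a misspelled token still reaches the intended concept.
labels = {"C1": "invoice", "C2": "customer"}  # hypothetical ontology labels
print(match_concepts("invoce status", labels))
```

In a full system, matches like these would only be candidates. According to the abstract, the ranking then uses the ontology graph to score sub-graphs and to bring in related concepts that the text does not mention.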
This chapter argues that the notion of identity of and reference to entities (objects, individuals, instances) is fundamental in order to achieve semantic interoperability and integration between different sources of knowledge. The first step in order to integrate different information sources about an entity is to recognize that those sources describe the same entity. Unfortunately, different systems that manage information about entities commonly issue different identifiers for these entities. This makes reference to entities across information systems very complicated or impossible, because there are no means to know how an entity is identified in another system. The authors propose a global, public infrastructure, the Entity Name System (ENS), which enables the creation and re-use of identifiers for entities. This a-priori approach enables systems to reference entities with a globally unique identifier, and makes semantic integration a much easier job. The authors illustrate two enterprise use cases which build on this approach: entity-centric publishing, and entity-centric corporate information management, currently being developed by two leading companies in their respective fields.
Successfully structuring information in databases, OLAP cubes, and XML is a crucial element of managing data today. However, this process has brought new usability challenges. It is difficult for users to switch from natural language, their usual means of communication, to data models (e.g., database schemas) that are hard to work with and understand, especially for occasional users. This important issue is under intense scrutiny in the database community (e.g., keyword search over databases and query relaxation techniques) and in the information extraction community (e.g., linking structured and unstructured data). However, there is still no comprehensive solution that automatically generates an OLAP (Online Analytical Processing) query and chooses a visualization based on textual content with high precision. We present such a method. We discuss how to dynamically generate interpretations of textual content as an OLAP query, select the best visualization, and retrieve the corresponding data from a data warehouse on the fly. To provide the most relevant aggregation results, we consider the user's actual context, described by a document's content. Moreover, we provide a prototypical implementation of our method, the Text-To-Query system (T2Q), and show how T2Q can be successfully applied to an enterprise scenario as an extension for an office application.
Our revenue is decreasing in some countries. The relative importance of each resort to the revenue is satisfying. French Riviera is doing very good.
Developer communities built around software products, like the SAP Community Network (SCN), provide a knowledge base for recurring problems and their solutions. Due to the large amount of content maintained in such communities, e.g., in forums, finding relevant solutions is a major challenge beyond the scope of common keyword-based search engines. In fact, we measured that around 50% of the forum questions in our particular scenario had already been answered at the time they were posted. We address this challenge with an entity-aware search, which exploits structured knowledge, such as domain-specific ontologies, for both query interpretation and the creation of document indexes. The system takes a natural language query as input, interprets it as an entity graph, matches this graph with pre-processed content, and supports users in refining their query based on the top-k relevant entities. Results are presented in a user interface that supports faceted search based on entities. Additionally, the user interface is structured according to possible search intentions of users. The evaluation of our system on the SCN scenario shows that the top 5 entities in user queries are recognized with a precision of 83%, compared to 61% for state-of-the-art algorithms.