Abstract: Databases of text and text-annotated data constitute a significant fraction of the information available in electronic form. Searching and browsing are the typical ways that users locate items of interest in such databases. Faceted interfaces represent a new, powerful paradigm that has proved to be a successful complement to keyword searching. Thus far, the identification of the facets was either a manual procedure, or relied on a priori knowledge of the facets that can potentially appear in the underlying collectio…
“…Most existing faceted search and facets generation systems are built on a specific domain such as product search or predefined facet categories. For example, Dakka and Ipeirotis [2] introduced an unsupervised technique for automatic extraction of facets that are useful for browsing text databases. Facet hierarchies are generated for a whole collection, instead of for a given query. Facets of a query are automatically mined from the top web search results of the query without any additional domain knowledge required.…”
Section: Query Facets Mining and Faceted Search (mentioning)
Abstract: The problem addressed is finding query facets. A facet, or angle, is a particular aspect or feature of something; query facets are multiple groups of words or phrases that explain and summarize the content covered by a query. Query facets provide interesting and useful knowledge about a query and can therefore improve search experiences in several ways. First, query facets can be displayed together with the original search results in an appropriate way. Second, query facets may provide direct information or instant answers that users are seeking. Third, query facets may be used to improve the diversity of the ten blue links. The underlying assumption is that the important aspects of a query are usually presented and repeated in the query's top retrieved documents in the style of lists, so query facets can be mined by aggregating these significant lists. The proposed solution automatically mines query facets by extracting and grouping frequent lists from free text, HTML tags, and repeat regions within the top search results. It is found that a large number of lists do exist and that useful query angles can be mined.
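The aggregation idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: it assumes the lists have already been extracted from each retrieved document, and the function name `mine_facets`, the `min_support` threshold, and the co-occurrence grouping rule are all hypothetical choices made only for illustration.

```python
from collections import defaultdict


def mine_facets(doc_lists, min_support=2):
    """Group list items that recur across documents into candidate facets.

    doc_lists: one entry per retrieved document; each entry holds the item
    lists already extracted from that document (extraction is assumed done).
    min_support: hypothetical threshold on how many documents must mention
    an item before it is kept.
    """
    # Record which documents mention each (normalized) item inside a list.
    support = defaultdict(set)
    for doc_id, lists in enumerate(doc_lists):
        for items in lists:
            for item in items:
                support[item.strip().lower()].add(doc_id)
    frequent = {i for i, docs in support.items() if len(docs) >= min_support}

    # Union-find: frequent items that share a list end up in the same group.
    parent = {item: item for item in frequent}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for lists in doc_lists:
        for items in lists:
            kept = [i.strip().lower() for i in items
                    if i.strip().lower() in frequent]
            for other in kept[1:]:
                parent[find(other)] = find(kept[0])

    groups = defaultdict(list)
    for item in frequent:
        groups[find(item)].append(item)

    # Rank groups by total document support; break ties alphabetically.
    facets = [sorted(g) for g in groups.values()]
    facets.sort(key=lambda g: (-sum(len(support[i]) for i in g), g))
    return facets
```

Items that appear in only one document (navigation links, for instance) are dropped by the support threshold, which is one simple way to separate "significant" lists from incidental ones.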
“…An example of the idea, assuming only one facet, is shown in Figure 1. Figure 1(a) shows a taxonomy and 8 indexed objects (1)(2)(3)(4)(5)(6)(7)(8). The user explores or navigates the information space by setting and changing his focus.…”
Section: Requirements and Background (mentioning)
confidence: 99%
“…Clustering the snippets rather than the whole documents makes clustering algorithms faster. Some clustering algorithms [6,5,23] use internal or external sources of knowledge like Web directories (e.g. DMoz), Web dictionaries (e.g.…”
Abstract. This paper proposes exploiting both explicit and mined metadata for enriching Web searching with exploration services. On-line results clustering is useful for providing users with overviews of the results and thus allowing them to restrict their focus to the desired parts. On the other hand, the various metadata that are available to a WSE (Web Search Engine), e.g. domain/language/date/filetype, are commonly exploited only through the advanced (form-based) search facilities that some WSEs offer (and users rarely use). We propose an approach that combines both kinds of metadata by adopting the interaction paradigm of dynamic taxonomies and faceted exploration. This combination results in an effective, flexible and efficient exploration experience.
“…It extracts salient phrases as candidate cluster names from the list of titles and snippets of the answer, and ranks them using a regression model over five different properties, learned from human training data. Another approach that uses several external resources, such as WordNet and Wikipedia, in order to identify useful terms and to organize them hierarchically is described in [4].…”
Abstract. Results clustering in Web Searching is useful for providing users with overviews of the results and thus allowing them to restrict their focus to the desired parts. However, the task of deriving single-word or multiple-word names for the clusters (usually referred to as cluster labeling) is difficult, because they have to be syntactically correct and predictive. Moreover, efficiency is an important requirement since results clustering is an online task. Suffix Tree Clustering (STC) is a clustering technique where search results (mainly snippets) can be clustered fast (in linear time), incrementally, and each cluster is labeled with a phrase. In this paper we introduce: (a) a variation of the STC, called STC+, with a scoring formula that favors phrases that occur in document titles and differs in the way base clusters are merged, and (b) a novel algorithm called NM-STC that results in hierarchically organized clusters. The comparative user evaluation showed that both STC+ and NM-STC are significantly more preferred than STC, and that NM-STC is about two times faster than STC and STC+.
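To make the notion of phrase-labeled clusters concrete, the following Python sketch forms base clusters from shared phrases and boosts phrases that occur in titles. It is a simplification under stated assumptions, not STC, STC+ or NM-STC: it enumerates word n-grams instead of building a suffix tree (so it is not linear time), it omits the base-cluster merging step, and the scoring rule and the names `base_clusters`, `max_len`, `title_boost` and `min_docs` are hypothetical.

```python
from collections import defaultdict


def base_clusters(results, max_len=3, title_boost=2.0, min_docs=2):
    """Return (score, phrase, doc_ids) triples, best first.

    results: list of (title, snippet) pairs for the search results.
    Scoring is an illustrative assumption: documents covered times phrase
    length in words, multiplied by title_boost if the phrase occurs in at
    least one title.
    """
    def ngrams(text):
        # All word n-grams of length 1..max_len (whitespace tokenization).
        words = text.lower().split()
        return {" ".join(words[i:i + n])
                for n in range(1, max_len + 1)
                for i in range(len(words) - n + 1)}

    docs = defaultdict(set)   # phrase -> ids of results containing it
    title_phrases = set()     # phrases seen in at least one title
    for doc_id, (title, snippet) in enumerate(results):
        in_title = ngrams(title)
        title_phrases |= in_title
        for phrase in in_title | ngrams(snippet):
            docs[phrase].add(doc_id)

    scored = []
    for phrase, members in docs.items():
        if len(members) < min_docs:
            continue  # a phrase shared by too few results is not a cluster
        score = float(len(members) * len(phrase.split()))
        if phrase in title_phrases:
            score *= title_boost
        scored.append((score, phrase, sorted(members)))

    # Highest score first; ties broken alphabetically by phrase.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return scored
```

Each returned phrase doubles as the cluster label, which is the property the abstract highlights; a practical system would also filter stop words and merge overlapping base clusters.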