The Information Retrieval Series
DOI: 10.1007/0-306-47019-5_5

Distributed Information Retrieval

Abstract: A multi-database model of distributed information retrieval is presented, in which people are assumed to have access to many searchable text databases. In such an environment, full-text information retrieval consists of discovering database contents, ranking databases by their expected ability to satisfy the query, searching a small number of databases, and merging results returned by different databases. This paper presents algorithms for each task. It also discusses how to reorganize conventional tes…

citations
Cited by 228 publications
(354 citation statements)
references
References 17 publications
“…The top ranked results returned from the selected collections are merged into a single list. Current collection selection methods compare the query with the summary of each collection (term statistics [11] or sample documents [17,16]) and rank collections accordingly.…”
Section: Background and Related Work (mentioning)
confidence: 99%
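The selection step described in this statement, comparing the query with a term-statistics summary of each collection and ranking collections accordingly, can be sketched as follows. This is a minimal illustration only: the scoring rule (summed relative frequency of the query terms) is a simple stand-in and is not the method of the cited works, and the names `score_collection`, `rank_collections`, and the sample summaries are hypothetical.

```python
from collections import Counter


def score_collection(query_terms, term_counts):
    """Score one collection by the summed relative frequency of the query terms.

    Assumption: a collection summary is just a term -> count mapping.
    """
    total = sum(term_counts.values())
    if total == 0:
        return 0.0
    return sum(term_counts.get(term, 0) for term in query_terms) / total


def rank_collections(query, summaries):
    """Return (name, score) pairs, best first; ties are broken by name."""
    terms = query.lower().split()
    scored = [(name, score_collection(terms, counts))
              for name, counts in summaries.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical summaries for three collections.
summaries = {
    "news": Counter({"election": 50, "vote": 30, "weather": 20}),
    "medical": Counter({"cancer": 40, "therapy": 60}),
    "sports": Counter({"football": 99, "vote": 1}),
}

ranking = rank_collections("election vote", summaries)
selected = [name for name, _ in ranking[:2]]  # search only the top two
```

Only the selected collections would then be queried, which is the point of the selection step: the broker avoids searching every database.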
“…From the database point of view, a distributed information retrieval system could follow a single database model or a multi-database model [4]. In the single database model, the documents are copied to a centralized database, where they are indexed and made searchable.…”
Section: Introduction (mentioning)
confidence: 99%
“…We address a particular kind of entity search, namely the search over multiple data sources, called federated search, which entails the three main problems of source representation, source selection, and result merging [8,9]. We focus on the latter for federated entity search in uncooperative settings as illustrated in Figure 1b, where only ranked result lists of entity descriptions are obtained from each source and no further information about the sources is available.…”
Section: Overview (mentioning)
confidence: 99%
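When each source returns only a ranked list and nothing else, as in the uncooperative setting this statement describes, one simple baseline for result merging is round-robin interleaving by rank. The sketch below shows that baseline only; it is not the merging method of the citing or cited papers, and `round_robin_merge` and the sample result identifiers are hypothetical.

```python
from itertools import zip_longest


def round_robin_merge(ranked_lists, limit=None):
    """Interleave ranked lists by rank position, skipping duplicates.

    Assumptions: items are hashable, non-None identifiers, and `limit`,
    if given, is a positive integer.
    """
    merged = []
    seen = set()
    # zip_longest pads shorter lists with None, so each tuple holds
    # the items at one rank position across all sources.
    for same_rank in zip_longest(*ranked_lists):
        for item in same_rank:
            if item is None or item in seen:
                continue
            seen.add(item)
            merged.append(item)
            if limit is not None and len(merged) >= limit:
                return merged
    return merged


# Hypothetical ranked result lists from three sources.
source_a = ["a1", "a2", "a3"]
source_b = ["b1", "a2"]
source_c = ["c1"]

merged = round_robin_merge([source_a, source_b, source_c])
top_two = round_robin_merge([source_a, source_b, source_c], limit=2)
```

Round-robin uses rank positions alone, so it needs no scores or collection statistics from the sources, at the cost of treating every source as equally reliable.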
“…This solution and the comprehensive work of the database community in this realm [5][6][7] assume full access to the entire datasets to compute features such as weights of attributes, co-occurrences or to learn parameters, which are then used to resolve all coreferences between two or more datasets in one run. However, access to the entire datasets is either not granted in many application scenarios such as search over multiple Web data sources (where data access is only provided via APIs for single requests), also called federated search over uncooperative sources [8,9], or many data sources are highly dynamic, imposing a high burden on batch processing to keep up with frequent changes and to provide fresh information for time sensitive applications such as search over stock quotes, movies and timetables. Distributed document retrieval for uncooperative environments has been studied in the IR community [8,9].…”
Section: Introduction (mentioning)
confidence: 99%