1997
DOI: 10.1002/(sici)1097-4571(199702)48:2<122::aid-asi3>3.0.co;2-#

Integrating structured data and text: A relational approach

Abstract: We integrate structured data and text using the unchanged, standard relational model. We started with the premise that a relational system could be used to implement an information retrieval (IR) system. After implementing a prototype to verify that premise, we then began to investigate the performance of a parallel relational database system for this application. We also tested the effect of query reduction on accuracy and found that queries can be reduced prior to their implementation without incurring a sig…

Cited by 60 publications (37 citation statements)
References 12 publications
“…The idea to use DBMS technology as a building block in an IR system is pursued e.g., in [21], where the authors store inverted lists in a Microsoft SQLServer and use SQL queries for keyword search. Similarly, in [19] IR data is distributed over a PC cluster, and an analysis of the impact of concurrent updates is provided.…”
Section: Related Work (mentioning)
confidence: 99%
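
The idea in the statement above, keeping inverted lists in a relational table and answering keyword searches with SQL, can be illustrated with a minimal sketch. This is not the implementation from the cited systems: the table name `postings`, the helper functions, and the use of SQLite in place of a server DBMS are all assumptions made for illustration.

```python
import sqlite3

def build_index(docs):
    """Store one (term, doc_id) row per distinct term per document.

    The `postings` table is a hypothetical schema for an inverted list.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE postings (term TEXT, doc_id INTEGER, "
        "PRIMARY KEY (term, doc_id))"
    )
    for doc_id, text in docs.items():
        for term in set(text.lower().split()):
            conn.execute("INSERT INTO postings VALUES (?, ?)", (term, doc_id))
    return conn

def search_all(conn, keywords):
    """Return ids of documents that contain every keyword (boolean AND)."""
    terms = sorted({k.lower() for k in keywords})
    if not terms:
        return []
    placeholders = ",".join("?" * len(terms))
    sql = (
        f"SELECT doc_id FROM postings WHERE term IN ({placeholders}) "
        "GROUP BY doc_id HAVING COUNT(*) = ? ORDER BY doc_id"
    )
    return [row[0] for row in conn.execute(sql, (*terms, len(terms)))]

docs = {
    1: "relational database systems",
    2: "information retrieval systems",
    3: "relational information retrieval",
}
conn = build_index(docs)
print(search_all(conn, ["relational", "retrieval"]))  # [3]
```

Because each (term, doc_id) pair is unique, a document matches all keywords exactly when its group contains one row per keyword, which is what the `HAVING COUNT(*) = ?` clause checks.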
“…Grossman et al [13] present techniques for representing text documents and their associated term frequencies in relational tables, as well as for mapping boolean and vector-space queries into standard SQL queries. They also use a query-pruning technique, based on word frequencies, to speed up query execution.…”
Section: Related Work (mentioning)
confidence: 99%
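
A rough sketch of the approach summarized in the statement above: term frequencies live in a relational table, a vector-space query becomes a join with a small query table, and the query is pruned before it runs. The schema (`doc_term`, `query`), the unnormalized dot-product score, and the pruning rule (keep the query terms that occur in the fewest documents) are assumptions chosen for illustration, not details confirmed by the source.

```python
import sqlite3
from collections import Counter

def load(docs):
    """Create hypothetical doc_term(doc_id, term, tf) and query(term, tf) tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE doc_term (doc_id INTEGER, term TEXT, tf INTEGER)")
    conn.execute("CREATE TABLE query (term TEXT, tf INTEGER)")
    for doc_id, text in docs.items():
        for term, tf in Counter(text.lower().split()).items():
            conn.execute("INSERT INTO doc_term VALUES (?, ?, ?)", (doc_id, term, tf))
    return conn

def rank(conn, query_text, keep=None):
    """Rank documents by an unnormalized dot product of term frequencies.

    If `keep` is given, the query is pruned to the `keep` terms with the
    lowest document frequency (an assumed pruning rule).
    """
    counts = Counter(query_text.lower().split())
    df = {
        t: conn.execute(
            "SELECT COUNT(*) FROM doc_term WHERE term = ?", (t,)
        ).fetchone()[0]
        for t in counts
    }
    # Terms absent from the collection cannot contribute to any score.
    terms = [t for t in counts if df[t] > 0]
    if keep is not None:
        terms = sorted(terms, key=lambda t: (df[t], t))[:keep]
    conn.execute("DELETE FROM query")
    conn.executemany(
        "INSERT INTO query VALUES (?, ?)", [(t, counts[t]) for t in terms]
    )
    sql = """
        SELECT d.doc_id, SUM(d.tf * q.tf) AS score
        FROM doc_term d JOIN query q ON d.term = q.term
        GROUP BY d.doc_id
        ORDER BY score DESC, d.doc_id
    """
    return conn.execute(sql).fetchall()

docs = {
    1: "relational database relational model",
    2: "text retrieval model",
    3: "relational text model",
}
conn = load(docs)
print(rank(conn, "relational model"))          # [(1, 3), (3, 2), (2, 1)]
print(rank(conn, "relational model", keep=1))  # [(1, 2), (3, 1)]
```

In this toy collection "model" occurs in every document, so pruning it shrinks the join while leaving the top-ranked document unchanged. Whether accuracy holds up in general depends on the collection and the pruning rule.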
“…Furthermore, each tuple in RiWeights consists of a tuple id tid, the actual token (i.e., q-gram in this case), and its associated weight. Then, if C bytes are needed to represent tid and weight, the total size of relation RiWeights will not exceed […]. Given the relations R1Weights and R2Weights, a baseline approach [13,18] to compute R1 ⋈φ R2 is shown in Figure 2. This SQL statement performs the text join by computing the similarity of each pair of tuples and filtering out any pair with similarity less than the similarity threshold φ.…”
Section: Tuple Weight Vectors (mentioning)
confidence: 99%
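
The baseline text join described in the statement above can be sketched as follows. The relation names `R1Weights` and `R2Weights` and the columns `tid`, `token`, and `weight` come from the quoted text. Everything else is an assumption: the 3-gram tokenizer with padding characters, the unit-length term-frequency weights, and the SQL itself, which is a plausible form of such a statement and not a reproduction of the citing paper's Figure 2.

```python
import math
import sqlite3
from collections import Counter

def qgrams(s, q=3):
    """Split a string into overlapping q-grams, padded at both ends."""
    padded = "#" * (q - 1) + s.lower() + "$" * (q - 1)
    return [padded[i:i + q] for i in range(len(padded) - q + 1)]

def weight_rows(strings):
    """Yield (tid, token, weight) rows with unit-length weight vectors."""
    rows = []
    for tid, s in enumerate(strings):
        counts = Counter(qgrams(s))
        norm = math.sqrt(sum(c * c for c in counts.values()))
        rows.extend((tid, tok, c / norm) for tok, c in counts.items())
    return rows

def text_join(r1, r2, phi):
    """Return (tid1, tid2) pairs whose cosine similarity is at least phi."""
    conn = sqlite3.connect(":memory:")
    for name, strings in (("R1Weights", r1), ("R2Weights", r2)):
        conn.execute(f"CREATE TABLE {name} (tid INTEGER, token TEXT, weight REAL)")
        conn.executemany(f"INSERT INTO {name} VALUES (?, ?, ?)", weight_rows(strings))
    sql = """
        SELECT a.tid, b.tid
        FROM R1Weights a JOIN R2Weights b ON a.token = b.token
        GROUP BY a.tid, b.tid
        HAVING SUM(a.weight * b.weight) >= ?
        ORDER BY a.tid, b.tid
    """
    return conn.execute(sql, (phi,)).fetchall()

print(text_join(["microsoft corp"], ["microsoft corp", "oracle"], 0.99))  # [(0, 0)]
```

Since both weight vectors have unit length, the summed products of matching token weights give the cosine similarity of the two tuples, and pairs that share no token never appear in the join result at all.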