2013
DOI: 10.14778/2536360.2536368
Making queries tractable on big data with preprocessing

Abstract: A query class is traditionally considered tractable if there exists a polynomial-time (PTIME) algorithm to answer its queries. When it comes to big data, however, PTIME algorithms often become infeasible in practice. A traditional and effective approach to coping with this is to preprocess data off-line, so that queries in the class can be subsequently evaluated on the data efficiently. This paper aims to provide a formal foundation for this approach in terms of computational complexity. (1) We propose a set o…

Cited by 36 publications (54 citation statements)
References 33 publications
“…For instance, when Q is in CQ and constraints in Σ are expressed in CQ, RCDP is NEXPTIME-complete, while QDSI is Σ^p_3-complete. There has also been recent work on querying big data, e.g., on the communication complexity of parallel query evaluation [17,18], the complexity of query processing in terms of MapReduce rounds [2,30], and the study of query classes that are tractable on big data [13]. In contrast, this work studies whether it is feasible to compute query answers in big data by accessing a small subset of the data, and if so, how to efficiently identify this subset.…”
Section: Sufficient Conditions For Scale Independence
confidence: 99%
“…al. [9], [10] formally define the concept of query practicality through their definition of Π-tractability. A query Q is Π-tractable if there exists a polynomial-time algorithm to transform a data set D into D′ such that Q(D′) can be computed in polylog-time (O((log n)^k) for some constant k).…”
Section: Query Practicality
confidence: 99%
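To make the Π-tractability definition quoted above concrete, here is a minimal sketch (not taken from the cited papers) of the preprocess-then-query pattern it formalizes: an offline PTIME step transforms D into D′, after which each query over D′ runs in polylog time instead of scanning D. The function names and the choice of a membership query are illustrative assumptions.

```python
from bisect import bisect_left

def preprocess(dataset):
    """Offline PTIME step: transform D into D' (here, a sorted list).

    Sorting costs O(n log n), a one-time cost paid before any
    queries arrive, as the Pi-tractability definition allows.
    """
    return sorted(dataset)

def answer_membership_query(d_prime, value):
    """Online step: evaluate Q(D') in polylog time.

    Binary search over the preprocessed data answers a membership
    query in O(log n), i.e., within the O((log n)^k) bound with k = 1.
    """
    i = bisect_left(d_prime, value)
    return i < len(d_prime) and d_prime[i] == value

# Illustrative usage with hypothetical data:
d_prime = preprocess([42, 7, 19, 3, 88])
print(answer_membership_query(d_prime, 19))  # True
print(answer_membership_query(d_prime, 20))  # False
```

The same pattern generalizes to richer preprocessing (indexes, materialized views, synopses); the point of the definition is that only the online, per-query cost must stay polylogarithmic in the size of the data.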
“…To this end, we propose a notion of BD-tractable queries [8], to help us determine what queries are tractable or feasible on big data.…”
Section: Querying Big Data
confidence: 99%
“…The revisions are defined in terms of computational costs [8], communication (coordination) rounds [34][35], or MapReduce steps [31] and data shipments [36] in the MapReduce framework [37]. Our notions of BD-tractability focus on computational costs [8]. The study is still preliminary, and a number of questions remain open.…”
Section: Open Issues
confidence: 99%