We study the randomized version of a computation model (introduced in [9, 10]) that restricts both the number of random accesses to external memory and the internal memory space. Essentially, this model can be viewed as a powerful version of a data stream model that puts no cost on sequential scans of external memory (like other models for data streams) and, in addition (like other external memory models, but unlike streaming models), admits several large external memory devices that can be read and written to in parallel. We obtain tight lower bounds for the decision problems set equality, multiset equality, and checksort. More precisely, we show that any randomized Monte Carlo algorithm with one-sided bounded error for these problems must perform Ω(log N) random accesses to external memory devices, provided that the internal memory size is at most O(N^(1/4) / log N), where N denotes the size of the input data. From the lower bound on the set equality problem we can infer lower bounds on the worst-case data complexity of query evaluation for the languages XQuery, XPath, and relational algebra on streaming data. More precisely, we show that there exist queries in XQuery, XPath, and relational algebra such that any (randomized) Las Vegas algorithm that evaluates these queries must perform Ω(log N) random accesses to external memory devices, provided that the internal memory size is at most O(N^(1/4) / log N).
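To make the three decision problems named in this abstract concrete, here is a minimal in-memory sketch in Python. It only illustrates what each problem asks; it ignores the model's restrictions on internal memory and random access entirely. The function names are hypothetical, and the reading of checksort (is the second sequence the sorted version of the first?) is an assumption based on the usual statement of the problem, not something the abstract spells out.

```python
from collections import Counter


def set_equality(xs, ys):
    """True if both sequences contain the same strings, ignoring multiplicity."""
    return set(xs) == set(ys)


def multiset_equality(xs, ys):
    """True if both sequences contain the same strings with the same multiplicities."""
    return Counter(xs) == Counter(ys)


def check_sort(xs, ys):
    """True if ys is the sorted version of xs (assumed reading of checksort)."""
    return list(ys) == sorted(xs)


if __name__ == "__main__":
    first = ["b", "a", "b"]
    second = ["a", "b"]
    print(set_equality(first, second))
    print(multiset_equality(first, second))
    print(check_sort(first, ["a", "b", "b"]))
```

All three reference checks hold the whole input in memory, which is exactly what the lower bounds say cannot be avoided cheaply: with small internal memory, an algorithm needs Ω(log N) random accesses.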
Data exchange deals with the following problem: given an instance over a source schema, a specification of the relationship between the source and the target, and dependencies on the target, construct an instance over a target schema that satisfies the given relationships and dependencies. Recently, for data exchange settings without target dependencies, Libkin (PODS'06) introduced a new concept of solutions based on the closed world assumption (so-called CWA-solutions), and showed that, in some respects, this new notion behaves better than the standard notion of solutions considered in previous papers on data exchange. The present paper extends Libkin's notion of CWA-solutions to data exchange settings with target dependencies. We show that, when restricting attention to data exchange settings with weakly acyclic target dependencies, this new notion behaves similarly as before: the core is the unique "minimal" CWA-solution, and computing CWA-solutions as well as certain answers to positive queries is possible in polynomial time and can be PTIME-hard. However, there may be more than one "maximal" CWA-solution. Going beyond the class of positive queries, we obtain that there are conjunctive queries with (just) one inequality for which evaluating the certain answers is co-NP-hard. Finally, we consider the Existence-of-CWA-Solutions problem: while the problem is tractable for data exchange settings with weakly acyclic target dependencies, it turns out to be undecidable for general data exchange settings. As a consequence, we obtain that the Existence-of-Universal-Solutions problem is also undecidable in general.
We consider a scenario where we want to query a large dataset that is stored in external memory and does not fit into main memory. The most constrained resources in such a situation are the size of the main memory and the number of random accesses to external memory. We note that sequentially streaming data from external memory through main memory is much less prohibitive. We propose an abstract model of this scenario in which we restrict the size of the main memory and the number of random accesses to external memory, but admit arbitrary sequential access. A distinguishing feature of our model is that it allows the usage of unlimited external memory for storing intermediate results, such as several hard disks that can be accessed in parallel. In this model, we prove lower bounds for the problem of sorting a sequence of strings (or numbers), the problem of deciding whether two given sets of strings are equal, and two closely related decision problems. Intuitively, our results say that there is no algorithm for these problems that uses internal memory space bounded by N^(1−ε) and at most o(log N) random accesses to external memory, but unlimited "streaming access", both for writing to and reading from external memory. (Here, N denotes the size of the input and ε is an arbitrary constant greater than 0.) We even permit randomized algorithms with one-sided bounded error. We also consider the problem of evaluating database queries and prove similar lower bounds for evaluating relational algebra queries against relational databases and XQuery and XPath queries against XML databases. This article combines and refines Grohe and Schweikardt [2005], presented at the 24th ACM
Data exchange deals with translating data structured in some source format into data structured in some target format, given a specification of the relationship between the source and the target and possibly constraints on the target, and with answering queries over the target in a way that is semantically consistent with the information in the source. Theoretical foundations of data exchange have been actively explored recently. It was also noticed that the standard semantics for query answering in data exchange may lead to counterintuitive or anomalous answers. In the present article, we explain that this behavior is due to the fact that solutions can contain invented information (information that is not related to the source instance), and that the presence of incomplete information in target instances has been ignored. In particular, proper query evaluation techniques for databases with nulls have not been used, and the distinction between closed and open world semantics has not been made. We present a concept of solutions, called CWA-solutions, that is based on the closed world assumption. For data exchange settings without constraints on the target, the space of CWA-solutions has two extreme points: the canonical universal solution (the maximal CWA-solution) and the core of the universal solutions (the minimal CWA-solution), both of them well studied in data exchange. In the presence of constraints on the target, the core of the universal solutions is still the minimal CWA-solution, but there may be no unique maximal CWA-solution. We show how to define the semantics of query answering taking into account incomplete information, and show that some of the well-known anomalies go away with the new semantics. The article also contains results on the complexity of query answering, upper approximations to queries (maybe-answers), and various extensions.
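As a rough illustration of how a canonical universal solution introduces nulls, the following Python sketch chases one hypothetical source-to-target dependency, E(x, y) → ∃z F(x, z) ∧ F(z, y), over a small source relation. The dependency, the relation names E and F, and the `_N1`, `_N2`, ... naming of nulls are assumptions made for this example only; they do not come from the article, and the sketch covers only the setting without target constraints.

```python
from itertools import count


def canonical_solution(source_edges):
    """Chase the hypothetical dependency E(x, y) -> exists z: F(x, z) and F(z, y).

    Every source tuple triggers the dependency once and receives its own
    fresh null for the existential variable z.
    """
    fresh = count(1)
    target = set()
    # Sorting only makes the numbering of nulls reproducible.
    for x, y in sorted(source_edges):
        z = f"_N{next(fresh)}"  # labeled null: a placeholder, not a source value
        target.add((x, z))
        target.add((z, y))
    return target


if __name__ == "__main__":
    for fact in sorted(canonical_solution({("a", "b"), ("b", "c")})):
        print("F", fact)
```

Each null in the result is justified by exactly one source tuple, which matches the intuition behind the closed world assumption: the target contains nothing beyond what the source and the dependency force.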
We extend first-order logic with counting by a new operator that allows it to formalise a limited form of recursion which can be evaluated in logarithmic space. The resulting logic LREC has a data complexity in LOGSPACE, and it defines LOGSPACE-complete problems like deterministic reachability and Boolean formula evaluation. We prove that LREC is strictly more expressive than deterministic transitive closure logic with counting and incomparable in expressive power with symmetric transitive closure logic STC and transitive closure logic (with or without counting). LREC is strictly contained in fixed-point logic with counting FP+C. We also study an extension LREC= of LREC that has nicer closure properties and is more expressive than both LREC and STC, but is still contained in FP+C and has a data complexity in LOGSPACE. Our main results are that LREC captures LOGSPACE on the class of directed trees and that LREC= captures LOGSPACE on the class of interval graphs.
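Deterministic reachability, mentioned above as a LOGSPACE-complete problem, can be sketched in a few lines of Python. The sketch assumes the usual definition: t is deterministically reachable from s if there is a path from s to t on which every vertex before t has exactly one outgoing edge. The function name and the dictionary representation of the graph are hypothetical choices for this example.

```python
def deterministic_reach(succ, s, t):
    """Decide deterministic reachability from s to t.

    succ maps every vertex to the list of its successors. The walk keeps only
    the current vertex and a step counter, mirroring the logarithmic-space
    idea (Python itself gives no space guarantee).
    """
    current = s
    # A simple path visits each vertex at most once, so len(succ) steps suffice.
    for _ in range(len(succ)):
        if current == t:
            return True
        out = succ.get(current, [])
        if len(out) != 1:
            return False  # dead end or branching: no deterministic step exists
        current = out[0]
    return current == t


if __name__ == "__main__":
    graph = {1: [2], 2: [3], 3: [], 4: [1, 2]}
    print(deterministic_reach(graph, 1, 3))
    print(deterministic_reach(graph, 4, 3))
```

The step bound also guarantees termination on cycles, since a walk that has not reached t after as many steps as there are vertices must be looping.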