Processing multi-join query in parallel systems

Tan, Kian-Lee; Lü, Hongjun

doi:10.1145/143559.143653

Cited by 6 publications

(4 citation statements)

References 12 publications

(14 reference statements)

“…Moreover, most of the research done [31], [41], [44], [47] on efficient computation of join in distributed databases has been restricted to equi-joins, join of two tables, minimizing computation time, static relational tables, and/or complete or regular topology.…”

Section: Problem Description Motivation and Related Work

mentioning

confidence: 99%

Join of Multiple Data Streams in Sensor Networks

Zhu

Gupta

Tang

2009

IEEE Trans. Knowl. Data Eng.

Sensor networks are multihop wireless networks of resource-constrained sensor nodes used to realize high-level collaborative sensing tasks. To query or access data generated by the sensor nodes, the sensor network can be viewed as a distributed database. In this paper, we develop algorithms for communication-efficient implementation of join of multiple (two or more) data streams in a sensor network. The distributed implementation of join in sensor networks is particularly challenging due to unique characteristics of sensor networks such as limited memory and battery energy on individual nodes, arbitrary and dynamic network topology, multihop communication, and unreliable infrastructure. One of our proposed approaches, viz., the Perpendicular Approach (PA), is load balanced and, in fact, incurs near-optimal communication cost for the special case of binary joins in grid networks under the assumption of uniform generation of tuples across the network. We compare the performance of our designed approaches through extensive simulations on the ns2 simulator, and show that PA substantially prolongs the network lifetime compared to other approaches, especially for joins involving spatial constraints.

Index Terms: Distributed query processing, sensor networks.

INTRODUCTION

Sensor networks are multihop wireless networks formed by a large number of resource-constrained sensor nodes. Each sensor node typically generates a stream of data items that are readings obtained from the sensing devices on the node. This motivates modeling the data in a sensor network as relational data streams, and visualizing sensor networks as distributed database systems [5], [17]. More recently, a recursive deductive approach has been suggested as a framework for programming sensor networks [9]. Like a database, the sensor network can be queried, and efficient in-network (distributed) implementation of database queries is of great importance.

Join is an important database operator and, as shown in our concurrent work [23], can form the basis of a deductive query engine for sensor networks. In particular, the join operator can be used to represent complex events in sensor networks [1], [29]. Thus, efficient implementation of join in sensor networks is of great significance; the challenge comes from limited network resources.

Motivated by the above, we develop efficient distributed implementations for join of multiple data streams in sensor networks. Since each sensor node has limited battery energy and message communication is the main consumer of energy, distributed implementation of join must minimize the communication cost. In particular, we are interested in in-network implementation strategies since routing all sensor data to a central server would incur prohibitive communication costs. In addition, load-balanced implementation strategies are highly desirable, because unbalanced strategies are likely to result in a much shorter network lifetime. Design of communication-efficient and load-balanced in-network implementations of join in...

“…One of the pioneering results on this issue which was related to sequential processing was reported in [7]. Since many applications require access to multiple relations which involve many join operations, the parallelism in such multi-join queries has the potential to improve DBMS performance. DBMS offers several sources for extracting parallelism: (i) intra-operator parallelism - within a join operation where the join operations are processed one at a time in parallel [17]. In [18] a solution for that problem was tailored for hypercube topology; (ii) inter-operator parallelism - where several operations are processed simultaneously [5,19].…”

mentioning

confidence: 99%

“…DBMS offers several sources for extracting parallelism: (i) intra-operator parallelism - within a join operation where the join operations are processed one at a time in parallel [17]. In [18] a solution for that problem was tailored for hypercube topology; (ii) inter-operator parallelism - where several operations are processed simultaneously [5,19].…”

mentioning

confidence: 99%

Scheduling tasks of multi‐join queries in a multiprocessor

Averbuch

Roditty

Shoham

1999

Concurrency: Pract. Exper.

This paper deals with the problem of scheduling spawned tasks when a query is issued to a database which resides on a MIMD multiprocessor. These tasks have the property that their associated dependency scheme can be presented as a directed tree. We present a theoretical framework with extensive experimental simulations which increase the throughput of database applications. We derive a family of algorithms for scheduling tasks. Their performance is tested on several common multiprocessor configurations. For better performance the adaptation of the scheduling algorithm to the multiprocessor configuration is examined and analyzed. The scheduling algorithms are divided into two cases: (a) permitted changes in the resources connection scheme of the multiprocessor, and (b) no changes allowed. The algorithms are scalable and their complexity is computed. In particular, we present an algorithm for scheduling tasks in the case where the construction of a central storage location is permitted. One of the main tools for the construction of the above algorithms is the notion of (t, 1)‐domination and k‐domination sets. Copyright © 1999 John Wiley & Sons, Ltd.

“…It also minimizes the communication cost since we do not have to migrate the tuples of the operand relations in order to concentrate the concurrent join operations to different disjoint sets of the PNs. A similar technique was independently proposed in [21] for the shared-disk architecture.…”

Section: Select-relation-pairs

mentioning

confidence: 99%

Including the load balancing issue in the optimization of multi-way join queries for shared-nothing database computers

Srivastava

Elsesser

1993

Proceedings of the Second International Conference on Parallel and Distributed Information Systems

A consensus on parallel architecture for very large database management has emerged. This architecture is based on a shared-nothing hardware organization. This computation model, however, is very sensitive to the skewness in the tuple distribution. Recently, several parallel join algorithms with dynamic load balancing capabilities have been proposed to address this issue. However, none of these algorithms considers the multi-way join problem. In this paper, we propose a dynamic load balancing technique for multi-way joins, and investigate the effect of load balancing on query optimization. © 1993 IEEE
