This report examines the relative advantages of a storage model based on decomposition (of community view relations into binary relations containing a surrogate and one attribute) over conventional n-ary storage models
This paper examines the problem of data placement in Bubba, a highly-parallel system for data-intensive applications being developed at MCC. “Highly-parallel” implies that load balancing is a critical performance issue. “Data-intensive” means data is so large that operations should be executed where the data resides. As a result, data placement becomes a critical performance issue. In general, determining the optimal placement of data across processing nodes for performance is a difficult problem. We describe our heuristic approach to solving the data placement problem in Bubba. We then present experimental results using a specific workload to provide insight into the problem. Several researchers have argued the benefits of declustering (i e, spreading each base relation over many nodes). We show that as declustering is increased, load balancing continues to improve. However, for transactions involving complex joins, further declustering reduces throughput because of communications, startup and termination overhead. We argue that data placement, especially declustering, in a highly-parallel system must be considered early in the design, so that mechanisms can be included for supporting variable declustering, for minimizing the most significant overheads associated with large-scale declustering, and for gathering the required statistics.
Identity is that property of an object which distinguishes each object from all others. Identity has been investigated almost independently in general-purpose programming languages and database languages. Its importance is growing as these two environments evolve and merge.We describe a continuum between weak and strong support of identity, and argue for the incorporation of the strong notion of identity at the conceptual level in languages for general purpose programming, database systems and their hybrids. We define a data model that can directly describe complex objects, and show that identity can easily be incorporated in it.Finally, we compare d~erent implementation schemes for identity and argue that a surrogate-based implementation scheme is needed to support the strong notion of identity.
Abstracf-Since 1984, the goal of the Bubba project at MCC has been to design a scalable, high-performance and highly available database system that will provide significant costlperformance advantages over conventional mainframes in the 1990's. The design process has been an iterative one, cycling through design, modeling, and prototyping in progressive detail. The current Bubba prototype runs on a commercial 40-node multicomputer and includes a parallelizing compiler, distributed transaction management, object management, and a customized version of UNIX. This paper describes the current prototype and discusses of the major design decisions that went into its construction. The lessons learned from this prototype and its predecessors are presented.Index Terms-Complex object management, database operating system, database programming language, database system performance, database system prototype, dataflow execution, parallel database system.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.