S. Bing Yao scite author profile

Efficient locking for concurrent operations on B-trees

The B-tree and its variants have been found to be highly useful (both theoretically and in practice) for storing large amounts of information, especially on secondary storage devices. We examine the problem of overcoming the inherent difficulty of concurrent operations on such structures, using a practical storage model. A single additional "link" pointer in each node allows a process to easily recover from tree modifications performed by other concurrent processes. Our solution compares favorably with earlier solutions in that the locking scheme is simpler (no read-locks are used) and only a (small) constant number of nodes are locked by any update process at any given time. An informal correctness proof for our system is given,
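As a rough illustration of the "link" pointer idea described in this abstract, the following Python sketch shows a search that follows right-sibling links when it lands on a node whose key range no longer covers the search key (for example, because another process split the node after the parent pointer was read). The `Node` layout and the names `high_key`, `link`, `move_right`, and `search` are assumptions made for this sketch, not the paper's notation, and the code is single-threaded: it shows only the recovery step, not the locking protocol.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf
    high_key: Optional[int] = None  # largest key this node may hold; None means unbounded
    link: Optional["Node"] = None   # right sibling on the same level (hypothetical field name)


def move_right(node: Node, key: int) -> Node:
    """Follow link pointers while the key lies beyond this node's range."""
    while node.high_key is not None and key > node.high_key and node.link is not None:
        node = node.link
    return node


def search(root: Node, key: int) -> bool:
    """Descend from the root, correcting sideways via links at every level."""
    node = move_right(root, key)
    while node.children:
        # Child i covers keys <= keys[i]; the last child covers the rest.
        idx = bisect_left(node.keys, key)
        node = move_right(node.children[idx], key)
    return key in node.keys


# A leaf was split into left/right, but the parent still points only at `left`.
right = Node(keys=[3, 4])
left = Node(keys=[1, 2], high_key=2, link=right)
stale_root = Node(keys=[], children=[left])
print(search(stale_root, 3))  # found via the link despite the stale parent
```

In this sketch the reader never needs to hold a lock on the parent while descending, because a stale child pointer is corrected by walking right along the links.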


Approximating block accesses in database organizations

Yao

1977

Commun. ACM

316


When data records are grouped into blocks in secondary storage, it is frequently necessary to estimate the number of blocks $X_D$ accessed for a given query. In a recent paper [1], Cardenas gave the expression

$$X_D = m\left[1 - \left(1 - \frac{1}{m}\right)^{k}\right] \tag{1}$$

assuming that there are $n$ records divided into $m$ blocks and that the $k$ records satisfying the query are distributed uniformly among the $m$ blocks. The derivation of the expression was left to the reader as an exercise.

Let us take a closer look at the expression. $(1 - 1/m)$ gives the probability that a particular block does not contain a particular record. If $k$ records are selected independently, then the probability that a particular block is not "hit" is given by $(1 - 1/m)^k$. Hence $1 - (1 - 1/m)^k$ gives the probability that a particular block is "hit," and the expression follows.

The assumption that the $k$ records are selected independently implies selection with replacement. Since a record may be selected more than once, the $k$ records may not be distinct. This is not valid in the case of a query access which retrieves all $k$ distinct records at one time. In fact, Rothnie and Lozano showed that the result of eq. (1) gives the lower bound of the expected number of blocks accessed [2]. A more accurate analysis based on selection without replacement was given by Severance, but the precision problem makes the expression obtained computationally intractable (Appendix D in [3]). A similar approach by Siler results in a rather complicated recursive formula which can be computed (Appendix B in [4]). Another recursive formula was given by [3]. Using a different approach, a simple closed form was obtained by Yao in a different context [5]. The resulting expression was used in several applications [5,6,7] to estimate the expected number of data blocks accessed. Comparing this to the Cardenas approximation, it is shown that this refinement is significant when the blocking factor $n/m$ is small. For large blocking factors (e.g. $n/m > 10$), the error involved in Cardenas' approximation is practically negligible.

THEOREM (Yao). Given $n$ records grouped into $m$ blocks ($1 < m \le n$), each containing $n/m$ records. If $k$ records ($k \le n - n/m$) are randomly selected from the $n$ records, the expected number of blocks hit (blocks with at least one record selected) is given by

$$m\left[1 - \prod_{i=1}^{k} \frac{n - n/m - i + 1}{n - i + 1}\right].$$
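To make the comparison between the two estimates concrete, here is a minimal Python sketch that evaluates both. It assumes Cardenas' estimate is $m[1 - (1 - 1/m)^k]$ and Yao's is $m[1 - \prod_{i=1}^{k} (n - n/m - i + 1)/(n - i + 1)]$, with $m$ dividing $n$ evenly. The function names `cardenas_blocks` and `yao_blocks` are made up for this illustration.

```python
from fractions import Fraction


def cardenas_blocks(m: int, k: int) -> float:
    """Expected blocks hit when k records are drawn WITH replacement."""
    return m * (1.0 - (1.0 - 1.0 / m) ** k)


def yao_blocks(n: int, m: int, k: int) -> float:
    """Expected blocks hit when k distinct records are drawn WITHOUT replacement."""
    if n % m != 0:
        raise ValueError("this sketch assumes m divides n evenly")
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    per_block = n // m
    # Probability that one particular block contains none of the k records.
    miss = Fraction(1)
    for i in range(1, k + 1):
        miss *= Fraction(n - per_block - i + 1, n - i + 1)
    return float(m * (1 - miss))


# Small blocking factor (n/m = 2): the two estimates differ noticeably.
print(cardenas_blocks(m=2, k=2), yao_blocks(n=4, m=2, k=2))
# Larger blocking factor (n/m = 20): the two estimates are close.
print(cardenas_blocks(m=50, k=30), yao_blocks(n=1000, m=50, k=30))
```

Exact rational arithmetic (`Fraction`) is used for the product so that the result does not depend on floating-point rounding; for very large $k$ a running floating-point product would be faster.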


Optimization Algorithms for Distributed Queries

Apers

Hevner

Yao

1983

IEEE Trans. Software Eng.

196


Query Processing in Distributed Database System

Hevner

Yao

1979

IEEE Trans. Software Eng.

176


An attribute based model for database access cost analysis

Yao

1977

ACM Trans. Database Syst.


A generalized model for physical database organizations is presented. Existing database organizations are shown to fit easily into the model as special cases. Generalized access algorithms and cost equations associated with the model are developed and analyzed. The model provides a general design framework in which the distinguishing properties of database organizations are made explicit and their performances can be compared.


