Distributed data processing is becoming a reality. Businesses want to do it for many reasons, and they often must do it in order to stay competitive. While much of the infrastructure for distributed data processing is already in place (e.g., modern network technology), a number of issues make distributed data processing still a complex undertaking: (1) distributed systems can become very large, involving thousands of heterogeneous sites including PCs and mainframe server machines; (2) the state of a distributed system changes rapidly because the load of sites varies over time and new sites are added to the system; (3) legacy systems need to be integrated; such legacy systems usually have not been designed for distributed data processing and now need to interact with other (modern) systems in a distributed environment. This paper presents the state of the art of query processing for distributed database and information systems. The paper presents the "textbook" architecture for distributed query processing and a series of techniques that are particularly useful for distributed database systems. These techniques include special join techniques, techniques to exploit intraquery parallelism, techniques to reduce communication costs, and techniques to exploit caching and replication of data. Furthermore, the paper discusses different kinds of distributed systems such as client-server, middleware (multitier), and heterogeneous database systems, and shows how query processing works in these systems.
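To make one of the surveyed communication-cost techniques concrete, here is a minimal Python sketch of a semijoin-based distributed join, a classic textbook technique of the kind the paper covers. The function name, site labels, and sample tuples are invented for illustration and are not taken from the paper.

```python
def semijoin_ship(r_tuples, s_tuples, r_key, s_key):
    """Join R (held at site A) with S (held at site B) while shipping less data.

    Instead of sending all of S to site A, site A first sends only the
    join-column values of R to site B; site B returns just the S tuples
    that can possibly match (the semijoin of S with R); the final join
    then runs locally at site A.
    """
    # Step 1 (site A -> site B): project R onto its join column.
    r_keys = {t[r_key] for t in r_tuples}          # typically a small message

    # Step 2 (at site B): keep only the S tuples that can join.
    s_reduced = [t for t in s_tuples if t[s_key] in r_keys]

    # Step 3 (site B -> site A): ship the reduced S, then join locally at A.
    index = {}
    for t in s_reduced:
        index.setdefault(t[s_key], []).append(t)
    return [(r, s) for r in r_tuples for s in index.get(r[r_key], [])]

R = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
S = [{"rid": 1, "order": 10}, {"rid": 3, "order": 30}]
print(semijoin_ship(R, S, "id", "rid"))  # only the matching pair (id=1, rid=1)
```

The communication saving comes from steps 1 and 3: only join-column values and already-matching tuples cross the network, instead of the full relation S.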
In this paper, we study a simple SQL extension that enables query writers to explicitly limit the cardinality of a query result. We examine its impact on the query optimization and run-time execution components of a relational DBMS, presenting two approaches (a Conservative approach and an Aggressive approach) to exploiting cardinality limits in relational query plans. Results obtained from an empirical study conducted using DB2 demonstrate the benefits of the SQL extension and illustrate the tradeoffs between our two approaches to implementing it.
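As a rough illustration of the two approaches (the actual plan operators and placement rules in the paper are more involved), here is a Python sketch of an iterator-based query plan with an invented stop_after operator. The Conservative placement applies the limit above a filtering operator and is always safe; the Aggressive placement pushes the limit below the filter with an allowance, doing less work but risking a restart if too few rows survive. All operator names and data are hypothetical.

```python
import itertools

def scan(rows):
    yield from rows

def stop_after(child, n):
    # The cardinality-limit operator: truncate the child's output at n rows.
    yield from itertools.islice(child, n)

def selective_filter(child, pred):
    yield from (r for r in child if pred(r))

rows = range(1_000_000)
pred = lambda r: r % 2 == 0

# Conservative: limit above the filter; correct in all cases, but the
# scan and filter may consume far more input than needed.
conservative = stop_after(selective_filter(scan(rows), pred), 5)

# Aggressive: limit first, with an allowance for rows the filter will drop;
# cheaper when the allowance suffices, but must restart with a larger
# allowance if fewer than 5 rows survive the filter.
aggressive = selective_filter(stop_after(scan(rows), 10), pred)

print(list(conservative))  # [0, 2, 4, 6, 8]
print(list(aggressive))    # [0, 2, 4, 6, 8]
```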
Cloud storage solutions promise high scalability and low cost. Existing solutions, however, differ in the degree of consistency they provide. Our experience using such systems indicates that there is a non-trivial trade-off between cost, consistency and availability. High consistency implies high cost per transaction and, in some situations, reduced availability. Low consistency is cheaper but it might result in higher operational cost because of, e.g., overselling of products in a Web shop.
In this paper, we present a new transaction paradigm that not only allows designers to define consistency guarantees on the data instead of at the transaction level, but also allows the system to switch consistency guarantees automatically at runtime. We present a number of techniques that let the system dynamically adapt the consistency level by monitoring the data and/or gathering temporal statistics of the data. We demonstrate the feasibility and potential of the ideas through extensive experiments on a first prototype implemented on Amazon's S3 and running the TPC-W benchmark. Our experiments indicate that the adaptive strategies presented in the paper result in a significant reduction in response time and costs, including the cost penalties of inconsistencies.
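As a hypothetical sketch of the adaptive idea, the following Python snippet chooses a consistency level per data item from a monitored conflict rate: pay for strong guarantees only where conflicts are likely enough that inconsistency penalties would exceed the per-transaction cost. The class name, threshold, and statistics model are invented; the paper's actual policies are more elaborate.

```python
import random

class AdaptiveConsistencyRouter:
    """Attach a consistency level to data items and adapt it at runtime."""

    def __init__(self, conflict_threshold=0.01):
        self.threshold = conflict_threshold
        self.stats = {}  # key -> (accesses seen, conflicts seen)

    def record_access(self, key, conflicted):
        # Gather runtime statistics about conflicting updates per item.
        u, c = self.stats.get(key, (0, 0))
        self.stats[key] = (u + 1, c + (1 if conflicted else 0))

    def consistency_level(self, key):
        u, c = self.stats.get(key, (0, 0))
        conflict_prob = c / u if u else 0.0
        # Switch to the expensive strong protocol only for items whose
        # observed conflict probability exceeds the threshold.
        return "strong" if conflict_prob > self.threshold else "eventual"

router = AdaptiveConsistencyRouter()
for _ in range(1000):
    router.record_access("hot_item", conflicted=random.random() < 0.05)
router.record_access("cold_item", conflicted=False)
print(router.consistency_level("hot_item"))   # likely "strong"
print(router.consistency_level("cold_item"))  # "eventual"
```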