Abstract-Checkpointing and rollback recovery are well-known techniques for handling failures in distributed database systems. In this paper, we establish the necessary and sufficient conditions for the checkpoints on a set of data items to be part of a transaction-consistent global checkpoint of the distributed database. This can throw light on designing efficient, non-intrusive checkpointing techniques and transparent recovery techniques for distributed database systems.
Phase change memory (PCM) has been considered an attractive alternative to flash memory and DRAM. It has promising features, including non-volatile storage, byte addressability, fast read and write operations, and supports random accesses. However, there are challenges in designing algorithms for PCM-based memory systems, such as longer write latency and higher energy consumption compared to DRAM. In this paper, we propose a new predictive B + -tree index, called the B p -tree, which is tailored for database systems that make use of PCM. Our B p -tree reduces data movements caused by tree node splits and merges that arise from insertions and deletions. This is achieved by pre-allocating space on PCM for near future data. To ensure the space are allocated where they are needed, we propose a novel predictive model to ascertain future data distribution based on the current data. In addition, as in [4], when keys are inserted into a leaf node, they are packed but need not be in sorted order. We have implemented the B p -tree in PostgreSQL and evaluated it in an emulated environment. Our experimental results show that the B p -tree significantly reduces the number of writes, therefore making it write and energy efficient and suitable for a PCM-like hardware environment.
In corporate networks, daily business data are generated in gigabytes or even terabytes. It is costly to process aggregate queries in those systems. In this paper, we propose PACA, a probably approximately correct aggregate query processing scheme, for answering aggregate queries in structured Peer-to-Peer (P2P) network. PACA retrieves random samples from peers' databases and applies the samples to process queries. Instead of scanning the entire database of each peer, PACA only accesses a small random number of data. Moreover, based on the query distribution, PACA publishes a precomputed synopsis and uses the synopsis to answer future queries. Most queries are expected to be answered by the precomputed synopsis partially or fully. And the synopsis is adaptively tuned to follow the query distribution. Experiments on the PlanetLab show the effectiveness of the approach.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.