Modern in-memory database systems need to efficiently support mixed OLTP and OLAP workloads. A conventional approach to this requirement is to rely on ETL-style, application-driven data replication between two very different OLTP and OLAP systems, sacrificing real-time reporting on operational data. An alternative approach is to run OLTP and OLAP workloads on a single machine, which ultimately limits the scalability of OLAP query performance. To tackle this challenging problem, we propose a novel database replication architecture called Asynchronous Parallel Table Replication (ATR). ATR supports OLTP workloads on one primary machine and heavy OLAP workloads on replicas. Here, row-store formats can be used for OLTP transactions at the primary, while column-store formats are used for OLAP analytical queries at the replicas. ATR is designed to support elastic scalability of OLAP query performance while minimizing both the overhead for transaction processing at the primary and the CPU consumption for replayed transactions at the replicas. ATR employs a novel optimistic lock-free parallel log replay scheme that exploits characteristics of multi-version concurrency control (MVCC) to enable real-time reporting by minimizing the propagation delay between the primary and replicas. Through extensive experiments with a concrete implementation available in a commercial database system, we demonstrate that ATR achieves sub-second visibility delay even for update-intensive workloads, providing scalable OLAP performance without notable overhead to the primary.
With a gigabyte of memory priced at less than $2,000, the main-memory DBMS (MMDBMS) is emerging as an economically viable alternative to the disk-resident DBMS (DRDBMS) in many problem domains. The MMDBMS can show significantly higher performance than the DRDBMS by reducing disk accesses to the sequential form of log writing and the occasional checkpointing. Upon a system crash, the recovery process begins by accessing the disk-resident log and checkpoint data to restore a consistent state. With increasing CPU speed, however, such disk access is still the dominant bottleneck in the MMDBMS. To overcome this bottleneck, this paper explores alternatives of parallel logging and recovery. The major contribution of this paper is the so-called differential logging scheme that permits unrestricted parallelism in logging and recovery. Using the bit-wise XOR operation both to compute the differential log between the before and after images and to recover the consistent database state, this scheme offers room for significant performance improvement in the MMDBMS. First, with logging done on the difference, the log volume is reduced to almost half compared with conventional physical logging. Second, the commutativity and associativity of XOR enable processing of log records in an arbitrary order. This means that we can freely distribute log records to multiple disks to improve the logging performance. During recovery, we can do parallel restart independently for each log disk. This paper shows the superior performance of differential logging compared with physical logging in a shared-memory multiprocessor environment.
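The XOR-based idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`xor_bytes`, `differential_log`, `replay`), the 4-byte images, and the two-"disk" split are all assumptions made for illustration. The sketch only shows the two properties the abstract relies on: a differential record is a single XOR image rather than a before/after pair, and replaying the records in any order yields the same final state.

```python
from functools import reduce
from itertools import permutations


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bit-wise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("images must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def differential_log(before: bytes, after: bytes) -> bytes:
    """Hypothetical helper: one record holding before XOR after.

    A physical log record would keep both images (2 * n bytes);
    this keeps a single n-byte difference.
    """
    return xor_bytes(before, after)


def replay(image: bytes, log_records) -> bytes:
    """Apply differential records to an image by XOR-ing them in, in the given order."""
    return reduce(xor_bytes, log_records, image)


# Four successive versions of one 4-byte slot (illustrative values only).
versions = [
    bytes([0x00, 0x00, 0x00, 0x00]),
    bytes([0x12, 0x34, 0x00, 0x00]),
    bytes([0x12, 0x34, 0x56, 0x78]),
    bytes([0xFF, 0x34, 0x56, 0x00]),
]
records = [differential_log(b, a) for b, a in zip(versions, versions[1:])]

# XOR is commutative and associative, so every replay order gives the same result.
all_orders_agree = all(
    replay(versions[0], order) == versions[-1] for order in permutations(records)
)

# Assumed setup: records are spread over two log "disks", each disk is replayed
# independently onto a zero image, and the partial results are merged with XOR.
disk_a, disk_b = records[0::2], records[1::2]
partial_a = replay(bytes(4), disk_a)
partial_b = replay(bytes(4), disk_b)
recovered = xor_bytes(xor_bytes(versions[0], partial_a), partial_b)

print(all_orders_agree, recovered == versions[-1])
```

Because XOR is its own inverse, applying the same records to the final image returns the initial image, so in this sketch the same record can serve for both redo and undo.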