In this paper, we examine a number of SQL and so-called "NoSQL" data stores designed to scale simple OLTP-style application loads over many servers. Originally motivated by Web 2.0 applications, these systems are designed to scale to thousands or millions of users performing updates as well as reads, in contrast to traditional DBMSs and data warehouses. We contrast the new systems on their data model, consistency mechanisms, storage mechanisms, durability guarantees, availability, query support, and other dimensions. These systems typically sacrifice some of these dimensions, e.g., database-wide transaction consistency, in order to achieve others, e.g., higher availability and scalability.
Performance is a major issue in the acceptance of object-oriented and relational database systems aimed at engineering applications such as computer-aided software engineering (CASE) and computer-aided design (CAD). Because traditional database system benchmarks are inappropriate for measuring performance of operations on engineering objects, we designed a new benchmark, Object Operations version 1 (OO1), to focus on important characteristics of these applications. OO1 is descended from an earlier benchmark for simple database operations and is based on several years' experience with that benchmark. In this paper we describe the OO1 benchmark and the results we obtained running it on a variety of database systems. We provide a careful specification of the benchmark, show how it can be implemented on database systems, and present evidence that more than an order of magnitude difference in performance can result from a DBMS implementation quite different from current products: minimizing overhead per database call, offloading database server functionality to workstations, taking advantage of large main memories, and using link-based methods.
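The "link-based methods" contrasted above with per-call lookups can be illustrated with a small sketch: parts hold direct references to the parts they connect to, so a depth-limited traversal (one of the OO1-style operations) follows pointers rather than issuing a database query per hop. The `Part` class, its fields, and `traverse` are invented names for illustration, not part of the benchmark specification.

```python
# Sketch of link-based traversal: objects reference each other directly,
# so traversal cost is pointer-chasing, not per-call database overhead.

class Part:
    def __init__(self, part_id):
        self.part_id = part_id
        self.connections = []  # direct object references, i.e. "links"

def traverse(part, depth):
    """Count parts reachable within `depth` hops by following links."""
    if depth == 0:
        return 1
    return 1 + sum(traverse(p, depth - 1) for p in part.connections)

# Tiny fan-out-2 assembly: the root connects to two parts, each of
# which connects to two more.
root = Part(0)
for i in (1, 2):
    child = Part(i)
    child.connections = [Part(10 * i + j) for j in (1, 2)]
    root.connections.append(child)

print(traverse(root, 2))  # 1 + 2 + 4 = 7
```

The point of the sketch is the cost model: each hop is an in-memory dereference, which is why large main memories and low per-call overhead pay off so heavily in engineering workloads.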
Communications of the ACM, June 2011, Vol. 54, No. 6, contributed articles.

The relational model of data was proposed in 1970 by Ted Codd [5] as the best solution for the DBMS problems of the day: business data processing. Early relational systems included System R [2] and Ingres [9], and almost all commercial relational DBMS (RDBMS) implementations today trace their roots to these two systems. As such, unless you squint, the dominant commercial vendors (Oracle, IBM, and Microsoft) as well as the major open source systems (MySQL and PostgreSQL) all look about the same today; we term these systems general-purpose traditional row stores, or GPTRS, sharing the following features:

- Disk-oriented storage;
- Tables stored row-by-row on disk, hence, a row store;
- B-trees as the indexing mechanism;
- Dynamic locking as the concurrency-control mechanism;
- A write-ahead log, or WAL, for crash recovery;
- SQL as the access language; and
- A "row-oriented" query optimizer and executor, pioneered in System R [7].

The 1970s and 1980s were characterized by a single major DBMS market, business data processing, today called online transaction processing, or OLTP. Since then, DBMSs have come to be used in a variety of new markets, including data warehouses, scientific databases, social-networking sites, and gaming sites; the modern-day DBMS market is characterized in the figure here. The figure includes two axes: horizontal, indicating whether an application is read-focused or write-focused, and vertical, indicating whether an application performs simple operations (read or write a few items) or complex operations (read or write thousands of items); for example, the traditional OLTP market is write-focused with simple operations, while the data warehouse market is read-focused with complex operations.

Key insights:
- Many scalable SQL and NoSQL datastores have been introduced over the past five years, designed for Web 2.0 and other applications that exceed the capacity of single-server RDBMSs.
- Major differences characterize these new datastores as to their consistency guarantees, per-server performance, scalability for read versus write loads, automatic recovery from failure of a server, programming convenience, and administrative simplicity.
- Applications must be designed for scalability: partitioning application data into "shards," avoiding operations that span partitions, designing for parallelism, and weighing requirements for consistency guarantees.
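The sharding advice in the key insights can be made concrete with a minimal sketch of hash-based partitioning. This is a generic illustration, not the design of any datastore discussed above; `Shard`, `shard_for`, and `NUM_SHARDS` are invented names, and each `Shard` stands in for a separate server.

```python
# Minimal sketch of hash-based sharding: a stable hash routes every key
# to exactly one shard, so single-key reads and writes never span
# partitions -- the property the key insights tell applications to
# design for.

NUM_SHARDS = 4

class Shard:
    """One partition of the keyspace, standing in for one server."""
    def __init__(self):
        self.data = {}

    def put(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

shards = [Shard() for _ in range(NUM_SHARDS)]

def shard_for(key):
    # The same key always hashes to the same shard within a process.
    return shards[hash(key) % NUM_SHARDS]

shard_for("user:42").put("user:42", {"name": "Ada"})
print(shard_for("user:42").get("user:42"))
```

An operation touching many keys (a join, a multi-row transaction) may land on several shards at once; avoiding or minimizing such cross-partition operations is exactly the application-design burden the key insights describe.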
Work with compiler compilers has dealt principally with automatic generation of parsers and lexical analyzers. Until recently, little work has been done on formalizing and generating the back end of a compiler, particularly an optimizing compiler. This paper describes formalizations of machines and code generators and describes a scheme for the automatic derivation of code generators from machine descriptions. It was possible to separate all machine dependence from the code generation algorithms for a wide range of typical architectures (IBM-360, PDP-11, PDP-10, Intel 8080) while retaining good code quality. Heuristic search methods from work in artificial intelligence were found to be both fast and general enough for use in generation of code generators with the machine representation proposed. A scheme is proposed to perform as much analysis as possible at code generator generation time, resulting in a fast pattern-matching code generator. The algorithms and representations were implemented to test their practicality in use.
A user interface to a database designed for casual, interactive use is presented. The interface is entity-based: the data display to the user is based upon entities (e.g., persons, documents, organizations) that participate in relationships, rather than upon relations alone as in the relational data model. Examples from an implementation of the system are shown for a prototype personal database (PDB), developed in connection with the ZOG system at Carnegie-Mellon University (Robertson et al. [1977]). Some details of the interface and associated issues concerning data display, data models, views, and knowledge-based assistance are presented. Experience with the prototype system suggests that the entity-based presentation is appropriate for types of casual interactive use that existing database interfaces do not address, such as browsing. It is proposed that such an interface could be used to supplement a query language or other interface to allow users both kinds of views of the data. This work was performed in part while the author was employed at Carnegie-Mellon University and was sponsored by the Office of Naval Research under contract N00014-76-0874. This paper was written at the author's present address at Xerox PARC. Further work based on these ideas is now in progress; comments are solicited.