Abstract. We describe a tool for physical database design based on a combination of theoretical and pragmatic approaches. The tool takes as input a relational schema, the workload defined on the schema, and some additional database characteristics and produces as output a physical schema. For the time being, the tool is tuned towards Ingres.
IntroductionThe design of databases takes place on several levels. One of these levels is the physical level. Typical subproblems on this level are, among others, selection of storage structures, secondary index selection, vertical fragmentation, materialization, etc. Solving these subproblems requires a sophisticated understanding of physical design options and query optimization strategies of the optimizer, and involve estimating query costs, which is a tedious and error-prone process when done manually. Moreover, several subproblems are NP-complete, such as the selection of an optimal set of secondary indices. Research in this area has been shifted to the problem of determining a good physical design instead of an optimal design [2,5]. A physical design is considered as good if a competent human database designer would produce the same or a worse design with the same available information. We present a tool, called TOPYDE, that takes as input, among others, a relational schema, the workload defined on the schema, and other database characteristics, such as page size, cardinality of a relation, etc., and produces for each relation a storage structure (including an ordering attribute or clustering index) and a set of secondary indices. This is called a physical schema. An overall physical schema is obtained by the union of the physical schema of each relation involved in the relational schema. For the time being, ordering attributes and indices concern single attributes, and a secondary index is stored as a Btree.Although TOPYDE does not cover the overall problem of physical design, it covers the most crucial parts. Moreover, TOPYDE can be easily extended with vertical fragmentation and materialization. We agree with Navathe et al. [7] that vertical partitioning precedes the selection of a physical schema. In [7], vertical fragmentation algorithms are presented that partition a relation into a set of fragments. Such algorithms can serve as a preprocessor for TOPYDE. In practice, materialization is often done as last; this means after the selection of a physical schema. So, TOPYDE can be extended by a postprocessor that aims to improve the physical schema selected by it.