An Efficient Cache Conscious Multi-dimensional Index Structure

Shim, Jeong Min; Song, Seok Il; Min, Young Soo; Yoo, Jae Soo

doi:10.1007/978-3-540-24768-5_93

Cited by 3 publications

(1 citation statement)

References 7 publications


“…In parallel with this we foresee an increasing research effort directed towards answering multidimensional queries more efficiently with the use of novel indexing schemes that are tailored to queries that are typically expressed in data warehouses (Albrecht et al 2000). At the same time we also see a sustained interest in caching synopses of multi-dimensional queries (Shim et al, 2004, Park and Lee, 2005, Gemulla et al 2007) so that parts of the cache can be re-used across several such queries. We see the work presented in this chapter as complementing the research effort in such query optimization strategies.…”

Section: Future Trends (mentioning)

confidence: 99%

Accelerating Multi Dimensional Queries in Data Warehouses

Pears

2009

Advances in Database Research


(Cunningham, Song and Chen, 2006, Elmasri and Navathe, 2003). Thus a marketing analyst is able to track variation in sales income across dimensions such as time period, location, and product, on their own or in combination with each other. This analysis requires the processing of multi-dimensional aggregates and group-by operations against the underlying data warehouse. Due to the large volumes of data that need to be scanned from secondary storage, such queries, referred to as On Line Analytical Processing (OLAP) queries, can take from minutes to hours in large-scale data warehouses (Elmasri, 2003, Oracle 9i).

The standard technique for improving query performance is to build aggregate tables that are targeted at known queries (Triantafillakis, Kanellis, and Martakos 2004; Elmasri, 2003). For example, the identification of the top ten selling products can be sped up by building a summary table that contains the total sales value (in dollar terms) for each of the products, sorted in decreasing order of sales value. It would then be a simple matter of querying the summary table and retrieving the first ten rows. The main problem with this approach is the lack of flexibility. If the analyst now chooses to identify the bottom ten products, an expensive re-sort would have to be performed to answer this new query. Worse still, if the information is to be tracked by sales location, then the summary table would be of no value at all. This problem is symptomatic of a more general one, where database systems that have been tuned for a particular access pattern perform poorly as such patterns change over a period of time. In their study, Zhen and Darmont (2005) showed that database systems which have been optimized through clustering to suit particular query patterns rapidly degrade in performance when such query patterns change in nature.

The limitations in the above approach can be addressed by a data compression scheme that preserves the original structure of the data. The chapter is organized as follows. In the next section we review related work. The following section introduces the Prime Factor Compression (PFC) approach. We then present the algorithms required for encoding and decoding with the PFC approach. The On Line reconstruction of Queries is discussed thereafter. Implementation-related issues are then discussed, followed by a performance evaluation of PFC and a comparison with the Haar Wavelet algorithm. We then discuss future trends in optimizing multi-dimensional queries in the light of the results of this research. We conclude with a summary of the main achievements of the research.
