The Haar+ Tree: A Refined Synopsis Data Structure

Karras, Panagiotis; Mamoulis, Nikos

doi:10.1109/icde.2007.367889

Cited by 22 publications

(72 citation statements)

References 29 publications

(49 reference statements)

“…The synopsis construction model based on the Haar+ tree [26]. This model supersedes previous wavelet-based techniques [9,13].…”

mentioning

confidence: 91%

“…For example, if we occupy node c4 in Figure 1, then it is allowed to occupy any of its descendant nodes, as well as nodes that either fully contain, or are disjoint from, range R4, i.e., nodes c0, c1, c2, c28 and c35. The approximation of a data value di represented by an LH is constructed as the value of the lowest occupied node affecting di, by means of an interval tree; hence, data reconstruction requires O(log B) time (as for other summarization techniques [32,16,26,24]). An optimal LH synopsis of D in space B should achieve the minimum error * achievable in B space for the employed error metric.…”

Section: The Lattice Histogram

mentioning

confidence: 99%
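
The reconstruction rule quoted above (a data value is approximated by the lowest occupied node affecting it) can be illustrated with a minimal sketch. The `(start, end, value)` node representation and the `approximate` function are hypothetical names chosen for illustration, not taken from the cited paper. The sketch assumes that occupied nodes covering the same position are nested, as the quoted occupancy rule implies, so the narrowest covering node is the lowest one. It uses a linear scan for clarity; the interval tree mentioned in the quote is what would bring the lookup down to O(log B).

```python
from typing import List, Optional, Tuple

# Hypothetical representation: an occupied node is (start, end, value)
# and covers data positions start..end inclusive.
Node = Tuple[int, int, float]


def approximate(i: int, occupied: List[Node]) -> Optional[float]:
    """Return the value of the narrowest occupied node covering position i.

    Assumes occupied nodes that cover the same position are nested, so the
    narrowest covering node is also the lowest one in the hierarchy.
    Returns None when no occupied node covers i.
    """
    best: Optional[Node] = None
    for node in occupied:
        start, end, _ = node
        if start <= i <= end:
            if best is None or (end - start) < (best[1] - best[0]):
                best = node
    return None if best is None else best[2]


# Three nested occupied nodes over eight data positions.
occupied = [(0, 7, 3.0), (0, 3, 1.5), (2, 3, 0.5)]
approx = [approximate(i, occupied) for i in range(8)]
```

Here positions 2 and 3 take their value from the narrowest node, positions 0 and 1 from the middle node, and positions 4 to 7 from the root.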

“…such a representation arises oftentimes in applications such as distributed stream monitoring [39], approximate query answering [2,36,22,5], query optimization [32], OLAP/DSS systems [41], time-series indexing [6], and data mining [31].…”

Section: Introduction

mentioning

confidence: 99%

“…The former, histogram-based techniques [18,21,38,37,23,22,36,11,16,14,40,12], summarize the data by dividing it into consecutive intervals, or buckets; typically, a bucket is assigned a single representative value that approximates the data therein; variations to this theme aim to optimize the data representation within a bucket [30,4,42]. The latter methods utilize a predefined hierarchical tree structure such as that defined by the Haar wavelet decomposition [32,41,5,8,9,25,35,12,13] or alternatives that follow a similar pattern [39,26,24].…”

Section: Introduction

mentioning

confidence: 99%
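
For readers unfamiliar with the hierarchical structure the statement above refers to, the following is a minimal sketch of the classic unnormalized Haar wavelet decomposition (pairwise averages and half-differences). It illustrates the plain Haar tree only, not the Haar+ tree of the indexed paper; the function names are assumptions made for this example, and the input length is assumed to be a power of two.

```python
from typing import List


def haar_decompose(data: List[float]) -> List[float]:
    """Unnormalized Haar transform: overall average, then details coarse to fine."""
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    averages = list(data)
    details: List[float] = []
    while len(averages) > 1:
        pairs = list(zip(averages[0::2], averages[1::2]))
        # Prepend this level's details so that coarser levels come first.
        details = [(a - b) / 2 for a, b in pairs] + details
        averages = [(a + b) / 2 for a, b in pairs]
    return averages + details


def haar_reconstruct(coeffs: List[float]) -> List[float]:
    """Invert haar_decompose: each detail splits an average into two children."""
    values = coeffs[:1]
    pos = 1
    while pos < len(coeffs):
        level = coeffs[pos:pos + len(values)]
        values = [v for avg, d in zip(values, level) for v in (avg + d, avg - d)]
        pos += len(level)
    return values


data = [2, 2, 0, 2, 3, 5, 4, 4]
coeffs = haar_decompose(data)
restored = haar_reconstruct(coeffs)
```

A wavelet synopsis keeps only a small number of these coefficients and treats the rest as zero; which coefficients to keep, and for which error metric, is the problem the cited works address.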

Optimality and scalability in lattice histogram construction

Karras

2009

Proc. VLDB Endow.

Self Cite

The Lattice Histogram is a recently proposed data summarization technique that achieves approximation quality superior to that of an optimal plain histogram. Like other hierarchical synopsis methods, a lattice histogram (LH) aims to approximate data using a hierarchical structure. However, this structure is not defined a priori; it is an unknown of the problem, not a given. Past work has defined the properties that an LH must obey and developed general-purpose approximation algorithms for its construction. Two major issues remain unaddressed. First, the construction of an optimal LH for a given error metric is a problem unsolved to date. Second, the proposed algorithms have space and time complexities so high that applying them in real-world settings is problematic. In this paper, we address both issues, focusing on the case where the target error metric is a maximum-error metric. Our algorithms treat both the error-bounded LH construction problem, in which the space occupied by an LH is minimized under an error constraint, and the classic space-bounded problem. First, we develop a dynamic-programming scheme that finds an optimal LH under a given maximum-error bound. Second, we propose an efficient, practical greedy algorithm that solves the same problem with much lower time and space requirements. Then, we show how both algorithms can be applied to the classic space-bounded problem, which minimizes error under a bound on space. Our experimental study with real-world data sets shows the effectiveness of our methods compared to competing summarization techniques. Moreover, our findings show that our greedy heuristic performs almost as well as the optimal solution in terms of accuracy.
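
The abstract does not say how an error-bounded algorithm is turned into a space-bounded one. One common way to do this, shown here as an assumption rather than as the paper's method, is to search over candidate error bounds for the smallest one whose error-bounded solution fits the space budget. The sketch below uses a plain (non-lattice) histogram under maximum absolute error as a stand-in for the LH, since each bucket then has a simple closed-form error; `min_buckets` and `min_error` are hypothetical names.

```python
from typing import List


def min_buckets(data: List[float], eps: float) -> int:
    """Error-bounded problem: fewest consecutive buckets with max error <= eps.

    A bucket represented by its midrange has error (max - min) / 2, so a
    bucket is extended greedily while its value range stays within 2 * eps.
    """
    count = 0
    lo = hi = 0.0
    for x in data:
        if count and max(hi, x) - min(lo, x) <= 2 * eps:
            lo, hi = min(lo, x), max(hi, x)
        else:
            count += 1
            lo = hi = x
    return count


def min_error(data: List[float], budget: int) -> float:
    """Space-bounded problem: smallest max error using at most `budget` buckets."""
    if budget < 1:
        raise ValueError("budget must be at least 1")
    if not data:
        return 0.0
    # The optimal error is half the difference of some pair of data values.
    # Enumerating all pairs is quadratic; acceptable only for a small sketch.
    candidates = sorted({abs(a - b) / 2 for a in data for b in data})
    left, right = 0, len(candidates) - 1
    while left < right:
        mid = (left + right) // 2
        if min_buckets(data, candidates[mid]) <= budget:
            right = mid
        else:
            left = mid + 1
    return candidates[left]


data = [2, 2, 0, 2, 3, 5, 4, 4]
best = min_error(data, budget=2)
```

The binary search is valid because the number of buckets needed never increases as the error bound grows. Whether the same monotonicity argument carries over to the paper's dynamic-programming and greedy LH algorithms is not stated in the abstract.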

“…The former is presented in [52]. Following the same approach used in the experiments of [54], and consistent with the approximate nature of [45,48], we compare our method with [49] (since the chosen metric is the relative error). This obviously works as an (indirect) comparison with [45,48], which present the best solution to the trade-off between feasibility and closeness to the optimal accuracy.…”

Section: Related Work

mentioning

confidence: 99%