2013
DOI: 10.1007/978-3-642-39206-1_14

Tree Compression with Top Trees

Abstract: We introduce a new compression scheme for labeled trees based on top trees. Our compression scheme is the first to simultaneously take advantage of internal repeats in the tree (as opposed to the classical DAG compression that only exploits rooted subtree repeats) while also supporting fast navigational queries directly on the compressed representation. We show that the new compression scheme achieves close to optimal worst-case compression, can compress exponentially better than DAG compression, is never much…

Cited by 22 publications
(75 citation statements)
References 25 publications
(39 reference statements)
“…A top-tree compression [4] of a tree T is a DAG compression of T's top-tree T. Bille et al. [4] showed how to construct a data structure whose size is linear in the size of the DAG of T and supports navigational queries on T in time linear in the depth of T.…”
Section: Our Results and Techniques
mentioning
confidence: 99%
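The statement above builds on DAG compression, which stores each distinct rooted subtree only once. The following is a minimal sketch of that idea alone, not of the top-tree construction from the cited paper; the names `Node`, `tree_size`, and `dag_compress` are illustrative assumptions. Under the definition quoted above, a top-tree compression would apply the same sharing step to the top tree of T rather than to T itself.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node of an ordered, labeled tree (hypothetical helper type)."""
    label: str
    children: list = field(default_factory=list)


def tree_size(root):
    """Count the nodes of the uncompressed tree."""
    return 1 + sum(tree_size(child) for child in root.children)


def dag_compress(root):
    """Share identical rooted subtrees.

    Returns (root_id, table), where table maps a signature
    (label, tuple of child ids) to a unique DAG node id. Two subtrees
    receive the same id exactly when they are identical.
    """
    table = {}

    def visit(node):
        # Children are numbered first, so a signature fully describes the subtree.
        signature = (node.label, tuple(visit(child) for child in node.children))
        if signature not in table:
            table[signature] = len(table)
        return table[signature]

    return visit(root), table


# Example tree: a(b(c, c), b(c, c)) has 7 nodes but only 3 distinct subtrees.
def make_b():
    return Node("b", [Node("c"), Node("c")])


example = Node("a", [make_b(), make_b()])
root_id, dag = dag_compress(example)
```

In this example the seven-node tree shrinks to three DAG nodes. The sketch is recursive, so a very deep tree would need an iterative traversal instead.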
“…The random access problem is to preprocess a data set into a compressed representation that supports fast retrieval of any part of the data without decompressing the entire data set. The random access problem is a well-studied problem for many types of data and compression schemes [1,3,5,8,9,19,31,35,41,48,53], and random access queries are a basic primitive in several algorithms and data structures on compressed data, see e.g., [7,9,23,24,25]. In this paper, we consider the random access problem on collections of strings where each string is the result of an edit operation, i.e., inserting, deleting, or replacing a single character, from another string in the collection. Specifically, our collection is given by a rooted tree, called a version tree, where edges are labeled by an edit operation and a node represents the string obtained by applying the sequence of edit operations on the path from the root to the node (see Figure 1(a)).…”
Section: Introduction
mentioning
confidence: 99%
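The version tree described in the statement above can be illustrated with a naive baseline. This is a sketch under stated assumptions, not the data structure of the citing paper: the root is assumed to represent the empty string, edits are encoded as hypothetical tuples `("insert", pos, ch)`, `("delete", pos)`, and `("replace", pos, ch)`, and the tree is stored as `parent` and `edit` dictionaries keyed by node id. The function `access` materializes the whole string before indexing it, which is exactly the cost a compressed random access structure aims to avoid.

```python
def apply_edit(s, op):
    """Apply one single-character edit operation to the string s."""
    kind, pos, *rest = op
    if kind == "insert":
        return s[:pos] + rest[0] + s[pos:]
    if kind == "delete":
        return s[:pos] + s[pos + 1:]
    if kind == "replace":
        return s[:pos] + rest[0] + s[pos + 1:]
    raise ValueError(f"unknown edit operation: {kind}")


def string_at(node, parent, edit):
    """Rebuild the string of a node by replaying the edits on its root path."""
    path = []
    while node in parent:          # the root has no parent entry
        path.append(edit[node])    # edit labeling the edge (parent[node], node)
        node = parent[node]
    s = ""                         # assumption: the root is the empty string
    for op in reversed(path):      # replay from the root downwards
        s = apply_edit(s, op)
    return s


def access(node, i, parent, edit):
    """Naive random access: character i of the string represented by node."""
    return string_at(node, parent, edit)[i]


# A small version tree: 0 -> 1 -> 2, with 3 and 4 both children of 2.
parent = {1: 0, 2: 1, 3: 2, 4: 2}
edit = {
    1: ("insert", 0, "a"),   # ""   -> "a"
    2: ("insert", 1, "b"),   # "a"  -> "ab"
    3: ("replace", 0, "c"),  # "ab" -> "cb"
    4: ("delete", 0),        # "ab" -> "b"
}
```

Here nodes 3 and 4 share the path through node 2, so their strings differ only in the final edit.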
“…[4,4], (v_1, v_5) in [8,12], and (v_5, v_6) in [9,11]. (c) The segment selection instance corresponding to (a).…”
mentioning
confidence: 99%
“…A tree is one of the most important and popular structures in computing, used to represent the relations between nodes. Therefore, there has been considerable research on succinct representation of trees while allowing various operations on these trees to be efficiently performed, see (Chen and Reif, 1996), (Bille et al., 2013), (Jacobson, 1989) and (Benoit et al., 1999). The eXtensible Markup Language, XML (XML, 2013), is one of the most popular data formats for the serialization of tree data structures and for the storage of relational data.…”
Section: Introduction
mentioning
confidence: 99%