2019
DOI: 10.1007/978-3-030-32686-9_31

Faster Repetition-Aware Compressed Suffix Trees Based on Block Trees

Abstract: Suffix trees are a fundamental data structure in stringology, but their space usage, though linear, is an important problem for their applications. We design and implement a new compressed suffix tree targeted to highly repetitive texts, such as large genomic collections of the same species. Our suffix tree builds on Block Trees, a recent Lempel-Ziv-bounded data structure that captures the repetitiveness of its input. We use Block Trees to compress the topology of the suffix tree, and augment the Block Tree node…

Cited by 4 publications (3 citation statements)
References 37 publications

“…We note that it is unknown if a result like Theorem VI.9 can be obtained on a semigroup. For example, it is not known how to compute the minimum of a substring in polylogarithmic time on block trees [43], [35].…”
Section: B Block Trees (mentioning)
confidence: 99%
“…al. [6], LCSA implementation of Cáceres and Navarro [3], and rlzsa-rand which denotes the best result in [15]. These indexes were selected because r-index is the smallest CSA, rlzsa-rand is the fastest CSA according to experiments [15], and LCSA also uses a differentially encoded SA combined with Re-Pair, instead of RLZ.…”
Section: Performance Comparison (mentioning)
confidence: 99%
“…• We compare RLZ-compressed SAs to other state-of-art indexes for repetitive datasets, including those that attempt to exploit repetitions in SA^d, such as the LCSA [3,9], showing rlzsa to be significantly faster, and usually smaller.…”
Section: Introduction (mentioning)
confidence: 99%