2017
DOI: 10.1007/978-3-319-67428-5_26

Linear-Size CDAWG: New Repetition-Aware Indexing and Grammar Compression

Abstract: In this paper, we propose a novel approach to combine compact directed acyclic word graphs (CDAWGs) and grammar-based compression. This leads us to an efficient self-index, called Linear-size CDAWGs (L-CDAWGs), which can be represented with O(ẽ_T log n) bits of space allowing for O(log n)-time random and O(1)-time sequential accesses to edge labels, and O(m log σ + occ)-time pattern matching. Here, ẽ_T is the number of all extensions of maximal repeats in T, n and m are respectively the lengths of the text T and…

Cited by 18 publications (29 citation statements)
References: 20 publications
“…By using O(r log log_w(σ + n/r)) space, we obtain optimal locate time in the general setting, O(m + occ), as well as optimal counting time, O(m). This had been obtained before only with space bounds O(e) [7] or O(e) [111]. 4.…”
mentioning
confidence: 70%
“…4. Self-indexes with efficient extraction require Ω(z log(n/z)) space [105,21,43,10,15], Ω(g) space [17,14], or Ω(e) space [111,7]. 5.…”
Section: Index
mentioning
confidence: 99%
“…20] O(r + n/s) O(( + s)( log σ log log r + (log log n)^2 )) This paper (Thm. 2) O(r log(n/r)) O(log(n/r) + log(σ)/w) Takagi et al. [94, Thm. 9] O(e) O(log n + ) Belazzougui O(r log(n/r)) O(log(n/r) + log(σ)/w) Access SA, ISA, LCP (Thm.…”
Section: Index Space
mentioning
confidence: 99%
“…to edges in the suffix tree of T. This parallels the grammar implicit in [6] and explicit in [21], whose nonterminals correspond to unary paths in the suffix trie of T, i.e. to edges in the suffix tree of T.…”
Section: CDAWG
mentioning
confidence: 70%
“…We achieve this by dropping the run-length-encoded representation of the Burrows-Wheeler transform of T, used in [2], and by exploiting the fact that the reversed CDAWG induces a context-free grammar that produces T and only T, as described in [1]. A related grammar, already implicit in [6], has been concurrently exploited in [21] to achieve similar bounds to ours. Note that in some strings, for example in the family T_i for i ≥ 0, where T_0 = 0 and T_i = T_{i−1} i T_{i−1}, the length of the string grows exponentially in the size of the CDAWG, thus shaving an O(log log n) term is identical to shaving an O(log e_T) term.…”
Section: Introduction
mentioning
confidence: 99%
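
The string family in the last citation statement can be made concrete with a short sketch. The function names `build_family` and `family_lengths` are hypothetical and not taken from any cited paper; symbols are represented as integers so that the new letter i stays a single symbol for i > 9. The code only illustrates how quickly the length grows with i, assuming the recurrence T_0 = 0 and T_i = T_{i−1} i T_{i−1}; it does not construct a CDAWG.

```python
def build_family(i):
    """Return T_i as a list of integer symbols.

    Assumed recurrence (from the quoted statement):
    T_0 = [0] and T_i = T_{i-1} + [i] + T_{i-1}.
    """
    t = [0]
    for k in range(1, i + 1):
        # Each step copies the previous string twice around the new symbol k.
        t = t + [k] + t
    return t


def family_lengths(max_i):
    """Return the lengths |T_0|, ..., |T_max_i|."""
    return [len(build_family(i)) for i in range(max_i + 1)]


if __name__ == "__main__":
    print(build_family(2))
    print(family_lengths(5))
```

Each step adds one new symbol but doubles the length, so |T_i| = 2^(i+1) − 1 while the alphabet has only i + 1 symbols.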