2015
DOI: 10.1016/j.ipl.2015.04.002

Constructing LZ78 tries and position heaps in linear time for large alphabets

Abstract: We present the first worst-case linear-time algorithm to compute the Lempel-Ziv 78 factorization of a given string over an integer alphabet. Our algorithm is based on nearest marked ancestor queries on the suffix tree of the given string. We also show that the same technique can be used to construct the position heap of a set of strings in worst-case linear time, when the set of strings is given as a trie.
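
For orientation, the LZ78 factorization itself can be sketched with a plain dictionary-based trie. This is the textbook approach, not the suffix-tree algorithm the abstract describes: the hash-map children give expected rather than worst-case linear time, which is the gap the paper addresses for integer alphabets. The function name `lz78_factorize` and the list-of-dicts trie layout are assumptions made for this illustration.

```python
def lz78_factorize(text):
    """Split `text` into LZ78 factors using a simple trie (illustrative sketch).

    Each factor is the longest previously seen factor extended by one
    character. The final factor may repeat an earlier one if the text ends
    in the middle of a trie path.
    """
    trie = [{}]        # trie[v] maps a character to the child node id of v; node 0 is the root
    factors = []
    node = 0           # current position in the LZ trie
    start = 0          # start index of the factor being built
    for i, ch in enumerate(text):
        child = trie[node].get(ch)
        if child is not None:
            node = child              # keep extending the current factor
            continue
        trie[node][ch] = len(trie)    # new trie node for the new factor
        trie.append({})
        factors.append(text[start:i + 1])
        node = 0                      # restart from the root
        start = i + 1
    if start < len(text):
        factors.append(text[start:])  # leftover suffix that is already in the trie
    return factors


print(lz78_factorize("aaabaabab"))
```

Concatenating the returned factors always reproduces the input, which makes for a quick sanity check when experimenting with other trie representations.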

Cited by 15 publications (17 citation statements)
References 17 publications
“…Since LZ78 factors are naturally represented in a trie, the so-called LZ trie, improving LZ78 computation can be done, among others, by using sophisticated trie implementations [9,10], or by superimposing the suffix tree with the suffix trie [11,12]. We follow the latter approach.…”
Section: Related Work (mentioning)
confidence: 99%
“…Here, we consider answering substring compression queries with the LZ78 factorization (which is actually also an SLP [44]), to finally apply the aforementioned algorithm of Bannai et al [43]. The fastest LZ78 factorization algorithms [4,45] can answer a LZ78 substring compression query in O(|I|) time alphabet independently. For small alphabet sizes, the running time O(|I| (lg lg |I|)^2 / (log_σ |I| · lg lg lg |I|)) of the LZ78 factorization algorithm of Jansson et al [46] becomes even sub-linear in |I|.…”
Section: Related Substring Compression Query Problems (mentioning)
confidence: 99%
“…To obtain similar bounds for LZ78, we could adapt the approach of Bille et al [36] to preprocess the LZ78 factorization of all suffixes of T, but that would give us a data structure with super-linear preprocessing time (and possibly super-linear space). Here, we borrow the idea from Nakashima et al [45] to superimpose the suffix tree with the LZ78 trie, and use a data structure for answering nearest marked ancestor queries to find the lowest marked suffix tree node on the path from the root to a leaf. This data structure [47] takes O(n lg n) bits of space, and can answer a nearest marked ancestor query in O(1) amortized time.…”
Section: Related Substring Compression Query Problems (mentioning)
confidence: 99%
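
The nearest marked ancestor query mentioned in the statement above can be illustrated with a deliberately naive sketch. It walks parent pointers, so a query costs time proportional to the node depth, unlike the O(1) amortized structure the statement cites. The class name `MarkedTree`, the parent-array representation, and the convention that a marked node counts as its own nearest marked ancestor are all assumptions made for this illustration.

```python
class MarkedTree:
    """Rooted tree with markable nodes and a naive nearest-marked-ancestor query."""

    def __init__(self, parents):
        # parents[v] is the parent of node v; the root has parent -1.
        self.parents = list(parents)
        self.marked = [False] * len(self.parents)

    def mark(self, v):
        """Mark node v (for example, when an LZ78 factor ends at v)."""
        self.marked[v] = True

    def nearest_marked_ancestor(self, v):
        """Return the lowest marked node on the path from v up to the root.

        A marked v is returned itself; -1 means no node on the path is marked.
        """
        while v != -1 and not self.marked[v]:
            v = self.parents[v]
        return v


# Small tree: 0 is the root, 1 and 4 are its children, 2 and 3 are children of 1.
tree = MarkedTree([-1, 0, 1, 1, 0])
tree.mark(0)
print(tree.nearest_marked_ancestor(3))  # only the root is marked so far
tree.mark(1)
print(tree.nearest_marked_ancestor(3))  # node 1 is now the lowest marked ancestor
```

In the superimposed setting described above, marking a suffix tree node would correspond to recording that an LZ78 trie node lies there, and the query from a leaf finds the deepest such node on its root path.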
“…] and in practice [14, 13, e.g. ], only recent interest in LZ78 can be observed: just in 2015 Nakashima et al [26] gave the first (theoretical) linear time algorithm for LZ78. On the practical side, we are not aware of any systematic study.…”
Section: Introduction (mentioning)
confidence: 99%