2017
DOI: 10.1186/s13015-017-0117-9

Generalized enhanced suffix array construction in external memory

Abstract: Background: Suffix arrays, augmented by additional data structures, allow solving efficiently many string processing problems. The external memory construction of the generalized suffix array for a string collection is a fundamental task when the size of the input collection or the data structure exceeds the available internal memory. Results: In this article we present and analyze [introduced in CPM (External memory generalized suffix and arrays construction. In: Proceedings of CPM. pp 201–10, 2013)], the first …

Cited by 18 publications (23 citation statements)
References 42 publications (38 reference statements)
“…The Burrows-Wheeler Transform (BWT) [36] is a well-known widely used reversible string transformation that can be extended to a collection of strings. Such an extension, known as eBWT or multi-string BWT, is a reversible transformation whose output string (denoted by ebwt(S)) is a permutation of the symbols of all strings in S [21] (see also [30, 31, 34, 47, 48]). The length of ebwt(S) is denoted by N, and ebwt(S)[i] = x, with 1 ≤ i ≤ N, if x circularly precedes the i-th suffix S_j[k, n_j + 1] (for some 1 ≤ j ≤ m and 1 ≤ k ≤ n_j + 1), according to the lexicographic sorting of the suffixes of all strings in S.…”
Section: Methods (mentioning)
confidence: 99%
“…The Burrows-Wheeler Transform (BWT) [36] is a well-known widely used reversible string transformation that can be extended to a collection of strings. Such an extension, known as eBWT or multi-string BWT, is a reversible transformation whose output string (denoted by ebwt(S)) is a permutation of the symbols of all strings in S [21] (see also [30,31,34,47,48]). The length of ebwt(S) is denoted by $N = \sum_{i=1}^{m} n_i + m$, and … The set S will be omitted if it is clear from the context.…”
Section: Preliminaries and Materials (mentioning)
confidence: 99%
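
The eBWT definition quoted above can be illustrated with a small in-memory sketch. This is a naive, quadratic construction written only to make the definition concrete; it is not the external-memory algorithm of the paper or of any cited tool. The function name `naive_ebwt`, the tuple encoding of symbols, and the convention that the end marker of string j sorts before the end marker of string j + 1 (and before every letter) are assumptions made for this example.

```python
def naive_ebwt(strings):
    """Naive eBWT of a string collection (illustrative sketch only)."""
    # Assumed encoding: letters become (1, ch); the end marker of string j
    # becomes (0, j), so markers sort before letters and by string index.
    seqs = [[(1, c) for c in s] + [(0, j)] for j, s in enumerate(strings)]

    # Collect every suffix of every string, remembering where it starts.
    suffixes = [(seq[k:], j, k)
                for j, seq in enumerate(seqs)
                for k in range(len(seq))]
    suffixes.sort(key=lambda t: t[0])

    out = []
    for _, j, k in suffixes:
        # Symbol that circularly precedes the suffix; for k == 0 the index
        # k - 1 == -1 wraps around to the string's own end marker.
        kind, sym = seqs[j][k - 1]
        out.append(sym if kind == 1 else "$")
    return "".join(out)


if __name__ == "__main__":
    print(naive_ebwt(["ab", "a"]))
```

For a collection of m strings with lengths n_1, ..., n_m, the output has length N equal to the sum of the n_i plus m, and it is a permutation of the input symbols plus one end marker per string.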
“…The tool BCR+LCP cannot handle input sequences of different lengths, so for the pacbio dataset we used the tool extLCP [10] by the same authors. As a reference we also tested the external memory tool eGSA [23] that computes the Suffix and LCP arrays for a collection of sequences. However, we tested eGSA only using 32GB of RAM since the authors in [23] showed its running time degrades about 25 times when the RAM is restricted to the input size.…”
Section: Methods (mentioning)
confidence: 99%
“…space economical, algorithms was recognized since the very beginning of the field [8,26,27]. When even lightweight algorithms do not fit in RAM, one has to resort to external memory construction algorithms (see [13,18,19,23] and references therein).…”
Section: Introduction (mentioning)
confidence: 99%
“…The longest common prefix (LCP) array of the collection S [30,18,24] is the array lcp(S) of length N + 1, such that lcp(S)[i], with 2 ≤ i ≤ N , is the length of the longest common prefix between the suffixes associated to the positions i and i − 1 in ebwt(S) and lcp(S)[1] = lcp(S)[N + 1] = −1 set by default. We denote by LCP(i, j) the length of the LCP between the suffixes associated with positions i and j in ebwt(S), i.e.…”
Section: Preliminaries (mentioning)
confidence: 99%
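
The LCP array definition in the last statement can be sketched the same way. The code below is a naive in-memory illustration, not the method of any cited tool. It assumes distinct end markers per string (so markers never match each other) and stores the 1-based array lcp(S)[1..N + 1] in a 0-based Python list of length N + 1; the names `sorted_suffixes`, `common_prefix_len`, and `naive_lcp` are hypothetical.

```python
def sorted_suffixes(strings):
    """All suffixes of the collection in lexicographic order (naive)."""
    # Assumed encoding: letters are (1, ch); the end marker of string j
    # is (0, j), so every suffix is unique and markers sort first.
    seqs = [[(1, c) for c in s] + [(0, j)] for j, s in enumerate(strings)]
    return sorted(seq[k:] for seq in seqs for k in range(len(seq)))


def common_prefix_len(a, b):
    """Number of leading positions at which a and b agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def naive_lcp(strings):
    """LCP array with the sentinel value -1 at both ends."""
    suf = sorted_suffixes(strings)
    n_total = len(suf)                # N in the quoted definition
    lcp = [-1] * (n_total + 1)        # list index i holds lcp(S)[i + 1]
    for i in range(1, n_total):
        lcp[i] = common_prefix_len(suf[i - 1], suf[i])
    return lcp


if __name__ == "__main__":
    print(naive_lcp(["ab", "a"]))
```

Each interior entry compares a suffix with its predecessor in sorted order, which matches the quoted definition for positions 2 ≤ i ≤ N.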