2015
DOI: 10.1016/j.tcs.2015.01.019

Compressed automata for dictionary matching

Abstract: We address a variant of the dictionary matching problem where the dictionary is represented by a straight line program (SLP). For a given SLP-compressed dictionary D of size n and height h representing m patterns of total length N, we present an O(n^2 log N)-size representation of Aho-Corasick automaton which recognizes all occurrences of the patterns in D in amortized O(h + m) running time per character. We also propose an algorithm to construct this compressed Aho-Corasick automaton in O(n^3 log n log N) …
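To make the problem setting concrete, here is a minimal Python sketch of the uncompressed baseline: it expands each SLP variable into its pattern and runs a textbook Aho-Corasick automaton over a text. This is not the paper's compressed construction, which avoids the expansion altogether. The rule encoding (a dict mapping a variable to either a single character or a pair of variables), the function names, and the toy grammar are all assumptions made for illustration.

```python
from collections import deque


def expand(slp, symbol, memo=None):
    """Expand one SLP variable into the string it derives (naive, O(N) space)."""
    if memo is None:
        memo = {}
    if symbol not in memo:
        rule = slp[symbol]
        if isinstance(rule, str):   # terminal rule X -> a
            memo[symbol] = rule
        else:                       # binary rule X -> Y Z
            left, right = rule
            memo[symbol] = expand(slp, left, memo) + expand(slp, right, memo)
    return memo[symbol]


def build_aho_corasick(patterns):
    """Build goto, failure, and output tables of a plain Aho-Corasick automaton."""
    goto, fail, out = [{}], [0], [[]]
    for idx, pat in enumerate(patterns):
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(idx)
    # Breadth-first pass: depth-1 states keep failure link 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]  # inherit matches ending here
            queue.append(nxt)
    return goto, fail, out


def find_occurrences(slp, pattern_vars, text):
    """Return (start position, pattern variable) for every occurrence in text."""
    patterns = [expand(slp, v) for v in pattern_vars]
    goto, fail, out = build_aho_corasick(patterns)
    state, hits = 0, []
    for pos, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for idx in out[state]:
            hits.append((pos - len(patterns[idx]) + 1, pattern_vars[idx]))
    return hits


# Hypothetical toy dictionary: X3 = "ab", X4 = "abab", X5 = "ba".
toy_slp = {
    "X1": "a",
    "X2": "b",
    "X3": ("X1", "X2"),
    "X4": ("X3", "X3"),
    "X5": ("X2", "X1"),
}
hits = find_occurrences(toy_slp, ["X3", "X4", "X5"], "ababa")
```

The baseline needs space proportional to the total pattern length N, which can be exponentially larger than the grammar size n; the abstract's O(n^2 log N)-size representation is aimed at exactly that gap.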

Citations: cited by 7 publications (4 citation statements)
References: 14 publications
“…We estimated the effectiveness of the compression using the size of the generated grammars instead of the length of the output bits. Reducing the grammar size has important implications since the majority of the existing text algorithms applied to grammar-compressed texts, including grammar-based self indexes [21,22], edit distance computation [23], q-gram mining [24,25], and pattern matching [26][27][28], have time/space complexities that are dependent on the input grammar size. For instance, the compressed indexes proposed by Claude and Navarro [21,22] can be directly built on MR-RePair grammar-compressed texts.…”
Section: Discussion
confidence: 99%
“…Our experiments show that MR-RePair constructs smaller grammars compared to RePair. We emphasize that generating a grammar of small size is of great importance since most, if not all, existing algorithms/data structures that work on grammar-compressed texts have running time dependent on the grammar sizes (see e.g., [21][22][23][24][25][26][27][28] and the references therein) and not directly on the encoded sizes.…”
Section: Introduction
confidence: 99%
“…We can obtain a faster algorithm using Theorem 6. We can also solve the grammar compressed dictionary matching problem [17] with our data structures. We preprocess an input dictionary SLP (DSLP) ⟨S, m⟩ with n productions that represent m patterns.…”
Section: Applicationsmentioning
confidence: 99%
“…Hon et al. [51] achieved an entropy compressed space while matching time remains optimal. Tomohiro et al. [52] designed a matching algorithm working on grammar‐based compressed AC automata. However, these studies were accomplished through theoretical discussions, and we are unaware of any actual implementation.…”
Section: Related Work
confidence: 99%