2011 First International Conference on Data Compression, Communications and Processing
DOI: 10.1109/ccp.2011.40

An Online Algorithm for Lightweight Grammar-Based Compression

Abstract: Grammar-based compression is a well-studied technique that constructs a context-free grammar (CFG) deriving a given text uniquely. In this work, we propose an online algorithm for grammar-based compression. Our algorithm guarantees an O(log^2 n) approximation ratio for the minimum grammar size, where n is the input size, and it runs in time linear in the input size and space linear in the output size. In addition, we propose a practical encoding, which transforms a restricted CFG into a more compact representation. Experimental results by c…
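
To make the idea of "a CFG deriving a given text uniquely" concrete, here is a minimal sketch in Python. It is not the online algorithm proposed in the paper: it uses a naive offline pair-replacement loop in the style of Re-Pair, and the names `build_grammar`, `expand`, and `derive` are hypothetical helpers chosen for illustration. Each rule has the form X -> YZ, and expanding the start sequence reproduces exactly one string, the original text.

```python
from collections import Counter


def build_grammar(text):
    """Naive offline pair replacement (Re-Pair-style).

    Illustrative assumption only: this is NOT the paper's online algorithm.
    Returns (start_sequence, rules), where rules maps a nonterminal
    to the pair of symbols it replaces.
    """
    seq = list(text)      # terminals are single characters
    rules = {}            # nonterminal -> (left symbol, right symbol)
    next_id = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, freq = pairs.most_common(1)[0]
        if freq < 2:      # no adjacent pair repeats, so stop
            break
        name = ("N", next_id)   # tuples cannot collide with characters
        next_id += 1
        rules[name] = pair
        # Replace non-overlapping occurrences of the pair, left to right.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(symbol, rules):
    """Expand one symbol into the string it derives."""
    if symbol not in rules:
        return symbol     # terminal character
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)


def derive(seq, rules):
    """Expand the start sequence back into the original text."""
    return "".join(expand(s, rules) for s in seq)


if __name__ == "__main__":
    text = "abababab"
    seq, rules = build_grammar(text)
    assert derive(seq, rules) == text
    # Grammar size: start-sequence length plus two symbols per rule.
    print(len(seq) + 2 * len(rules), "symbols for", len(text), "characters")
```

The sketch rescans the whole sequence on every round, so it is neither online nor linear-time; it only shows what a grammar that derives a single text looks like and how its size can be measured.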

Citations: cited by 13 publications (14 citation statements)
References: 33 publications (35 reference statements)
“…Thus, with more work, our method will make it possible to detect plagiarism from large document collections and interesting patterns from biological sequences using a novel technique in [17]. An important future work is to improve our method to handle the stream data, which is partially developed by an online compression algorithm for grammar-based compression in [18] we recently proposed.…”
Section: Discussion (mentioning)
confidence: 99%
“…These methods exploit repetitions in the text to derive good grammar rules, so they are particularly suitable for texts containing many identical substrings. Finding the smallest grammar for a given text is NP-hard [21], but there exist several grammar-based compressors that achieve O(log N) approximation factors or less [71,73,59,47], where N is the text length. We use Re-Pair [52] as our grammar compressor.…”
Section: Data Compression and Coding (mentioning)
confidence: 99%
“…An important line of research in this regard are the online grammar construction algorithms [Maruyama et al 2012; Takabatake et al 2017].…”
Section: Construction and Dynamism (mentioning)
confidence: 99%