2016
DOI: 10.1016/j.is.2015.08.008

Practical compressed string dictionaries

Abstract: The need to store and query a set of strings (a string dictionary) arises in many kinds of applications. While classically these string dictionaries have accounted for a small share of the total space budget (e.g., in Natural Language Processing or when indexing text collections), recent applications in Web engines, Semantic Web (RDF) graphs, Bioinformatics, and many others handle very large string dictionaries, whose size is a significant fraction of the whole data. In these cases, string dictionary managemen…

Citations: cited by 45 publications (46 citation statements)
References: 74 publications
“…12,[45][46][47][48] With respect to DA dictionaries, BASE and CHECK use many bits since they are pointer-based arrays. Therefore, we will investigate efficient compression methods for BASE and CHECK in dynamic DAs.…”
Section: Discussion (mentioning)
confidence: 99%
“…Both datasets have been obtained from the Webgraph framework [3]. The UK dataset has been used in previous work as a baseline for URL compression [1,7,14]. The Arabic dataset is included for better confirmation of the performance of each solution in different Web graphs.…”
Section: Methods (mentioning)
confidence: 99%
“…• PFC, RPFC and RPHTFC are some of the differential encoding techniques based on Front-coding [14] described in Section 2. PFC is the plain solution, RPFC uses Re-Pair to compress buckets.…”
Section: Methods (mentioning)
confidence: 99%
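For readers unfamiliar with the Front-coding idea named in the statement above, the following is a minimal sketch of plain bucketed front coding over a sorted string list. It is an illustration only, not the cited authors' PFC implementation: the function names (`encode`, `decode_bucket`, `locate`), the `bucket_size` parameter, and the tuple-based layout are assumptions made here, and the Re-Pair step used by RPFC is not shown.

```python
from bisect import bisect_right


def lcp(a, b):
    """Length of the longest common prefix of two strings."""
    n = min(len(a), len(b))
    i = 0
    while i < n and a[i] == b[i]:
        i += 1
    return i


def encode(sorted_strings, bucket_size=4):
    """Split a sorted list into buckets (hypothetical layout).

    The first string of each bucket is stored in full; every later string
    is stored as (shared prefix length with its predecessor, remaining suffix).
    """
    buckets = []
    for start in range(0, len(sorted_strings), bucket_size):
        chunk = sorted_strings[start:start + bucket_size]
        header, prev, tail = chunk[0], chunk[0], []
        for s in chunk[1:]:
            k = lcp(prev, s)
            tail.append((k, s[k:]))
            prev = s
        buckets.append((header, tail))
    return buckets


def decode_bucket(bucket):
    """Rebuild every string of one bucket by replaying the prefix lengths."""
    header, tail = bucket
    out, prev = [header], header
    for k, suffix in tail:
        prev = prev[:k] + suffix
        out.append(prev)
    return out


def locate(buckets, bucket_size, query):
    """Return the position of query in the original sorted list, or -1."""
    headers = [h for h, _ in buckets]
    b = bisect_right(headers, query) - 1  # binary search over bucket headers
    if b < 0:
        return -1
    for offset, s in enumerate(decode_bucket(buckets[b])):  # sequential scan
        if s == query:
            return b * bucket_size + offset
    return -1


urls = sorted(["http://a.com", "http://a.com/x", "http://a.org",
               "http://b.com", "http://b.com/y"])
buckets = encode(urls, bucket_size=2)
```

Larger buckets store fewer full strings and so save space, at the cost of a longer sequential scan inside each bucket during lookup.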
“…This significantly reduces the number of comparisons to be made on binary searches, at the price of a small space overhead. These indexes are implemented using compressed string dictionaries [44].…”
Section: A2 SLP-based Self Indexes (mentioning)
confidence: 99%