Decoding billions of integers per second through vectorization

Lemire, Daniel; Boytsov, Leonid

doi:10.1002/spe.2203

Cited by 214 publications

(184 citation statements)

References 51 publications

(141 reference statements)

Supporting

Mentioning

183

Contrasting

“…Yan et al [9] analysed the compression ratio and decompression speed of many schemes while dealing with the compression of doc ids and term frequencies extracted from the GOV2 corpus using the AOL query log dataset. A similar work has been done by Lemire et al [10], studying the codecs behaviour for the ClueWeb09 corpus, but focusing just on document ids. While dealing with smaller datasets, Delbru et al [11] provided a comparison of codecs integrating these into an actual search engine with positional information.…”

Section: Introduction (mentioning)

confidence: 90%

“…In the Rice codec [22], b is a power of two, which means that bitwise operators can be exploited, permitting more efficient implementations at the cost of a small increase in the size of the compressed data. Nevertheless, Golomb and Rice coding are well-known for their decompression inefficiency [7,10,23], and hence, we omit experiments using these codecs from our work. Simple family: This family of codecs, firstly described in [23], stores as many integers as possible in a single word.…”

Section: List-adaptive Codecs (mentioning)

confidence: 99%
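The statement above notes that Rice coding restricts the Golomb parameter b to a power of two so that bitwise operators can replace division. A minimal Python sketch of that idea follows. The function names, the string-of-bits output, and the unary convention (a run of 1s closed by a 0) are assumptions made for illustration; they are not taken from the cited papers.

```python
def rice_encode(values, k):
    """Encode non-negative integers with Rice parameter b = 2**k.

    Returns a string of '0'/'1' characters (illustrative, not compact).
    """
    pieces = []
    mask = (1 << k) - 1
    for v in values:
        q = v >> k          # quotient by 2**k, computed with a shift
        r = v & mask        # remainder, computed with a mask
        pieces.append("1" * q + "0")  # unary quotient, closed by a 0
        if k:
            pieces.append(format(r, "0{}b".format(k)))  # k-bit remainder
    return "".join(pieces)


def rice_decode(bitstring, k, count):
    """Decode `count` integers produced by rice_encode with the same k."""
    out = []
    pos = 0
    for _ in range(count):
        q = 0
        while bitstring[pos] == "1":  # scan the unary part one bit at a time
            q += 1
            pos += 1
        pos += 1                      # skip the closing 0
        r = int(bitstring[pos:pos + k], 2) if k else 0
        pos += k
        out.append((q << k) | r)
    return out


print(rice_encode([5], 2))  # "1001": quotient 1 -> "10", remainder 1 -> "01"
```

The decoder has to walk the unary part bit by bit before it knows where the next value starts, which is one plausible reading of why the quoted authors describe these codecs as slow to decompress.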

“…OptPFD [9] works similarly, but chooses b to optimise the compression ratio and the decompression speed. FastPFOR [10], instead, reserves 32 different areas of its output to store exceptions. Each area contains those exceptions that can be encoded using the same number of bits.…”

Section: List-adaptive Codecs (mentioning)

confidence: 99%

“…However, for small postings lists (and the remainder of each posting list) with less than 1024 elements, block sizes can be smaller, but no less than 128 elements [9]. For blocks smaller than 128, (P)FOR codecs have to fall back to other codecs: we use variable byte as per [10]. Table 3.…”

Section: Compression Codecs (mentioning)

confidence: 99%
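The statement above says that blocks shorter than 128 elements fall back to variable byte. As a rough illustration of that fallback codec, here is a Python sketch of one common variable-byte layout: 7 data bits per byte, least significant group first, with the high bit set when more bytes follow. This particular byte convention and the function names are assumptions; the cited works may use a different variant.

```python
def vbyte_encode(values):
    """Encode non-negative integers, 7 data bits per byte."""
    out = bytearray()
    for v in values:
        while v >= 0x80:
            out.append((v & 0x7F) | 0x80)  # high bit set: more bytes follow
            v >>= 7
        out.append(v)                      # final byte has the high bit clear
    return bytes(out)


def vbyte_decode(data):
    """Decode a byte string produced by vbyte_encode."""
    values = []
    current = 0
    shift = 0
    for byte in data:
        current |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7                     # value continues in the next byte
        else:
            values.append(current)         # value complete, reset state
            current = 0
            shift = 0
    return values


short_block = [3, 130, 300, 70000]
assert vbyte_decode(vbyte_encode(short_block)) == short_block
```

Because each value is self-delimiting, this scheme needs no fixed block length, which fits its use on the short tail of a postings list.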

“…Moreover, they did not consider other posting payload information, such as field frequencies and positions, nor did they investigate how codecs behave on different corpora. On the other hand, while Lemire et al [10] did investigate compression codecs on both the GOV2 and ClueWeb09 corpora, they were focused only on docid compression, and did not analyse the impact on search engine efficiency in terms of query response time. Finally, while Delbru et al [11] integrated different compression codecs into an actual IR system with document ids, term frequencies and positional information, their experiments with synthetically generated queries do not necessarily reflect a realistic search engine workload.…”

Section: Introduction (mentioning)

confidence: 99%

On Inverted Index Compression for Search Engine Efficiency

Catena; Macdonald; Ounis

2014, Lecture Notes in Computer Science

Abstract. Efficient access to the inverted index data structure is a key aspect for a search engine to achieve fast response times to users' queries. While the performance of an information retrieval (IR) system can be enhanced through the compression of its posting lists, there is little recent work in the literature that thoroughly compares and analyses the performance of modern integer compression schemes across different types of posting information (document ids, frequencies, positions). In this paper, we experiment with different modern integer compression algorithms, integrating these into a modern IR system. Through comprehensive experiments conducted on two large, widely used document corpora and large query sets, our results show the benefit of compression for different types of posting information to the space- and time-efficiency of the search engine. Overall, we find that the simple Frame of Reference compression scheme results in the best query response times for all types of posting information. Moreover, we observe that the frequency and position posting information in Web corpora that have large volumes of anchor text are more challenging to compress, yet compression is beneficial in reducing average query response times.

Better bitmap performance with Roaring bitmaps

et al. 2015

SUMMARY: Bitmap indexes are commonly used in databases and search engines. By exploiting bit-level parallelism, they can significantly accelerate queries. However, they can use much memory, and thus we might prefer compressed bitmap indexes. Following Oracle's lead, bitmaps are often compressed using run-length encoding (RLE). Building on prior work, we introduce the Roaring compressed bitmap format: it uses packed arrays for compression instead of RLE. We compare it to two high-performance RLE-based bitmap encoding techniques: WAH (Word Aligned Hybrid compression scheme) and Concise (Compressed 'n' Composable Integer Set). On synthetic and real data, we find that Roaring bitmaps (1) often compress significantly better (e.g., 2×) and (2) are faster than the compressed alternatives (up to 900× faster for intersections). Our results challenge the view that RLE-based bitmap compression is best.

SIMD compression and the intersection of sorted integers

Lemire; Boytsov; Kurz

2015, Softw Pract Exp

Sorted lists of integers are commonly used in inverted indexes and database systems. They are often compressed in memory. We can use the single-instruction, multiple data (SIMD) instructions available in common processors to boost the speed of integer compression schemes. Our S4-BP128-D4 scheme uses as little as 0.7 CPU cycles per decoded 32-bit integer while still providing state-of-the-art compression. However, if the subsequent processing of the integers is slow, the effort spent on optimizing decompression speed can be wasted. To show that it does not have to be so, we (1) vectorize and optimize the intersection of posting lists; (2) introduce the SIMD GALLOPING algorithm. We exploit the fact that one SIMD instruction can compare four pairs of 32-bit integers at once. We experiment with two Text REtrieval Conference (TREC) text collections, GOV2 and ClueWeb09 (category B), using logs from the TREC million-query track. We show that using only the SIMD instructions ubiquitous in all modern CPUs, our techniques for conjunctive queries can double the speed of a state-of-the-art approach.
