2018
DOI: 10.1109/access.2018.2816039
An FPGA-Based Hardware Accelerator for Energy-Efficient Bitmap Index Creation

Abstract: The bitmap index is recognized as a promising candidate for online analytical processing systems, because it effectively supports not only parallel processing but also complex, multi-dimensional queries. However, bitmap index creation is a time-consuming task. In this paper, by taking full advantage of the massively parallel computing capability of field-programmable gate arrays (FPGAs), two hardware accelerators for bitmap index creation, namely BIC64K8 and BIC32K16, are proposed. Each of the accelerators contains two pr…
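As background, the sketch below shows what bitmap index creation amounts to in software: one bit vector per distinct key value, with bit r set whenever row r of the column holds that value. The 8-bit key width, row count, and toy column contents are illustrative assumptions; this is not the paper's BIC64K8/BIC32K16 design, which performs the same work in parallel on an FPGA.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define NUM_ROWS 64                      /* rows in the example column   */
    #define NUM_KEYS 256                     /* distinct 8-bit key values    */
    #define WORDS    ((NUM_ROWS + 63) / 64)  /* 64-bit words per bit vector  */

    /* One bit vector per possible key value: bit r of bitmap[v] is set
     * when row r of the indexed column holds the value v.              */
    static uint64_t bitmap[NUM_KEYS][WORDS];

    static void build_bitmap_index(const uint8_t *column, size_t rows)
    {
        memset(bitmap, 0, sizeof bitmap);
        for (size_t r = 0; r < rows; r++)
            bitmap[column[r]][r / 64] |= 1ULL << (r % 64);
    }

    int main(void)
    {
        uint8_t column[NUM_ROWS];
        for (size_t r = 0; r < NUM_ROWS; r++)
            column[r] = (uint8_t)(r % 4);        /* toy data: values 0..3 */

        build_bitmap_index(column, NUM_ROWS);

        /* An equality predicate such as "value == 2" is now a single
         * bit-vector read; multi-dimensional predicates reduce to
         * bitwise AND/OR over the stored vectors.                     */
        printf("bitmap[2] = %016llx\n", (unsigned long long)bitmap[2][0]);
        return 0;
    }

The row-at-a-time loop in build_bitmap_index is the time-consuming step the paper targets: instead of touching one row per iteration, the FPGA accelerators index input words in parallel on each clock cycle.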

Cited by 13 publications (8 citation statements)
References 20 publications
“…This is 37.12% less than the previously implemented system based on software pattern-recognition algorithms [26]. Hardware accelerators containing content-addressable memory, which index 8-bit and 16-bit data in parallel on each clock cycle, were implemented in [27]. This system consumes as little as 6.76% and 3.28% of the energy of CPU- and GPU-based designs, respectively.…”
Section: Introduction
confidence: 93%
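The excerpt above highlights the content-addressable memory (CAM) inside those accelerators. Below is a minimal software emulation of the CAM matching step, assuming a small table of 8-bit keys; in hardware every stored key is compared against the search word simultaneously within one clock cycle, whereas this model loops. The table depth and key values are hypothetical.

    #include <stdint.h>
    #include <stdio.h>

    #define CAM_ENTRIES 8   /* illustrative CAM depth, not the papers' sizes */

    /* Software model of a binary CAM search: return the index of the
     * matching entry (which selects a bitmap column), or -1 if no key
     * matches. Hardware performs all comparisons in parallel.          */
    static int cam_search(const uint8_t keys[CAM_ENTRIES], uint8_t word)
    {
        for (int i = 0; i < CAM_ENTRIES; i++)
            if (keys[i] == word)
                return i;
        return -1;
    }

    int main(void)
    {
        const uint8_t keys[CAM_ENTRIES] = { 0x00, 0x11, 0x22, 0x33,
                                            0x44, 0x55, 0x66, 0x77 };
        const uint8_t stream[4] = { 0x22, 0x77, 0x05, 0x11 };

        for (int i = 0; i < 4; i++)
            printf("word 0x%02x -> entry %d\n", stream[i],
                   cam_search(keys, stream[i]));
        return 0;
    }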
“…With W = 4 and Rule = 4, we need 2^W = 2^4 = 16 registers, and the number of bits in each register equals the number of rules, i.e., 4. Thus, to decode the Basic Rule table in Table 2 we need a RAM of size 2^4 × 4 = 16 × 4 (16 registers, each 4 bits wide). When applying principle 1 for writing to RAM, we obtain the results shown in Table 3.…”
Section: Basic Ideas
confidence: 99%
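The sizing argument in that excerpt generalizes to a one-line formula: a W-bit input chunk addresses 2^W registers, each as wide as the number of rules. The helper below is a hypothetical illustration of that arithmetic, not code from the cited work.

    #include <stdio.h>

    /* RAM needed to decode a rule table: 2^W registers (one per possible
     * W-bit input value), each as wide as the number of rules.           */
    static unsigned ram_bits(unsigned w, unsigned rules)
    {
        return (1u << w) * rules;
    }

    int main(void)
    {
        /* The excerpt's case: W = 4, Rule = 4 -> 16 registers x 4 bits = 64 bits. */
        printf("W=4, rules=4: %u registers, %u bits total\n",
               1u << 4, ram_bits(4, 4));
        return 0;
    }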
“…With priority encoding, the output of the TCAM is the highest-priority match. TCAM is widely used in networking routers, in translation look-aside buffers (TLBs) [3] and caches in microprocessors, in database accelerators for big-data analytics [4], and in pattern recognition [5].…”
Section: Introduction
confidence: 99%
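To make the ternary-match-plus-priority-encoding behaviour concrete, here is a small software model of a TCAM lookup; the value/mask entry format and the example table are assumptions for illustration, not taken from the cited designs.

    #include <stdint.h>
    #include <stdio.h>

    /* Ternary CAM entry: mask bits set to 1 are compared, 0 means don't-care. */
    struct tcam_entry {
        uint8_t value;
        uint8_t mask;
    };

    /* Return the index of the first (highest-priority) matching entry,
     * or -1 if none. A hardware TCAM evaluates all entries in parallel
     * and a priority encoder selects the lowest-index match; this loop
     * models that behaviour sequentially.                              */
    static int tcam_lookup(const struct tcam_entry *table, int n, uint8_t key)
    {
        for (int i = 0; i < n; i++)
            if ((key & table[i].mask) == (table[i].value & table[i].mask))
                return i;
        return -1;
    }

    int main(void)
    {
        /* More specific (longer-prefix) entries sit at lower indices so the
         * priority encoder prefers them, as in longest-prefix routing.     */
        const struct tcam_entry table[] = {
            { 0xC0, 0xF0 },   /* matches 1100xxxx */
            { 0x80, 0xC0 },   /* matches 10xxxxxx */
            { 0x00, 0x00 },   /* default: matches anything */
        };
        const uint8_t keys[] = { 0xC5, 0x9A, 0x12 };

        for (int i = 0; i < 3; i++)
            printf("key 0x%02x -> entry %d\n", keys[i],
                   tcam_lookup(table, 3, keys[i]));
        return 0;
    }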
“…Conventional CAM (on ASIC) has the drawbacks of high power consumption, limited storage density, poor scalability, and high implementation cost [3,13], compared with random-access memory (RAM), which has higher storage density and lower power consumption. One CAM cell requires 9-10 transistors, while one RAM cell needs only 6, which makes conventional CAM power-inefficient and limits its storage density.
Section: Introduction
confidence: 99%