Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, 2020
DOI: 10.1145/3373376.3378471

Why GPUs are Slow at Executing NFAs and How to Make them Faster

Abstract: Non-deterministic Finite Automata (NFA) are space-efficient finite state machines that have significant applications in domains such as pattern matching and data analytics. In this paper, we investigate why the Graphics Processing Unit (GPU), a massively parallel computational device with the highest memory bandwidth available on general-purpose processors, cannot efficiently execute NFAs. First, we identify excessive data movement in the GPU memory hierarchy and describe how to privatize reads effectively using …
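
For readers unfamiliar with what "executing an NFA" involves, the following is a minimal sequential sketch in Python, not the paper's GPU implementation. It tracks the set of active states per input symbol, which is the per-symbol work a GPU engine would have to parallelize. The function name `run_nfa`, the dictionary-based transition table, and the example automaton for the pattern "ab" are all hypothetical and chosen only for illustration.

```python
def run_nfa(transitions, start_states, accept_states, text):
    """Return the input positions at which any accepting state is active.

    transitions maps (state, symbol) -> set of next states (hypothetical layout).
    """
    active = set(start_states)
    matches = []
    for pos, symbol in enumerate(text):
        next_active = set()
        # Every active state looks up its successors for the current symbol.
        # These table lookups are the memory traffic the abstract refers to.
        for state in active:
            next_active.update(transitions.get((state, symbol), ()))
        # Pattern-matching style: start states stay armed at every position
        # so a match can begin anywhere in the input.
        next_active.update(start_states)
        active = next_active
        if active & accept_states:
            matches.append(pos)
    return matches


# Hypothetical three-state NFA that reports every occurrence of "ab".
TRANSITIONS = {(0, "a"): {1}, (1, "b"): {2}}
MATCHES = run_nfa(TRANSITIONS, {0}, {2}, "xabab")
```

Here `MATCHES` holds the positions where an occurrence of "ab" ends. Because several states can be active at once, the amount of work per symbol varies with the input, which is one reason such workloads are irregular.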

Cited by 18 publications (9 citation statements)
References 51 publications
“…Many works aim to exploit these parallelism degrees on top of a massively parallel architecture such as GPUs. We exclude closed-source approaches [17], [36], while selecting the approach proposed by Liu et al [31] which exposes several state-of-the-art GPU-based methodologies.…”
Section: GPU-based Engines (type: mentioning)
confidence: 99%
“…We select the available engines as described in §II and represented in A of Figure 1. Indeed, YARB's engines currently are: RE2 [30], Hyperscan [14], and the ones presented by Liu et al [31]. RE2 is a C++ general-purpose library that guarantees execution time linear in the input length and fixed stack footprint, able to target any CPU regardless the ISA.…”
Section: A Regular Expression Engines (type: mentioning)
confidence: 99%
“…Our notion of counter-ambiguity is formulated more generally, and our simulation based on bit vectors handles character class ambiguity. Finally, there are several works that implement regex matching algorithms on GPUs [14,29,60,70].…”
Section: Related Work (type: mentioning)
confidence: 99%
“…Specialized approaches instead focus on a selected application and exploit the characteristics of it. Works on Non-deterministic finite automaton (NFA) propose to dynamically employ the GPU shared memory to store frequently used sizable lookup tables [91]. Many specialized works have focused on GPU execution of irregular Sparse Matrix Vector Multiplication (SpMV) and Matrix Matrix Multiplication (GEMM) by proposing software approaches that reorder the matrices dataset [119], algorithms tailored for specific data characteristics of the matrices [127], and row reordering techniques [69] to improve data locality among processed rows.…”
Section: Memory Divergence (type: mentioning)
confidence: 99%