2019
DOI: 10.1007/s12559-019-09646-y

A Novel Algorithm for Online Inexact String Matching and its FPGA Implementation

Abstract: Accelerating inexact string matching procedures is of utmost importance in practical applications where huge amounts of data must be processed in real time, as is usual in bioinformatics or cybersecurity. Inexact matching procedures can yield multiple shadow hits, which must be filtered, according to some criterion, to obtain a concise and meaningful list of occurrences. The filtering procedures are often computationally demanding and are performed offline in a post-processing phase. This paper intr…

Cited by 23 publications (17 citation statements)
References 84 publications
“…One of the algorithms used in ASM is measuring the Levenshtein distance between strings. Two strings can be said to match if their dissimilarity (or Levenshtein distance) is below a predetermined threshold [10]. The Levenshtein distance between two strings is the minimum number of character-level edits required to change one word into the other.…”
Section: Approximate String Matching | mentioning
confidence: 99%
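
The threshold test described in the statement above can be sketched as follows. This is a minimal illustration using the textbook dynamic-programming recurrence for the Levenshtein distance, not the cited paper's FPGA algorithm; the function names `levenshtein` and `is_match` and the strict "below threshold" comparison are assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn `a` into `b`."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of `a` and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def is_match(a: str, b: str, threshold: int) -> bool:
    """Hypothetical helper: two strings match when their distance
    is strictly below the threshold (an assumed reading of 'below')."""
    return levenshtein(a, b) < threshold
```

For example, `levenshtein("kitten", "sitting")` is 3, so the two words match under a threshold of 4 but not under a threshold of 3.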
“…Regular expressions were chosen as a means to help identify candidate values for a field as the values for each key share a common structure but cannot be expressed in a finite dictionary. Key-value matching becomes more complicated where there are multiple string values in the OCR output file that conform to a specific regular expression pattern [10]. The solution to this was to use the field name position on the document and to match the first successful pattern that was located closest to the field name.…”
Section: Introduction | mentioning
confidence: 99%
“…In order to deal with such domains, five mainstream approaches can be pursued [10]: Feature generation and/or feature engineering, where numerical features are extracted ad-hoc from structured patterns (e.g., using their properties or via measurements) and can be further merged according to different strategies (e.g., in a multi-modal way [11]); Ad-hoc dissimilarities in the input space, where custom dissimilarity measures are designed in order to process structured patterns directly in the input domain without moving towards Euclidean (or metric) spaces. Common (possibly parametric) edit distances include the Levenshtein distance [12] for sequence domains and graph edit distances [13] for graphs domains; Embedding via information granulation and granular computing [3, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]; Dissimilarity representations [26, 27, 28], where structured patterns are embedded in the Euclidean space according to their pairwise dissimilarities; Kernel methods, where the mapping between the original input space and the Euclidean space exploits positive-definite kernel functions [29, 30, 31, 32, 33]. …”
Section: Introduction | mentioning
confidence: 99%
“…In these approaches the choice of the similarity measure is essential for their performance. Hence, in recent years a diverse collection of similarity measures has been proposed in the literature to further enrich the existing portfolio (Parziale et al., 2018; Cinti et al., 2017; Oregi et al., 2017a; Zhou and De la Torre, 2016; Zhao and Itti, 2018).…”
Section: Introduction | mentioning
confidence: 99%