2018
DOI: 10.14419/ijet.v7i2.27.12084

Combination of Levenshtein Distance and Rabin-Karp to Improve the Accuracy of Document Equivalence Level

Abstract: The Rabin-Karp algorithm, invented by Michael O. Rabin and Richard M. Karp, searches for a substring pattern in a text using a hash function. It is well suited to matching a text against many patterns, and one of its practical applications is plagiarism detection. The hash values computed for two documents are compared to determine their level of similarity. Rabin-Kar…
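
The hash-based substring search that the abstract describes can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `rabin_karp_search` and the `base` and `mod` parameters are assumptions chosen for illustration, and the polynomial rolling hash is one common choice among several.

```python
def rabin_karp_search(text: str, pattern: str,
                      base: int = 256, mod: int = 1_000_000_007) -> list[int]:
    """Return the start index of every occurrence of pattern in text."""
    n, k = len(text), len(pattern)
    if k == 0 or k > n:
        return []

    # Weight of the leading character of a k-character window.
    high = pow(base, k - 1, mod)

    # Hash the pattern and the first window of the text.
    p_hash = w_hash = 0
    for i in range(k):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod

    matches = []
    for start in range(n - k + 1):
        # Equal hashes may be a collision, so confirm with a direct comparison.
        if w_hash == p_hash and text[start:start + k] == pattern:
            matches.append(start)
        if start + k < n:
            # Roll the window: drop text[start], append text[start + k].
            w_hash = ((w_hash - ord(text[start]) * high) * base
                      + ord(text[start + k])) % mod
    return matches
```

Because each window hash is derived from the previous one in constant time, the text is scanned once, and the direct string comparison only runs when the hashes agree.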

Cited by 25 publications (4 citation statements)
References 18 publications
“…where $m$ is the number of similar characters, $s_1$ is the length of string-1, $s_2$ is the length of string-2, and $t$ is the number of transpositions. The Rabin-Karp algorithm utilizes the hash method for performing multiple searches [31]. The steps involved in Rabin-Karp include: i) removing punctuation marks from the document and converting the text to lowercase for the search; ii) dividing the texts into grams with a predefined k-gram value; iii) calculating the hash value using the rolling hash function for each gram, following the formula $h = c_1 b^{k-1} + c_2 b^{k-2} + \dots + c_{k-1} b + c_k$; iv) identifying matching hash values between two texts; and v) determining the similarity between two pieces of text using Dice's similarity coefficient equation.…”
Section: Jaro Winkler Distance Versus Rabin-Karp (mentioning)
confidence: 99%
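
Steps i) to v) in the statement above can be sketched in Python as follows. This is an illustration under stated assumptions, not the cited authors' implementation: the helper names, the choice of `base` and `mod`, the removal of whitespace during preprocessing, and the use of sets of distinct hash values in the Dice coefficient are all assumptions.

```python
import string


def preprocess(text: str) -> str:
    # Step i: lowercase and strip punctuation.
    # Dropping whitespace as well is an assumption of this sketch.
    table = str.maketrans("", "", string.punctuation)
    return "".join(text.lower().translate(table).split())


def kgram_hashes(text: str, k: int = 3,
                 base: int = 31, mod: int = 1_000_003) -> list[int]:
    # Steps ii and iii: hash every k-gram with a rolling polynomial hash,
    # h = c_1*b^(k-1) + c_2*b^(k-2) + ... + c_k (taken modulo mod).
    n = len(text)
    if n < k:
        return []
    high = pow(base, k - 1, mod)  # weight of the leading character
    h = 0
    for ch in text[:k]:
        h = (h * base + ord(ch)) % mod
    hashes = [h]
    for i in range(k, n):
        # Remove the outgoing character, shift, and add the incoming one.
        h = ((h - ord(text[i - k]) * high) * base + ord(text[i])) % mod
        hashes.append(h)
    return hashes


def dice_similarity(a: str, b: str, k: int = 3) -> float:
    # Step iv: find hash values shared by the two texts.
    # Step v: Dice's coefficient, 2*|A ∩ B| / (|A| + |B|), on distinct hashes.
    ha = set(kgram_hashes(preprocess(a), k))
    hb = set(kgram_hashes(preprocess(b), k))
    if not ha and not hb:
        return 0.0
    return 2 * len(ha & hb) / (len(ha) + len(hb))
```

Comparing hash values rather than the k-grams themselves means that rare hash collisions can slightly overstate the similarity; a larger `mod` reduces that risk.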
“…A hashing-based string-matching algorithm known as Rabin-Karp (RK) was developed in 1987 [34]. This algorithm uses the hashing approach to identify patterns within a text [35]. Lecroq introduced the Hash-q algorithm, which calculates a hash value between 0 and 255 for each q-gram in the pattern p [36,37].…”
Section: Literature Review (mentioning)
confidence: 99%
“…The Rabin-Karp algorithm is used for string matching and has advantages in simple string-matching tasks. This algorithm uses hashing to find a collection of string patterns in a text [24]. This research uses the Rabin-Karp algorithm to guarantee data consistency in the blockchain process.…”
Section: Rabin Karp Algorithm (mentioning)
confidence: 99%