2013
DOI: 10.5120/10043-4627

An Optimized Approach of Modified BAT Algorithm to Record Deduplication

Abstract: Record deduplication is the task of recognizing, in a data warehouse, records that refer to the same real-world entity despite misspelled words, typos, different writing styles, or even different schema versions or data types. Existing research proposed a genetic programming (GP) approach to record deduplication. That approach combines several different pieces of evidence extracted from the data content to generate a deduplication function that is able to recognize whether two o…

Cited by 22 publications (8 citation statements)
References 11 publications
“…A set of several files of varying sizes and formats is considered to be the data set for assessment. For assessing the performance of deduplication of records in databases, Precision, F-measure and Recall are the parameters used for comparison [7] [11]. In addition to these parameters, time taken to identify the duplicates can also be considered to make the comparison more vivid and effective.…”
Section: Results (mentioning)
confidence: 99%
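The three metrics named in the statement above can be illustrated with a minimal sketch. It assumes duplicates are represented as sets of record-ID pairs; the function name `precision_recall_f` and the sample pairs are hypothetical and do not come from the cited papers.

```python
def precision_recall_f(predicted, actual):
    """Precision, recall and F-measure for sets of duplicate record pairs.

    `predicted` and `actual` are sets of (id_a, id_b) tuples; this pair
    representation is an assumption made for illustration only.
    """
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    total = precision + recall
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical example: 3 predicted pairs, 4 true pairs, 2 in common.
predicted_pairs = {(1, 2), (3, 4), (5, 6)}
actual_pairs = {(1, 2), (3, 4), (7, 8), (9, 10)}
p, r, f = precision_recall_f(predicted_pairs, actual_pairs)
print(p, r, f)
```

Running time, which the statement suggests as a further criterion, could be measured around the deduplication call with `time.perf_counter()`.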
“…The alphabet Σ of probable characters ch gives the set of all probable sequences. ED can be computed using dynamic programming [11]. This algorithm is capable to finding the minimum value required for insertion, deletion and substitution of a character to a word in order to obtain the original word.…”
Section: B. Levenshtein's Algorithm (mentioning)
confidence: 99%
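The dynamic-programming computation of edit distance (ED) mentioned in the statement above can be sketched as follows. This is the textbook Levenshtein recurrence with unit costs for insertion, deletion and substitution, not necessarily the exact variant used in the citing paper; the function name `levenshtein` is chosen here for illustration.

```python
def levenshtein(source, target):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `source` into `target`."""
    # prev[j] holds the distance between the processed prefix of `source`
    # and the first j characters of `target`.
    prev = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        curr = [i]  # turning i characters into the empty string costs i deletions
        for j, t_char in enumerate(target, start=1):
            substitution_cost = 0 if s_char == t_char else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Keeping only two rows of the table reduces memory use to the length of the shorter dimension, which matters when many record pairs are compared.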
“…In some situations, it is necessary to switch to exploitation stage, we have to vary the loudness Ai and the rate ri of pulse emission during the iterations. Since the loudness usually decreases once a bat has found its prey, while the rate of pulse emission increases, the loudness can be chosen as any value of convenience, between Amin and Amax, assuming that a bat has just found the prey by stop emitting sound [5]. BAT algorithm uses echolocation and frequency tuning to solve problems [1].…”
Section: Proposed Methodology (mentioning)
confidence: 99%
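The behaviour described above, loudness falling while the pulse emission rate rises, is commonly expressed in the standard bat algorithm as A_i(t+1) = α·A_i(t) and r_i(t) = r_i(0)·(1 − exp(−γ·t)). The sketch below assumes that standard schedule; the values of `alpha` and `gamma` and the function name are illustrative assumptions, and the modified BAT algorithm of the paper may use a different rule.

```python
import math


def update_loudness_and_rate(loudness, initial_rate, t, alpha=0.9, gamma=0.9):
    """One step of the standard bat-algorithm schedule.

    alpha and gamma are assumed constants (0 < alpha < 1, gamma > 0),
    picked here only for illustration.
    """
    new_loudness = alpha * loudness                          # loudness decays geometrically
    new_rate = initial_rate * (1.0 - math.exp(-gamma * t))   # rate approaches initial_rate
    return new_loudness, new_rate


loudness = 1.0
for t in range(1, 4):
    loudness, rate = update_loudness_and_rate(loudness, initial_rate=0.5, t=t)
    print(t, loudness, rate)
```

In the standard formulation this update is applied only when a bat accepts an improved solution, which is what shifts the search from exploration toward exploitation.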
“…The metaheuristic algorithms can be summed up in optimization problem in the in the field of business, artificial intelligence, engineering technology and other applications have been derived from the social behaviour of biological systems (animals). Bat algorithm (BA) is a novel feature; biological algorithm developed by Xin-She Yang in 2010 [18], depending on the echolocation behavior of bats and uses a frequency-tuning technique with varying loudness [19][20][21][22][23][24]. Initial population of the bats is performed randomly generated from real-valued vectors.…”
Section: Implementation Of Bat Algorithm Optimization (mentioning)
confidence: 99%
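The random initialization and frequency tuning mentioned above can be sketched with the standard bat-algorithm update: f_i = f_min + (f_max − f_min)·β, v_i ← v_i + (x_i − x_best)·f_i, x_i ← x_i + v_i. The helper names, bounds and frequency range below are assumptions for illustration and are not taken from the citing paper.

```python
import random


def init_bats(n_bats, dim, lower, upper, rng):
    """Randomly generate real-valued position vectors within [lower, upper]."""
    return [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_bats)]


def move_bat(position, velocity, best, f_min, f_max, rng):
    """Frequency-tuned move of one bat relative to the current best position."""
    beta = rng.random()                      # uniform random number in [0, 1)
    freq = f_min + (f_max - f_min) * beta    # frequency tuning
    new_velocity = [v + (x - b) * freq
                    for x, v, b in zip(position, velocity, best)]
    new_position = [x + v for x, v in zip(position, new_velocity)]
    return new_position, new_velocity


rng = random.Random(42)  # fixed seed so the sketch is reproducible
bats = init_bats(n_bats=5, dim=3, lower=-1.0, upper=1.0, rng=rng)
velocities = [[0.0] * 3 for _ in bats]
best = bats[0]  # placeholder; a real run would pick the bat with the best fitness
bats[1], velocities[1] = move_bat(bats[1], velocities[1], best, 0.0, 2.0, rng)
```

A complete optimizer would also need a fitness function, a local random walk around the best solution, and bounds handling, all of which are omitted here.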