2020
DOI: 10.3390/app10103356

Data Reduction in the String Space for Efficient kNN Classification Through Space Partitioning

Abstract: Within the Pattern Recognition field, two representations are generally considered for encoding the data: statistical codifications, which describe elements as feature vectors, and structural representations, which encode elements as high-level symbolic data structures such as strings, trees or graphs. While the vast majority of classifiers are capable of addressing statistical spaces, only some particular methods are suitable for structural representations. The kNN classifier constitutes one of the scarce exa…

citations
Cited by 10 publications
(6 citation statements)
references
References 26 publications
“…Not only was the RHC algorithm found to be much faster than RSP3, but it was also one of the fastest approaches that took part in this experimental study [10]. A modified version of the RHC algorithm has recently been applied on string data spaces [11,12].…”
Section: Related Work | Citation type: mentioning
confidence: 96%
“…This RHC method was recently adapted to string-based representations of Valero-Mas and Castellanos (2020). In this adaptation, the prototype generation stage of the RHC algorithm required the computation of the median value of a set of strings.…”
Section: Background In Data Reduction | Citation type: mentioning
confidence: 99%
“…In this line, a recent work by Valero-Mas and Castellanos (2020) proposed the implementation of the well-known PG method reduction through homogeneous clusters (RHC) (Ougiaroglou and Evangelidis 2016) for kNN-based classification, in which data is represented as strings. This approach is based on recursively dividing the initial corpus into homogeneous clusters in order to then replace each of them with a representative prototype generated as the median element of the cluster.…”
Section: Introduction | Citation type: mentioning
confidence: 99%
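
The recursive procedure described in the statement above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions made here for illustration: the string dissimilarity is the Levenshtein edit distance, the "median" is the set median (the cluster member with the smallest summed distance to the others), and a non-homogeneous cluster is split with a single nearest-seed assignment pass rather than a full iterative clustering. The function names `edit_distance`, `set_median` and `rhc_strings` are hypothetical.

```python
from collections import defaultdict, deque


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def set_median(strings):
    """Assumption: use the set median, i.e. the member of `strings`
    whose summed edit distance to all members is smallest."""
    return min(strings, key=lambda s: sum(edit_distance(s, t) for t in strings))


def rhc_strings(samples):
    """Reduce a list of (string, label) pairs to one prototype per
    homogeneous cluster, splitting mixed clusters recursively."""
    prototypes = []
    queue = deque([list(samples)])
    while queue:
        cluster = queue.popleft()
        by_label = defaultdict(list)
        for s, y in cluster:
            by_label[y].append(s)

        if len(by_label) == 1:
            # Homogeneous cluster: replace it with its median string.
            label = next(iter(by_label))
            prototypes.append((set_median(by_label[label]), label))
            continue

        # Mixed cluster: one seed per class, then assign each sample
        # to its nearest seed (a single pass, a simplification).
        seeds = [set_median(strs) for strs in by_label.values()]
        parts = defaultdict(list)
        for s, y in cluster:
            nearest = min(range(len(seeds)),
                          key=lambda k: edit_distance(s, seeds[k]))
            parts[nearest].append((s, y))

        if len(parts) == 1:
            # No progress (e.g. identical strings with different labels):
            # fall back to splitting by label so the loop terminates.
            parts = {y: [(s, y) for s in strs] for y, strs in by_label.items()}
        queue.extend(parts.values())
    return prototypes
```

Called on a small labelled corpus such as `[("aaaa", "A"), ("aaab", "A"), ("zzzz", "B")]`, the function returns one `(string, label)` prototype per homogeneous cluster, which a kNN classifier could then search instead of the full training set.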
“…While this process may be seen as a trivial task, performing such a comparison is a remarkably time-consuming process, especially if the data structure considered is somewhat complex. Thus, given that distance-based classifiers require the computation of the dissimilarity between the input query and every single element of the training data, the efficiency is considerably low [9].…”
Section: Introduction | Citation type: mentioning
confidence: 99%