2016
DOI: 10.1109/tkde.2016.2562627

K Nearest Neighbour Joins for Big Data on MapReduce: A Theoretical and Experimental Analysis

Abstract: Given a point p and a set of points S, the kNN operation finds the k closest points to p in S. It is a computationally intensive task with a wide range of applications, such as knowledge discovery and data mining. However, as the volume and the dimension of data increase, only distributed approaches can perform such a costly operation in a reasonable time. Recent works have focused on implementing efficient solutions using the MapReduce programming model because it is suitable for distributed …
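The kNN operation defined in the abstract can be illustrated with a minimal single-machine sketch. It is a brute-force scan, not one of the MapReduce methods the paper analyses; the function name `knn`, the use of Euclidean distance, and the sample points are assumptions made for illustration only.

```python
import heapq
import math


def knn(p, S, k):
    """Return the k points of S closest to p, nearest first.

    Assumption: closeness is measured with Euclidean distance.
    Every point of S is scanned once, so the cost grows linearly with |S|.
    """
    return heapq.nsmallest(k, S, key=lambda s: math.dist(p, s))


# Hypothetical sample data: four points in the plane.
S = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (5.0, 5.0)]
print(knn((0.9, 0.9), S, k=2))  # → [(1.0, 1.0), (0.0, 0.0)]
```

Repeating this scan for every query point is what becomes costly as the volume and dimension of the data grow, which is the motivation the abstract gives for distributed approaches.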

Citations: Cited by 41 publications (13 citation statements)
References: References 29 publications
“…In the first step of our method, we build a KNN-based method to extract overlapping regions included in the original dataset. As a geometric classifier, the KNN classifier determines the class membership of a data point from its distance to reference data points [39], and it is widely used in clustering [40], [41], [42]. Thus, KNN is a density-based method to classify the input data by the percentage of samples belonging to a different class in their nearest neighbors.…”
Section: A. The Extraction of Overlapping Area
type: mentioning
confidence: 99%
“…Recently, much attention has been paid to the cooperative broadcast, where the message recovery is offloaded to the nodes in a cooperative manner [22,23]. For instance, a cooperative beacon broadcast scheme is proposed in [24] to provide the vehicles with more traffic information when driving on roads.…”
Section: Related Work
type: mentioning
confidence: 99%
“…Authors of [14] compare and analyse five different existing methods to deduce the strengths and weaknesses of the KNN classification scheme for big data. As evaluation, [14] presents the advantages and disadvantages of the different stages of the compared classification models which are all applied on MapReduce work-flow.…”
Section: Some Recent Literature on Classification
type: mentioning
confidence: 99%
“…As evaluation, [14] presents the advantages and disadvantages of the different stages of the compared classification models which are all applied on MapReduce work-flow. It is claimed in [14] that the results achieved in the study can be used to tackle different practical KNN problems in the context of big data.…”
Section: Some Recent Literature on Classification
type: mentioning
confidence: 99%