2021
DOI: 10.1016/j.knosys.2021.107256

SDCOR: Scalable density-based clustering for local outlier detection in massive-scale datasets

Citations: cited by 7 publications (3 citation statements)
References: 120 publications
“…The abnormal data refer to outliers with unreasonable values in the dataset, which has the characteristic that the proportion in the whole dataset is usually small and deviates from the whole. Commonly used abnormal data detection algorithms include clustering-based outlier detection [36][37][38], density-based outlier detection [39,40], and so on. In this paper, we choose the isolated forest algorithm [41], which divides the dataset by constructing a binary tree, expresses the degree of alienation from the data subject according to the depth of the data samples in the binary tree, and finally divides the anomalous data by the anomaly score.…”
Section: Abnormal Data Handling (mentioning)
confidence: 99%
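The statement above describes the isolation forest idea: random binary trees partition the data, and samples that end up at shallow depth are treated as anomalous. The following is a minimal standard-library sketch of that idea, not the citing paper's implementation. The function names (`average_path_length`, `build_tree`, `path_length`, `anomaly_scores`), the parameter defaults, and the toy data are all assumptions made for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Expected depth of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # nothing left to separate along this feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(node, x, depth=0):
    """Depth at which x lands, plus a correction for unsplit leaf contents."""
    if "split" not in node:
        return depth + average_path_length(node["size"])
    child = node["left"] if x[node["dim"]] < node["split"] else node["right"]
    return path_length(child, x, depth + 1)


def anomaly_scores(data, n_trees=100, sample_size=64, seed=0):
    """Return one score in (0, 1] per point; higher means more isolated.

    Assumes data is a list of equal-length numeric tuples with len(data) >= 2.
    """
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = average_path_length(sample_size)
    return [
        2.0 ** (-sum(path_length(t, x) for t in trees) / (n_trees * norm))
        for x in data
    ]


# Toy data (hypothetical): a Gaussian blob plus one far-away point.
gen = random.Random(42)
data = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(60)]
data.append((10.0, 10.0))
scores = anomaly_scores(data)
most_anomalous = max(range(len(data)), key=scores.__getitem__)
```

Because the far-away point is usually separated within the first one or two random cuts, its average path length is short and its score is the highest in the set. In practice a threshold on the score, or a fixed contamination fraction, would decide which samples are discarded.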
“…On the other hand, the distance-based techniques include the k-Nearest Neighbor [150] and the Clustering k-Means [151]. These methods assume tightly grouping, as clusters, for normal data, but different data are located far respect to their nearest neighbors.…”
Section: Techniques That Could Be Possible Potential Solutions To The... (mentioning)
confidence: 99%
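The distance-based assumption in the statement above (normal data sit close to their neighbors, unusual data lie far from theirs) can be illustrated with a k-nearest-neighbor distance score. This is a hedged sketch using only the Python standard library, not the method of the cited references [150] or [151]. The function name `knn_outlier_scores`, the choice of k, and the sample points are assumptions for illustration.

```python
import math


def knn_outlier_scores(points, k=3):
    """Score each point by the distance to its k-th nearest neighbor.

    Assumes points is a list of equal-length numeric tuples and
    1 <= k <= len(points) - 1. Larger scores mean more isolated points.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, nearest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Hypothetical sample: four tightly grouped points and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (8.0, 8.0)]
knn_scores = knn_outlier_scores(points, k=2)
farthest = max(range(len(points)), key=knn_scores.__getitem__)
```

This brute-force version compares every pair of points, so its cost grows quadratically with the dataset size; for large datasets a spatial index or an approximate neighbor search would normally replace the inner loop.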
“…Clustering is a fundamental technique in data mining and machine learning, aiming to group objects into distinct clusters [ 1 7 ]. Objects within a cluster show high similarity to each other and low similarity to objects in other clusters, determined by a similarity measure [ 8 11 ].…”
Section: Introduction (mentioning)
confidence: 99%