Single-linkage is an agglomerative clustering algorithm that can be used to detect outliers. It repeatedly merges the two clusters that contain the closest pair of observations, so the clusters grow until all observations belong to a single cluster. This study uses a single-linkage method with a circular distance based on the city-block distance as the similarity measure. The method's performance in detecting multiple outliers in a circular regression model is tested via simulation studies with three outlier scenarios: outliers in u-space only, in v-space only, and in both u- and v-space. Performance is measured by the "success" probability (pout), the masking error (pmask) and the swamping error (pswamp) in each outlier scenario. The single-linkage method is found to perform well in detecting outliers across these scenarios and to be applicable to the circular regression model.
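The merging procedure described above can be illustrated with a minimal sketch. The abstract does not give the distance formula, so the sketch assumes that the circular distance between two angles is π − |π − |a − b|| and that the city-block version sums this distance over the u and v components; the function names `circ_dist`, `cityblock_circ` and `single_linkage` are hypothetical.

```python
import math

def circ_dist(a, b):
    """Assumed circular distance between two angles (radians), in [0, pi]."""
    diff = abs(a - b) % (2 * math.pi)
    return math.pi - abs(math.pi - diff)

def cityblock_circ(p, q):
    """Assumed city-block circular distance between two (u, v) observations."""
    return circ_dist(p[0], q[0]) + circ_dist(p[1], q[1])

def single_linkage(points):
    """Naive single-linkage clustering.

    Returns the merges in order as (cluster_a, cluster_b, height), where
    height is the smallest pairwise distance between the two clusters.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance of the closest pair across clusters.
                d = min(cityblock_circ(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges
```

An observation that joins the tree only at a height much larger than the other merges is a natural outlier candidate; how the cut is chosen is a separate decision that this sketch does not cover.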
This paper is a comparative study of several clustering-based algorithms for detecting multiple outliers in the circular-circular regression model. Three similarity measures based on the circular distance were used to obtain a cluster tree with agglomerative hierarchical methods. A stopping rule based on the mean direction and circular standard deviation of the tree heights serves as the cutoff point: cluster groups that exceed it are classified as potential outliers. The performance of the algorithms is demonstrated through simulation studies that consider several outlier scenarios with a certain degree of contamination. Applications to real wind data and to a simulated data set are given for illustration. Satari’s algorithm (S-SL algorithm) is found to perform well for any values of sample size n and error concentration parameter. The algorithms can identify not only one or a few outliers but also multiple outliers present at the same time.
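A stopping rule of this kind might be sketched as follows. The abstract does not state the exact formula, so the sketch assumes a cutoff of the form mean direction plus c times the circular standard deviation, with the tree heights treated as angles in radians; the multiplier `c` and the function names are hypothetical. The mean direction and circular standard deviation use the standard definitions based on the mean resultant length R.

```python
import math

def circular_mean(angles):
    """Mean direction of a set of angles (radians), mapped to [0, 2*pi)."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c) % (2 * math.pi)

def circular_std(angles):
    """Circular standard deviation sqrt(-2 ln R), R = mean resultant length."""
    n = len(angles)
    s = sum(math.sin(a) for a in angles) / n
    c = sum(math.cos(a) for a in angles) / n
    r = min(math.hypot(s, c), 1.0)  # clamp against floating-point overshoot
    if r == 0.0:
        return math.inf
    return math.sqrt(-2.0 * math.log(r))

def flag_heights(heights, c):
    """Flag tree heights above the assumed cutoff: mean + c * circular std.

    `c` is a hypothetical tuning constant, not a value taken from the paper.
    """
    cutoff = circular_mean(heights) + c * circular_std(heights)
    return [h > cutoff for h in heights]
```

Because the mean direction wraps around at 2π, this sketch is only meaningful when most heights are small relative to a full circle.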
Outliers in the circular-circular regression model can lead to errors in, for example, inference and parameter estimation. This study therefore aims to develop new algorithms that detect outliers using the minimum spanning tree method. The proposed method is examined via simulation studies with different sample sizes and contamination levels. Its performance is measured using the “success” probability, the masking effect and the swamping effect. The results show that the proposed method performed well and was able to detect all the outliers planted under the various conditions.
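How a minimum spanning tree can expose outliers is sketched below using Prim's algorithm. The abstract does not specify the distance or the decision rule, so the sketch assumes a city-block circular distance over (u, v) pairs and simply returns the tree edges; treating unusually long edges as outlier candidates is an illustrative assumption, and all function names are hypothetical.

```python
import math

def circ_dist(a, b):
    """Assumed circular distance between two angles (radians), in [0, pi]."""
    diff = abs(a - b) % (2 * math.pi)
    return math.pi - abs(math.pi - diff)

def pair_dist(p, q):
    """Assumed city-block circular distance between two (u, v) observations."""
    return circ_dist(p[0], q[0]) + circ_dist(p[1], q[1])

def mst_edges(points):
    """Prim's algorithm on the complete graph; returns (parent, node, weight)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection of each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree with the cheapest connection.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = pair_dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges
```

Removing the longest edges splits the tree into groups, and small groups separated by long edges are candidates for outliers. The edge weights of the minimum spanning tree coincide with the merge heights of single-linkage clustering on the same distances.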