A method for outlier detection based on cluster analysis and visual expert criteria

Lara, Juan A.; Lizcano, David; Rampérez, Víctor; Soriano, Javier

doi:10.1111/exsy.12473

Cited by 16 publications

(6 citation statements)

References 46 publications

(60 reference statements)


“…K-means algorithm has the advantages of relatively simple principle, easy implementation, high convergence speed and strong interpretability [9]. The basic idea of this paper is to divide the sample points into several clusters according to the distance between the sample points for a given sample data set, so that the distance between the sample points in the cluster is as close as possible, and the distance between the sample points in the cluster is as far as possible.…”

Section: K-means Clustering Algorithm (mentioning)

confidence: 99%

Mining of association rules between students’ behavior and academic achievements

Ding, Chen, et al. 2023

International Conference on Cyber Security, Artificial Intelligence, and Digital Economy (CSAIDE 2023)


With the continuous development of educational informatization, students have produced a large amount of data in the process of learning and life. An educational management system with only simple search and query functions cannot find the potential value behind large-scale data and cannot meet the practical needs of teaching management. In view of the above problems, in order to mine useful information from student behavior data, this paper collected and arranged student behavior data and achievement data, constructed an index system to describe student behavior, clustered student behavior data with the K-means clustering algorithm, and generated student behavior characteristic description data. The Apriori algorithm was used for association mining of feature data sets. Based on the mining results, the association rules between students' behavior and grades were analyzed, which provides a reference for students' behavior guidance and teaching management.
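The K-means step that the citation statement and abstract above describe can be sketched roughly as follows. This is a minimal illustration in plain Python, not the cited authors' implementation: the function names (`kmeans`, `squared_distance`, `mean_point`), the random initialisation, and the `iterations` and `seed` parameters are all assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, iterations=100, seed=0):
    """Hypothetical helper: basic Lloyd-style K-means.

    Returns (centroids, labels), where labels[i] is the index of the
    centroid closest to points[i].
    """
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct input points chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_point(members)
    return centroids, labels
```

Called on two well-separated groups, for example `kmeans([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], k=2)`, the sketch should place the first three points in one cluster and the last three in the other. Which cluster gets index 0 depends on the random initialisation.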



“…The Clustering-based method used in this study creates clusters of the outlier scores using hierarchical clustering, classifying objects within clusters as "normal" and objects outside as "outliers" [42].…”

Section: Thresholding (mentioning)

confidence: 99%

Generic Diagnostic Framework for Anomaly Detection—Application in Satellite and Spacecraft Systems

Bieber, Verhagen, Cosson, et al. 2023

Aerospace


Spacecraft systems collect health-related data continuously, which can give an indication of the systems’ health status. While they rarely occur, the repercussions of such system anomalies, faults, or failures can be severe, safety-critical and costly. Therefore, the data are used to anticipate any kind of anomalous behaviour. Typically this is performed by the use of simple thresholds or statistical techniques. Over the past few years, however, data-driven anomaly detection methods have been further developed and improved. They can help to automate the process of anomaly detection. However, it usually is time intensive and requires expertise to identify and implement suitable anomaly detection methods for specific systems, which is often not feasible for application at scale, for instance, when considering a satellite consisting of numerous systems and many more subsystems. To address this limitation, a generic diagnostic framework is proposed that identifies optimal anomaly detection techniques and data pre-processing and thresholding methods. The framework is applied to two publicly available spacecraft datasets and a real-life satellite dataset provided by the European Space Agency. The results show that the framework is robust and adaptive to different system data, providing a quick way to assess anomaly detection for the underlying system. It was found that including thresholding techniques significantly influences the quality of resulting anomaly detection models. With this, the framework can provide both a way forward in developing data-driven anomaly detection methods for spacecraft systems and guidance relative to the direction of anomaly detection method selection and implementation for specific use cases.


“…Approaches. In the clustering-based outlier detection approaches [36,37], it is necessary to define and calculate the distance or similarity metric between two data instances; then, based on the metric, the data instances that are far away from their closest cluster centroid or where their density is below a threshold are declared as outliers. The k-means is one of the most well-known clustering-based algorithms; it has been widely used in outlier detection since its simplicity and efficiency.…”

Section: Clustering-based Outlier Detection (mentioning)

confidence: 99%

An Efficient Outlier Detection Approach for Streaming Sensor Data Based on Neighbor Difference and Clustering

Cai, Chen, Yin, et al. 2022

Security and Communication Networks


In wireless sensor networks (WSNs), the widely distributed sensors make the real-time processing of data face severe challenges, which prompts the use of edge computing. However, some problems that occur during the operation of sensors will cause unreliability of the collected data, which can result in inaccurate results of edge computing-based processing; thus, it is necessary to detect potential abnormal data (also known as outliers) in the sensor data to ensure their quality. Although the clustering-based outlier detection approaches can detect outliers from the static data, the feature of streaming sensor data requires the detection operation in a one-pass fashion; in addition, the clustering-based approaches also do not consider the time correlation among the streaming sensor data, which leads to its low detection accuracy. To solve these problems, we propose an efficient outlier detection approach based on neighbor difference and clustering, namely, ODNDC, which not only quickly and accurately detects outliers but also identifies the source of outliers in the streaming sensor data. Experiments on a synthetic dataset and a real dataset show that the proposed ODNDC approach achieves great performance in detecting outliers and identifying their sources, as well as the low time consumption.
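The distance-to-centroid rule mentioned in the citation statement for this entry (instances far from their closest cluster centroid are declared outliers) can be illustrated with a short sketch. It is not the ODNDC approach from the abstract above. It assumes the centroids are already available from some clustering step, and the function names and the `max_distance` cutoff are hypothetical choices for the example.

```python
import math


def nearest_centroid_distance(point, centroids):
    """Euclidean distance from a point to its closest centroid."""
    return min(math.dist(point, c) for c in centroids)


def flag_outliers(points, centroids, max_distance):
    """Hypothetical helper: True where a point lies farther than
    max_distance (an assumed, user-chosen cutoff) from every centroid."""
    return [
        nearest_centroid_distance(p, centroids) > max_distance
        for p in points
    ]


# Two assumed centroids and three sample points; only (5, 5) is far from both.
flags = flag_outliers(
    points=[(0.5, 0.0), (10.0, 9.0), (5.0, 5.0)],
    centroids=[(0.0, 0.0), (10.0, 10.0)],
    max_distance=2.0,
)
print(flags)  # [False, False, True]
```

A fixed cutoff is the simplest choice. The density-based variant mentioned in the same statement would need a neighbourhood count instead of a single distance and is not shown here.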

