In this paper, we propose a new approach to identifying anomalous behaviour based on heterogeneous data and a data fusion technique. Four types of data are used in this study: credit card, loyalty card, GPS, and image data. The first step of the proposed framework is to identify the best features for each dataset. Then, a recently introduced anomaly detection technique known as empirical data analytics (EDA) is applied to detect abnormal behaviour in the datasets. Standardised eccentricity (a measure newly introduced within EDA that offers a simplified form of the well-known Chebyshev inequality) can be applied to any data distribution. Image data are processed using a pre-trained deep learning network, and classification is done using a support vector machine. Most of the other data used in our previous work are of the "signal"/real number type (e.g. credit card, loyalty card and GPS data). However, these data alone often cannot support a clear conclusion that misuse has occurred. When the gender or age differs from what is expected, the misuse is obvious. The final stage of the proposed method combines the anomaly detection result with the image recognition result using a data fusion technique. The experimental results suggest that the proposed technique may simplify the tedious work involved in real, complex cases of forensic investigation. The proposed technique uses heterogeneous data, combining all the data from the VAST Challenge with image data through the introduced data fusion technique. This can assist the human expert in processing huge amounts of heterogeneous data to detect anomalies. In future research, text data can also be included in the heterogeneous data mixture, and the data fusion technique may be applied to other datasets.
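The link between standardised eccentricity and the Chebyshev inequality mentioned above can be written out as a short worked form. This is a sketch under stated assumptions rather than the paper's own notation: it assumes Euclidean distance, and the symbols $N$ (number of samples), $\mu$ (mean), $\sigma^{2}$ (mean squared distance to the mean), $\varepsilon$ (standardised eccentricity) and $n$ (number of standard deviations) are chosen here for illustration.

```latex
\[
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i ,
\qquad
\sigma^{2} = \frac{1}{N}\sum_{i=1}^{N} \lVert x_i - \mu \rVert^{2} ,
\qquad
\varepsilon(x) = 1 + \frac{\lVert x - \mu \rVert^{2}}{\sigma^{2}}
\]
\[
P\bigl(\lVert x - \mu \rVert^{2} \ge n^{2}\sigma^{2}\bigr) \le \frac{1}{n^{2}}
\quad\Longleftrightarrow\quad
P\bigl(\varepsilon(x) \ge n^{2} + 1\bigr) \le \frac{1}{n^{2}}
\]
```

Under these assumptions, a sample with $\varepsilon(x) > n^{2} + 1$ (for example, above 10 when $n = 3$) lies in a region that can hold at most a fraction $1/n^{2}$ of the data, whatever the distribution.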
Analyzing and predicting high frequency trading (HFT) financial data streams is very challenging because of the fast arrival times and the large number of data samples. To address this problem, an online evolving fuzzy rule-based prediction model is proposed in this paper. Because this prediction model is based on evolving fuzzy rule-based systems and a novel, simpler form of data density, it can autonomously learn from the live data stream, automatically build and remove its rules, and recursively update its parameters. The model responds quickly to unpredictable sudden changes in financial data and readjusts itself to follow the new data pattern. Experimental results on a real financial data stream show the excellent prediction performance of the proposed approach, regardless of quick shifts in data patterns and frequent appearances of abnormal data samples.
In this paper, we introduce an approach to classifying gender and age from images of human faces, which is an essential part of our method for autonomous detection of anomalous human behaviour. Human behaviour is often uncertain, and it is sometimes affected by emotion or environment. Automatic detection can help to recognise human behaviour, which can later assist in investigating suspicious events. Central to our proposed approach is the recently introduced transfer learning, which builds on deep learning and has been successfully applied to image classification. This paper continues our previous research on heterogeneous data, in which we use images as supporting evidence. We present a method for image classification based on a pre-trained deep model for feature extraction and representation, followed by a Support Vector Machine classifier. Because very few face image datasets with gender and age labels exist, we built a dataset named GAFace and applied our proposed method to it, achieving excellent results and robustness (gender classification: 90.33% accuracy; age classification: 80.17% accuracy) that approach human performance.
Abstract. In this paper, we propose a method to detect anomalous behaviour using heterogeneous data. The method detects anomalies based on the recently introduced approach known as Recursive Density Estimation (RDE) and the so-called eccentricity. It does not require prior assumptions about the type of the data distribution. A simplified form of the well-known Chebyshev condition (inequality) is used for the standardised eccentricity, and it applies to any type of distribution. The method is applied to three datasets: credit card, loyalty card and GPS data. Experimental results show that the proposed method may simplify complex real cases of forensic investigation, which require processing huge amounts of heterogeneous data to find anomalies. The proposed method can simplify the tedious job of processing the data and assist the human expert in making important decisions. In our future research, the method will be applied to more types of data, such as natural language (e.g. email, Twitter, SMS) and images.
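The idea of a recursively updated, distribution-free anomaly test can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not claim to reproduce RDE exactly: it assumes Euclidean distance, keeps a running mean and a running mean squared distance to that mean, and flags a sample when its standardised eccentricity (taken here as 1 plus the squared distance to the mean divided by the variance) exceeds n² + 1. The names `RecursiveEccentricity`, `detect` and `n_sigma` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class RecursiveEccentricity:
    """Running mean/variance tracker with a Chebyshev-style anomaly test.

    Illustrative sketch only; assumes Euclidean distance.
    """
    n_sigma: float = 3.0                      # hypothetical threshold parameter
    k: int = 0                                # number of samples seen so far
    mean: List[float] = field(default_factory=list)
    var: float = 0.0                          # mean squared distance to the mean

    def update(self, x: Sequence[float]) -> float:
        """Absorb one sample and return its standardised eccentricity."""
        self.k += 1
        k = self.k
        if k == 1:
            # The first sample defines the mean; nothing can be judged yet.
            self.mean = list(x)
            self.var = 0.0
            return 1.0
        # Recursive mean update: no past samples are stored.
        self.mean = [m + (xi - m) / k for m, xi in zip(self.mean, x)]
        dist2 = sum((xi - m) ** 2 for xi, m in zip(x, self.mean))
        # Recursive update of the (population) variance.
        self.var = (k - 1) / k * self.var + dist2 / (k - 1)
        if self.var == 0.0:
            # All samples so far are identical.
            return 1.0
        return 1.0 + dist2 / self.var

    def is_anomaly(self, x: Sequence[float]) -> bool:
        """Update with x and test it against the n^2 + 1 threshold."""
        eccentricity = self.update(x)
        return eccentricity > self.n_sigma ** 2 + 1.0


def detect(stream: Sequence[Sequence[float]], n_sigma: float = 3.0) -> List[bool]:
    """Return one anomaly flag per sample, processing the stream in order."""
    detector = RecursiveEccentricity(n_sigma=n_sigma)
    return [detector.is_anomaly(x) for x in stream]


if __name__ == "__main__":
    # Fifty made-up 2-D points on a small grid, followed by one distant point.
    points = [(float(i % 5), float((i * 3) % 5)) for i in range(50)]
    points.append((50.0, 50.0))
    print(detect(points)[-1])
```

Because the standardised eccentricity of the newest sample can never exceed the number of samples seen, this sketch cannot flag anything until more than n² + 1 samples have arrived; with `n_sigma = 3.0` that means at least eleven samples.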