While developing continuous authentication systems (CAS), we generally assume that samples from both genuine and impostor classes are readily available. However, this assumption may not hold in certain circumstances. Therefore, we explore the possibility of implementing CAS using only genuine samples. Specifically, we investigate the usefulness of four one-class classifiers (OCC), namely elliptic envelope, isolation forest, local outlier factor, and one-class support vector machines, and their fusion. The performance of these classifiers was evaluated on four distinct behavioral biometric datasets and compared with eight multi-class classifiers (MCC). The results demonstrate that, given sufficient training data from the genuine user, the OCC and their fusion can closely match the performance of the majority of MCC. Our findings encourage the research community to use OCC to build CAS, as they do not require knowledge of the impostor class during the enrollment process.
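To make the enrollment-without-impostors idea concrete, the following is a minimal sketch, not the authors' implementation. It fits the four one-class classifiers named above on genuine samples only, using their scikit-learn counterparts. The abstract does not say how the classifiers are fused, so the strict majority vote here is an assumption, as are the hyperparameters, the helper names, and the synthetic Gaussian "features".

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM


def fit_one_class_ensemble(genuine_features, random_state=0):
    """Fit four one-class models using genuine-user samples only (hypothetical helper)."""
    models = [
        EllipticEnvelope(random_state=random_state),
        IsolationForest(random_state=random_state),
        LocalOutlierFactor(novelty=True),  # novelty=True enables predict() on unseen data
        OneClassSVM(gamma="scale", nu=0.1),
    ]
    for model in models:
        model.fit(genuine_features)
    return models


def fused_decision(models, samples):
    """Assumed fusion rule: accept only if a strict majority of models votes +1 (inlier)."""
    votes = np.stack([model.predict(samples) for model in models])  # entries are +1 or -1
    return votes.sum(axis=0) > 0  # with four models, a 2-2 tie is rejected


# Synthetic stand-in data (an assumption): genuine behavior near 0, impostor near 6.
rng = np.random.default_rng(0)
genuine_train = rng.normal(0.0, 1.0, size=(300, 4))
genuine_probe = rng.normal(0.0, 1.0, size=(50, 4))
impostor_probe = rng.normal(6.0, 1.0, size=(50, 4))

models = fit_one_class_ensemble(genuine_train)
genuine_accept_rate = fused_decision(models, genuine_probe).mean()
impostor_accept_rate = fused_decision(models, impostor_probe).mean()
```

No impostor sample is seen during fitting; impostor data appears only at evaluation time, which mirrors the enrollment setting the abstract describes.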
In this paper, we propose a novel continuous authentication system for smartphone users. The proposed system relies entirely on unlabeled phone movement patterns collected through the smartphone accelerometer. The data was collected in a completely unconstrained environment over five to twelve days. The contexts of phone usage were identified using k-means clustering. Multiple profiles, one for each context, were created for every user. Five machine learning algorithms were employed to classify genuine users and impostors. The performance of the system was evaluated over a diverse population of 57 users. The mean equal error rates achieved by Logistic Regression, Neural Network, kNN, SVM, and Random Forest were 13.7%, 13.5%, 12.1%, 10.7%, and 5.6%, respectively. A series of statistical tests was conducted to compare the performance of the classifiers. The suitability of the proposed system for different types of users was also investigated using the failure to enroll policy.
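The equal error rate (EER) reported above is the operating point at which the false acceptance rate (FAR) equals the false rejection rate (FRR). The sketch below shows one common way to approximate it from classifier scores. It assumes that higher scores mean "more likely genuine"; the function name and the example scores are hypothetical and are not taken from the paper.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    A sample is accepted when its score is >= threshold. The EER estimate is
    the mean of FAR and FRR at the threshold where the two are closest.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap, eer = float("inf"), 1.0
    for threshold in thresholds:
        # FRR: fraction of genuine samples rejected at this threshold
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        # FAR: fraction of impostor samples accepted at this threshold
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical scores for one user: one genuine and one impostor sample overlap.
user_eer = equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.1, 0.2, 0.4, 0.75])
```

A mean EER such as those quoted in the abstract would then be the average of the per-user values over all users.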
A novel similarity-based feature selection algorithm is developed, using the concept of distance correlation. A feature subset is selected in terms of this similarity measure between pairs of features, without assuming any underlying distribution of the data. The pair-wise similarity is then employed, in a message passing framework, to select a set of exemplar features involving minimum redundancy and reduced parameter tuning. The algorithm does not need an exhaustive traversal of the search space. The methodology is next extended to handle large data, using an inherent property of distance correlation. The effectiveness of the algorithm is demonstrated on nine sets of publicly available data.
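For readers unfamiliar with the similarity measure, here is a minimal sketch of the standard sample distance correlation between two univariate features. It covers only the pairwise similarity, not the message-passing exemplar selection or the large-data extension, and the function names are illustrative rather than the authors'.

```python
import math


def _centered_distances(values):
    """Pairwise absolute distances, double-centered by row, column, and grand means."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in dist]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [dist[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation in [0, 1] for two equal-length feature vectors."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2_xy = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar2_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    dvar2_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    denom = math.sqrt(dvar2_x * dvar2_y)
    if denom == 0:
        return 0.0  # a constant feature has zero distance variance
    return math.sqrt(max(dcov2_xy, 0.0) / denom)


feature = [-2.0, -1.0, 0.0, 1.0, 2.0]
linear_copy = [2 * v + 1 for v in feature]   # linearly dependent on `feature`
squared = [v * v for v in feature]           # nonlinearly dependent, Pearson r = 0

dcor_linear = distance_correlation(feature, linear_copy)
dcor_squared = distance_correlation(feature, squared)
```

Unlike Pearson correlation, the measure stays clearly above zero for the squared feature, which is why it can flag redundant feature pairs without assuming a linear relationship or a particular data distribution.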
Network anomalies can arise from various causes, such as abnormal user behavior, malfunctioning network devices, malicious activities performed by attackers, malicious software, or botnets. With the emergence of machine learning, and especially deep learning, many works in the literature have developed learning models that are able to detect network anomalies. However, these models require massive amounts of labeled data for training and may not be able to detect unknown anomalous traffic or zero-day attacks. Unsupervised learning techniques such as autoencoders and their variants do not require labeled data, but their performance is still poor. Generative adversarial networks (GANs) have successfully demonstrated their capability of implicitly learning data distributions of arbitrarily complex dimensions. This motivates us to carry out an empirical study on the capability of GANs in network anomaly detection. We adopt two existing GAN models and develop new neural networks for their components, i.e., the generator and the discriminator. We carry out extensive experiments to evaluate the performance of GANs and compare them with existing unsupervised detection techniques. We use multiple datasets that include both realistic traffic captures (PCAP) and synthetic traffic generated by simulation platforms. We develop a traffic aggregation technique to extract statistical features that are useful for the models to learn traffic behaviors. The experimental results show that GANs outperform the existing techniques with a significant improvement in different performance metrics.