The current state of cyberspace portends uncertainty for the future Internet and its rapidly growing number of users. New paradigms add further concerns: big data collected through device sensors divulges large amounts of information that can be used for targeted attacks. Although a plethora of extant approaches, models, and algorithms have provided a basis for cyberattack prediction, there is a need to consider new models and algorithms that are based on learned data representations rather than task-specific techniques. Deep learning, which is underpinned by representation learning, has found widespread relevance in computer vision, speech recognition, natural language processing, audio recognition, and drug design. Its non-linear information processing architecture can also be adapted to learn the different data representations of network traffic and so classify benign and malicious network packets. In this paper, we model cyberattack prediction as a classification problem. The deep learning architecture is co-opted into a new model that uses rectified linear units (ReLU) as the activation function in the hidden layers of a deep feedforward neural network. Our approach achieves a greedy layer-by-layer learning process that best represents the features useful for predicting cyberattacks in a dataset of benign and malicious traffic. The underlying algorithm of the model also performs feature selection, dimensionality reduction, and clustering at the initial stage to generate a set of input vectors called hyper-features. The model is evaluated on the CICIDS2017 and UNSW_NB15 datasets in a Python test-bed environment. Experimental results show that our model demonstrates superior performance over similar models.
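To make the architecture described above more concrete, the following is a minimal sketch of a forward pass through a deep feedforward network with ReLU hidden layers and a single sigmoid output for a benign/malicious decision. It is not the authors' model: the layer sizes, the random weight initialisation, the four-element input vector, and all function names are assumptions chosen for illustration, and the greedy layer-wise training and hyper-feature generation steps are not shown.

```python
import math
import random


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, bias):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, bias)
    ]


def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Hypothetical initialisation: small uniform weights, zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def forward(inputs, hidden_layers, output_layer):
    """ReLU in every hidden layer, sigmoid on the single output unit."""
    activations = inputs
    for weights, bias in hidden_layers:
        activations = relu(dense(activations, weights, bias))
    out_weights, out_bias = output_layer
    return sigmoid(dense(activations, out_weights, out_bias)[0])


# Assumed toy setup: 4 input features, two hidden layers of 8 units each.
rng = random.Random(0)
hidden = [init_layer(4, 8, rng), init_layer(8, 8, rng)]
output = init_layer(8, 1, rng)

score = forward([0.2, 0.0, 1.0, 0.5], hidden, output)
label = "malicious" if score >= 0.5 else "benign"
```

With untrained random weights the score carries no meaning; the sketch only shows how ReLU hidden layers feed a binary classification output.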
The expanding threat landscape has brought a plethora of consequences for most organizations and individuals. This is witnessed in the high volume of cyber-attacks prevalent in cyberspace. Though
Phishing attacks remain rampant and show no sign of stopping. According to Santander Bank Customer Service, reports of phishing attacks have doubled each year since 2001. This work focuses on identifying phishing Uniform Resource Locators (URLs). It aims to prevent phishing attacks by detecting phishing URLs using eight distinctive features extracted from the URLs. The sample comprises 96,018 URLs. Four supervised machine learning algorithms (Naive Bayes Classifier, Support Vector Machine, Decision Tree, and Random Forest) were used to train models and determine which performs best. Based on the analysis and evaluation, Random Forest performs best, with an accuracy of 84.57% on the validation data set. The uniqueness of this work lies in the choice of features considered for the implementation.
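As an illustration of what extracting features from a URL can look like, here is a minimal sketch using only the Python standard library. The abstract does not list the eight features the authors selected, so the eight below (and the function name `extract_url_features`) are hypothetical examples of commonly used lexical URL features, not the features from the work itself.

```python
import re
from urllib.parse import urlparse


def extract_url_features(url):
    """Return eight illustrative lexical features for one URL.

    These features are assumptions for demonstration only; they are not
    the feature set used in the work described above.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        # Host written as a dotted IPv4 address instead of a domain name.
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "uses_https": parsed.scheme == "https",
        # Number of non-empty path segments.
        "path_depth": len([part for part in parsed.path.split("/") if part]),
    }


features = extract_url_features("http://192.168.0.1/login/verify")
```

Feature dictionaries like this could then be converted to numeric vectors and passed to any of the four classifiers named above.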