On the darknet, hackers constantly share information with and learn from one another. Their conversations, for example in online forums, can contain data that assists in the discovery of cyber threat intelligence. Cyber Threat Intelligence (CTI) is information or knowledge about threats that can help prevent security breaches in cyberspace. However, monitoring and analyzing this data manually is challenging because forum posts and other darknet data are high in volume and unstructured. This paper applies descriptive analytics and predictive analytics using machine learning to a dataset of darknet forum posts to discover valuable cyber threat intelligence. IBM Watson Analytics and the WEKA machine learning tool were used. Watson Analytics showed trends and relationships in the data. WEKA provided machine learning models to classify the type of exploits targeted by hackers from the forum posts. The results showed that crypters, password crackers, RATs (Remote Administration Tools), buffer overflow exploit tools, and keylogger system exploit tools were the most common on the darknet, and that there are influential authors who post frequently in the forums. In addition, machine learning helps build classifiers for exploit types. The Random Forest classifier provided higher accuracy than the Random Tree and Naïve Bayes classifiers. Therefore, analyzing darknet forum posts can provide actionable information, and machine learning is effective in building classifiers for predicting exploit types. Predicting exploit types and knowing the patterns and trends in hackers' plans helps defend cyberspace proactively.
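To make the classification step concrete, the following is a minimal sketch of a multinomial Naïve Bayes text classifier, one of the three classifier families named above. It is not the WEKA workflow used in the paper: it is a pure-Python stand-in, and the posts, labels, and function names (`TRAIN`, `train`, `predict`) are invented here for illustration only.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy posts; text and labels are invented, not taken from the paper's dataset.
TRAIN = [
    ("selling fud crypter bypass antivirus", "crypter"),
    ("new crypter stub undetected by scanners", "crypter"),
    ("keylogger captures keystrokes and sends logs", "keylogger"),
    ("silent keylogger with log upload", "keylogger"),
    ("rat with remote desktop and webcam control", "rat"),
    ("remote administration tool rat builder", "rat"),
]


def train(examples):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in text.split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior under Laplace smoothing."""
    class_counts, word_counts, vocab = model
    total_posts = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total_posts)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(predict(model, "keylogger logs keystrokes"))
```

A real study would need far more data, a held-out test set, and a comparison against tree-based classifiers such as Random Forest before drawing any conclusion about accuracy.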
An analysis of the process and human cognitive model of deception detection (DD) shows that DD is infused with uncertainty, especially in high-stakes situations. There is a recent trend toward automating DD in computer-mediated communication. However, extant approaches to automatic DD overlook the importance of representation and reasoning under uncertainty in DD. They represent uncertain cues as crisp values and can only infer whether deception occurs, but not to what extent deception occurs. Based on uncertainty theories and the analyses of uncertainty in DD, we propose a model to represent cues and to reason for DD under uncertainty, and address the uncertainty due to imprecision and vagueness in DD using fuzzy sets and fuzzy logic. Neuro-fuzzy models were developed to discover knowledge for DD. The evaluation results on five data sets showed that the neuro-fuzzy method not only was a good alternative to traditional machine-learning techniques but also offered superior interpretability and reliability. Moreover, the gains of neuro-fuzzy systems over traditional systems became larger as the level of uncertainty associated with DD increased. The findings of this paper have theoretical, methodological, and practical implications for DD and fuzzy systems research.
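The contrast between crisp and fuzzy cue representation can be illustrated with a small sketch. It shows only a fuzzy membership function and a single fuzzy rule, not the neuro-fuzzy models developed in the paper. The cue names (`hedging_rate`, `pause_rate`), the thresholds, and the rule itself are hypothetical choices made for this example.

```python
def ramp_up(x, low, high):
    """Fuzzy membership in 'high': 0 at or below `low`, 1 at or above `high`, linear in between."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def crisp_high(x, threshold):
    """Crisp alternative: a cue is either 'high' (1.0) or not (0.0)."""
    return 1.0 if x >= threshold else 0.0


def deception_degree(hedging_rate, pause_rate):
    """Hypothetical rule: IF hedging is high AND pausing is high THEN deception.

    The fuzzy AND is taken as the minimum of the two membership degrees,
    so the result is a degree in [0, 1] rather than a yes/no answer.
    """
    high_hedging = ramp_up(hedging_rate, 0.02, 0.10)  # assumed thresholds
    high_pause = ramp_up(pause_rate, 0.05, 0.25)      # assumed thresholds
    return min(high_hedging, high_pause)


# A message with moderate hedging and frequent pauses.
print(crisp_high(0.06, 0.10))        # the crisp view treats the hedging cue as absent
print(deception_degree(0.06, 0.25))  # the fuzzy view yields a partial degree
```

With a crisp threshold, a hedging rate just below the cutoff contributes nothing, whereas the fuzzy rule reports a partial degree that reflects how close the cue is to "high".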