CyBERT: Cybersecurity Claim Classification by Fine-Tuning the BERT Language Model

Ameri, Kimia; Hempel, Michael; Sharif, Hamid; López, Julio Pérez; Perumalla, Kalyan S.

doi:10.3390/jcp1040031

Cited by 31 publications

(25 citation statements)

References 32 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Among all mentioned language models, we chose BERT for our work because it is an open-source model with a very strong tokenizer and word-embedding matrix. In our previous work we fine-tune BERT with neural network to build cybersecurity claim sequence classifier CyBERT [8], and show that its design (BERT+NN) improves upon the performance obtainable from other language models such as GPT2 and ULMFiT. From Figure 1 it is readily apparent that due to its very early positioning within the overall processing workflow the overall accuracy of our vetting system is highly dependent on the accuracy obtained by the claims classifier.…”

Section: Related Workmentioning

confidence: 85%

“…For some NLP applications, however, a language model by itself is not sufficient for accomplishing a given downstream task, and it becomes necessary to expand the language model's overall architecture by stacking it with another form of neural network, for example using a convolutional neural network for language models targeting classification NLP tasks. For such application scenarios, the combination of the BERT language model and deep learning models such as recurrent neural networks or convolutional neural networks were shown to be effective in recent studies for capturing meaningful features from the available data [8,[38][39][40]. We utilize a similar approach for our ClaimsBERT classifier.…”

Section: Related Workmentioning

confidence: 99%

“…Using language models such as BERT for cybersecurity applications is a growing research area in recent years [8,[41][42][43][44][45][46][47][48][49]. Fine tuning language models such as BERT for cybersecurity domain tasks can provide many benefits to the cybersecurity community.…”

Section: Language Models In Cybersecuritymentioning

confidence: 99%

“…In another study, BERT was fine tuned on Android source code applications to identify and classify existing malware [41]. Fine-tuning BERT for classification tasks such as attack classification [48], cybersecurity claim classification [8], knowledge graph [51] and vulnerability classification [52]. ExBERT is another example of fine-tuning BERT for vulnerability exploitability prediction using sentence-level sentiment analysis [52].…”

Section: Language Models In Cybersecuritymentioning

confidence: 99%

“…Text from readable documents was extracted via the PyMuPDF python package, and we leveraged python's Pytesseract package for performing optical character recognition (OCR) with any scanned PDFs. With this approach, we managed to extract 2,160,517 sequences with 41,073,376 words across our curated dataset of ICS documents [8].…”

Section: Dataset Curationmentioning

confidence: 99%

See 4 more Smart Citations

An Accuracy-Maximization Approach for Claims Classifiers in Document Content Analytics for Cybersecurity

Ameri

Hempel

Sharif

et al. 2022

JCP

Self Cite

View full text Add to dashboard Cite

This paper presents our research approach and findings towards maximizing the accuracy of our classifier of feature claims for cybersecurity literature analytics, and introduces the resulting model ClaimsBERT. Its architecture, after extensive evaluations of different approaches, introduces a feature map concatenated with a Bidirectional Encoder Representation from Transformers (BERT) model. We discuss deployment of this new concept and the research insights that resulted in the selection of Convolution Neural Networks for its feature mapping aspects. We also present our results showing ClaimsBERT to outperform all other evaluated approaches. This new claims classifier represents an essential processing stage within our vetting framework aiming to improve the cybersecurity of industrial control systems (ICS). Furthermore, in order to maximize the accuracy of our new ClaimsBERT classifier, we propose an approach for optimal architecture selection and determination of optimized hyperparameters, in particular the best learning rate, number of convolutions, filter sizes, activation function, the number of dense layers, as well as the number of neurons and the drop-out rate for each layer. Fine-tuning these hyperparameters within our model led to an increase in classification accuracy from 76% obtained with BertForSequenceClassification’s original model to a 97% accuracy obtained with ClaimsBERT.

show abstract

Section: Related Workmentioning

confidence: 85%