Abstract: Classifying requirements is crucial for automatically handling natural language requirements. The performance of existing automatic classification approaches diminishes when applied to unseen projects because requirements usually vary in wording and style. The main problem is poor generalization. We propose NoRBERT, which fine-tunes BERT, a language model that has proven useful for transfer learning. We apply our approach to different tasks in the domain of requirements classification. We achieve similar or better…
“…Despite the abundance of such studies involving SRS documents, which showcase the better performance of deep learning architectures over the traditional classifiers, only a few studies consider transfer learning-based approaches in the software requirement domain. Recently, Hey, Keim, Koziolek, and Tichy [25] studied the problem of classifying functional and nonfunctional SRS documents using transfer learning-based approaches such as NoRBERT. They concluded that NoRBERT improves the prediction performance considerably.…”
In a software development life cycle, software requirements specifications (SRS) written in incomprehensible language might hinder the success of the project in later stages. In such cases, the subjective and ambiguous nature of natural language can be considered a cause of the failure of the final product. Redundant and/or contradictory information in SRS documents might also result in additional costs and time loss, reducing the overall efficiency of the project. With the recent advances in machine learning, there is an increased effort to develop automated solutions for seamless SRS design. However, most vanilla machine learning approaches ignore the semantics of software artifacts and do not integrate domain-specific knowledge into the underlying natural language processing tasks, and therefore tend to generate inaccurate results. With such concerns in mind, we consider a transfer learning approach in our study, based on an existing pre-trained language model called DistilBERT. We specifically examine DistilBERT's ability in multi-class text classification on SRS data using various fine-tuning methods, and compare its performance with other deep learning methods such as LSTM and BiLSTM. We test the performance of these models on two datasets: the DOORS Next Generation dataset and the PROMISE-NFR dataset. Our numerical results demonstrate that DistilBERT performs well on various text classification tasks over the SRS datasets and shows significant promise for automating software development processes.
“…In [20], the authors experimented with different classifications: Classifying SRs into FRs or NFRs, classifying NFRs into different categories (usability, security, operational, and performance), and classifying FRs into the categories of functions, data, and behavior. The model was a tuned (BERT) model named "NoRBERT".…”
Section: Complete System (Classifying NFRs and FRs Into Multi-classes)
“…Finally, padding was used to unify the length of the sentences, since they had different numbers of words. This was carried out by finding the maximum length of the sentence and then adding zeros to the end of the sequence of tokens for any sentence that was shorter than the maximum length specified according to the input [20]. Algorithm 1 summarizes the preprocessing and Figure 3 gives an example of preprocessing of one SR from the dataset.…”
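The padding step described in this snippet can be sketched in plain Python. The zero pad value and the "pad to the maximum sentence length" rule come from the description above; the function name and the toy token IDs are illustrative, not taken from [20]:

```python
def pad_sequences(tokenized, pad_value=0):
    """Pad every token sequence with trailing zeros so that all
    sequences match the length of the longest sentence."""
    max_len = max(len(seq) for seq in tokenized)
    return [seq + [pad_value] * (max_len - len(seq)) for seq in tokenized]

# Example: three tokenized requirements of different lengths
batch = [[12, 7, 3], [5, 9], [1, 2, 3, 4]]
padded = pad_sequences(batch)
# every sequence now has length 4; shorter ones end in zeros
```

Padding at the end (rather than the beginning) matches the description of appending zeros to the token sequence; deep learning frameworks typically offer both options.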
Section: Data Preprocessing
“…However, in [19], the authors started by binary classification of SRs, recording a 0.94 F1 score. Then, NFRs were classified using a smaller number of classes, as used in [20], achieving a 0.91 F1 score, but NFRs were also classified into security-related or non-security-related, with a 0.77 F1 score. In [17], the authors classified NFRs into 11 classes, scoring a 76% accuracy.…”
Section: Comparative Analysis
“…In [17], the authors classified NFRs into 11 classes, scoring a 76% accuracy. Only [20] provided a complete classification system. SRs were classified into FRs or NFRs with an average F1 score of 91.5%.…”
Recently, deep learning (DL) has been utilized successfully in different fields, achieving remarkable results. Thus, there is a noticeable focus on DL approaches to automate software engineering (SE) tasks such as maintenance, requirement extraction, and classification. An advanced utilization of DL is the ensemble approach, which aims to reduce error rates and learning time and to improve performance. In this research, three ensemble approaches were applied: accuracy as a weight ensemble, mean ensemble, and accuracy per class as a weight ensemble, with a combination of four different DL models: long short-term memory (LSTM), bidirectional long short-term memory (BiLSTM), a gated recurrent unit (GRU), and a convolutional neural network (CNN). These were used for software requirement (SR) classification: the binary classification of SRs into functional requirements (FRs) or non-functional requirements (NFRs), and the multi-label classification of both FRs and NFRs into further experimental classes. The models were trained and tested on the PROMISE dataset. A one-phase classification system was developed to classify SRs directly into one of the 17 multi-classes of FRs and NFRs. In addition, a two-phase classification system was developed to classify SRs first into FRs or NFRs and then pass the output to a second phase of multi-class classification into the 17 classes. The experimental results demonstrated that the proposed classification systems achieve competitive performance compared to state-of-the-art methods. The two-phase classification system proved more robust than the one-phase system, obtaining 95.7% accuracy in the binary classification phase and 93.4% accuracy in the second phase of NFR and FR multi-class classification.
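The routing logic of the two-phase system described above can be sketched as follows. The stand-in classifiers and class names here are illustrative placeholders, not the trained ensemble models from the paper:

```python
# Two-phase scheme: phase one decides FR vs. NFR, phase two routes the
# requirement to the matching multi-class model.

def two_phase_classify(requirement, binary_clf, fr_clf, nfr_clf):
    """Return the fine-grained class of a requirement sentence."""
    kind = binary_clf(requirement)      # phase one: 'FR' or 'NFR'
    if kind == 'FR':
        return fr_clf(requirement)      # phase two, FR branch
    return nfr_clf(requirement)         # phase two, NFR branch

# Toy stand-ins for the trained models (hypothetical heuristics)
binary = lambda r: 'NFR' if 'shall be' in r else 'FR'
fr_model = lambda r: 'function'
nfr_model = lambda r: 'security'

print(two_phase_classify('The system shall be available 99% of the time.',
                         binary, fr_model, nfr_model))  # prints security
```

Splitting the decision this way lets each phase-two model specialize on a smaller label set, which is one plausible reason the two-phase system outperformed the direct 17-class classifier.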
The success of software depends upon functional and non-functional requirements, as both are equally important in software development. However, the requirements engineering community still lacks a comprehensive understanding of functional and non-functional requirements. In addition, the requirements in software documents are expressed in natural language and are intertwined with each other. Requirements classification is a crucial task that correctly extracts functional and non-functional requirements and organizes them into specified categories. Automated classification of software requirements leads to reduced ambiguity, misunderstanding, and development cost. Most recent studies have used machine learning and deep learning techniques for automatic classification of requirements. However, such techniques have one drawback: poor generalization. To address this problem, this research work proposes a self-attention-based bidirectional LSTM deep model. This automated approach uses a recurrent neural network, which handles long sequential natural language requirement statements and classifies them into five classes: capability, maintainability, performance, security, and usability. The proposed approach is trained and evaluated on a pre-labeled dataset comprising 34 industrial requirements specifications and the PROMISE dataset. On this dataset, the proposed approach yields 95% precision, 96% recall, 96% F-measure, and 96% accuracy. The proposed approach can be applied to a wide variety of datasets from different domains. Furthermore, this paper applies pre-processing techniques to improve the performance of the requirements classification model. The results of the proposed model are compared with existing baseline state-of-the-art techniques, and it is shown that the proposed model outperforms the baseline models in requirements classification.
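The attention pooling at the heart of a self-attention BiLSTM classifier can be illustrated with a minimal sketch: per-token scores are softmax-normalized and used to weight the hidden state vectors. The hidden states and scores below are toy values standing in for a trained BiLSTM and scoring layer:

```python
import math

def attention_pool(hidden_states, scores):
    """Softmax the per-token scores, then return the weighted sum of
    the hidden state vectors as a single sentence representation."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    return [sum(w * h[i] for w, h in zip(weights, hidden_states))
            for i in range(dim)]

# Toy BiLSTM outputs for a two-token sentence, with attention scores
# from a hypothetical scoring layer
states = [[1.0, 0.0], [3.0, 2.0]]
sentence_vec = attention_pool(states, scores=[0.0, 0.0])
```

With equal scores the result is just the average of the hidden states; in a trained model, the scores let the classifier emphasize the tokens most indicative of a class such as security or usability.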
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.