The correct classification of requirements has become an essential task within software engineering. This study compares text feature extraction techniques and machine learning algorithms on the problem of requirements classification, to answer two major questions: “Which works best (Bag of Words (BoW) vs. Term Frequency–Inverse Document Frequency (TF-IDF) vs. Chi Squared (CHI2)) for classifying Software Requirements into Functional Requirements (FR) and Non-Functional Requirements (NF), and the sub-classes of Non-Functional Requirements?” and “Which Machine Learning Algorithm provides the best performance for the requirements classification task?”. The data used in the research was PROMISE_exp, a recently created dataset that expands the well-known PROMISE repository of labeled software requirements. All documents in the dataset were cleaned with a set of normalization steps; the two feature extraction techniques used were BoW and TF-IDF, and the feature selection technique was CHI2. The algorithms used for classification were Logistic Regression (LR), Support Vector Machine (SVM), Multinomial Naive Bayes (MNB) and k-Nearest Neighbors (kNN). The novelty of our work lies in the data used to perform the experiment, the detailed steps provided to reproduce the classification, and the comparison between BoW, TF-IDF and CHI2 on this repository, which has not been covered by other studies. This work will serve as a reference for the software engineering community and will help other researchers to understand the requirement classification process. We found that TF-IDF followed by LR gave the best classification results, with an F-measure of 0.91 in binary classification (tying with SVM in that case), 0.74 in NF classification and 0.78 in general classification.
As future work, we intend to compare more algorithms and explore new ways to improve the precision of our models.
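To make the difference between BoW and TF-IDF concrete, the following is a minimal sketch using only the Python standard library. It is not the study's pipeline: the three toy requirement sentences are hypothetical (not drawn from PROMISE_exp), the whitespace tokenizer stands in for the normalization steps, and the weighting idf = ln(N / df) is one common textbook variant, assumed here because the abstract does not state the formula used. Libraries often apply smoothed variants instead.

```python
import math
from collections import Counter

# Hypothetical toy requirements for illustration; not taken from PROMISE_exp.
REQUIREMENTS = [
    "the system shall export reports",
    "the system shall respond within two seconds",
    "the user shall export invoices",
]


def bag_of_words(doc):
    """Return raw term counts for one document (naive whitespace tokenizer)."""
    return Counter(doc.split())


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    counts = [bag_of_words(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for c in counts for term in c)
    weights = []
    for c in counts:
        total = sum(c.values())
        weights.append(
            {t: (n / total) * math.log(n_docs / df[t]) for t, n in c.items()}
        )
    return weights


bow = [bag_of_words(d) for d in REQUIREMENTS]
weights = tf_idf(REQUIREMENTS)

# Print the terms of the first requirement, highest TF-IDF weight first.
for term, weight in sorted(weights[0].items(), key=lambda kv: -kv[1]):
    print(f"{term:10s} bow={bow[0][term]} tfidf={weight:.3f}")
```

Under BoW every term of the first sentence has the same count, whereas this TF-IDF variant gives zero weight to terms that occur in every document (such as "shall") and the highest weight to terms unique to one document (such as "reports").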
The adoption of cloud computing solutions is an established reality in government agencies and in small, medium, and large companies, owing to the ease of procurement and the variety of available services, as well as the low cost compared to acquiring and managing in-house infrastructure. Among the most used services is cloud file storage, and the security of this storage, particularly customer data integrity, has been an essential subject of recent research. This article therefore proposes a solution for monitoring the integrity of files stored in the cloud, based on smart contracts in blockchain networks, symmetric encryption, and computational trust. The proposed solution consists of a protocol that provides confidentiality, decentralization, audit availability, and secure sharing of file integrity monitoring results without overloading the services involved, together with a full reference implementation that was used to validate the proposal. The results obtained during the validation tests showed that the solution is feasible and faultless in detecting corrupted files. These tests also confirmed that sharing integrity monitoring results, coupled with the application of computational trust techniques, significantly increased the efficiency of the proposed solution.
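The core idea of integrity monitoring can be illustrated with a minimal sketch. It is not the article's protocol: the `IntegrityLedger` class is a hypothetical in-memory, hash-chained log that merely stands in for a smart contract, the file name and contents are made up, and the encryption and computational trust components are omitted entirely.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


class IntegrityLedger:
    """Hypothetical append-only log standing in for a smart contract.

    Each entry stores a file digest plus a digest chained to the previous
    entry, so that rewriting an old entry would change every later link.
    """

    def __init__(self):
        self.entries = []  # tuples of (file_id, file_digest, chained_digest)

    def register(self, file_id: str, data: bytes) -> None:
        """Record the digest of a file at upload time."""
        prev = self.entries[-1][2] if self.entries else ""
        file_digest = digest(data)
        chained = digest((prev + file_id + file_digest).encode())
        self.entries.append((file_id, file_digest, chained))

    def check(self, file_id: str, data: bytes) -> bool:
        """Return True if data matches the latest registered digest."""
        for fid, file_digest, _ in reversed(self.entries):
            if fid == file_id:
                return digest(data) == file_digest
        raise KeyError(file_id)


ledger = IntegrityLedger()
ledger.register("report.pdf", b"original contents")

# Later audit: fetch the stored file again and compare digests.
intact = ledger.check("report.pdf", b"original contents")
corrupted = ledger.check("report.pdf", b"original c0ntents")
print(intact, corrupted)
```

Storing only digests in the ledger keeps file contents out of the shared record, which is one reason digest comparison is a common building block for this kind of audit.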