Software security is a key concern for software development organizations that wish to provide high-quality, dependable software to their consumers. A crucial part of software security is the early detection of software vulnerabilities. Vulnerability prediction is a mechanism that facilitates the identification (and, in turn, the mitigation) of vulnerabilities early in the software development cycle. The scientific community has recently focused considerable attention on developing Deep Learning models that use text mining techniques to predict the existence of vulnerabilities in software components. However, there are also studies that examine whether statically extracted software metrics can lead to adequate Vulnerability Prediction Models. In this paper, both software metrics-based and text mining-based Vulnerability Prediction Models are constructed and compared. A combination of software metrics and text tokens using deep-learning models is also examined, in order to investigate whether a combined model can lead to more accurate vulnerability prediction. For the purposes of the present study, a vulnerability dataset containing vulnerabilities from real-world software products is utilized and extended. The results of our analysis indicate that text mining-based models outperform software metrics-based models with respect to their F2-score, whereas enriching the text mining-based models with software metrics was not found to provide any added value to their predictive performance.
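For readers unfamiliar with the F2-score used as the comparison measure above, the following is a minimal sketch of the standard F-beta definition, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), with beta = 2 so that recall weighs more than precision. The function name `f_beta` and the confusion-matrix counts are hypothetical illustrations and are not taken from the study.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta score from confusion-matrix counts (beta=2 gives the F2-score).

    tp: vulnerable components correctly flagged
    fp: clean components wrongly flagged
    fn: vulnerable components that were missed
    """
    if tp == 0:
        # No true positives: precision and recall are zero or undefined,
        # so the score is reported as 0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show how F1 and F2 differ.
    tp, fp, fn = 8, 4, 2
    print("F1:", round(f_beta(tp, fp, fn, beta=1.0), 4))
    print("F2:", round(f_beta(tp, fp, fn, beta=2.0), 4))
```

Because a missed vulnerability is usually costlier than a false alarm, weighting recall more heavily in this way is a common choice when evaluating vulnerability prediction models.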
The COVID-19 outbreak, also known as the coronavirus pandemic, has left its mark on every aspect of our lives and, at the time of this writing, is still an ongoing battle. Beyond the immediate worldwide health response, the pandemic has triggered a significant number of IT initiatives to track, visualize, analyze and potentially mitigate the phenomenon. For individuals or organizations interested in developing COVID-19 related software, knowledge-sharing communities such as Stack Overflow proved to be an effective source of information for tackling commonly encountered problems. As an additional contribution to the investigation of this unprecedented health crisis, and to assess how fast and how well the community of developers has responded, we performed a study on COVID-19 related posts in Stack Overflow. In particular, we profiled relevant questions based on key post features and their evolution, identified the most prominent technologies adopted for developing COVID-19 software and their interrelations, and focused on the most persistent problems faced by developers. For the analysis of posts we employed descriptive statistics, Association Rule Graphs, Survival Analysis and Latent Dirichlet Allocation. The results reveal that the response of the developers’ community to the pandemic was immediate and that the interest of developers in COVID-19 related challenges was sustained after its initial peak. In terms of the problems addressed, the results show a clear focus on COVID-19 data collection, analysis and visualization from/to the web, in line with the general needs for monitoring the pandemic.
Smart Contracts (SCs) are computer programs that run on blockchains and can be executed automatically, in a deterministic way, when pre-determined conditions are met. Currently, Ethereum is the biggest blockchain network, with more than 200,000 SCs deployed every month. The main mechanism for financially managing and securing such networks is “Gas Consumption”. In particular, a gas cost is assigned to each operation that alters the blockchain state, based on the SC size and complexity. Thus, the cost that an SC incurs to its owner and users is related to the internal structure of the SC. Considering that the average cost of deploying a Smart Contract can reach thousands of euros, it becomes obvious that the internal quality of SCs is of great importance. To this end, in this article we present a comprehensive analysis of the correlation of a set of code metrics (e.g., size, complexity) with the actual gas required to deploy Smart Contracts. The empirical evidence that we provide relies on the analysis of over 90,000 SCs. In addition to the produced empirical knowledge, which in most cases validates the theoretical expectation, we have implemented a web-based application (Smart Contracts Quality Analysis Platform, SCQAP) that visualizes the findings, enables the on-demand creation of correlation diagrams, and offers access to a public repository of our data (metrics and deployment gas consumption) via a REST API. To the best of our knowledge this is the biggest empirical study on SCs, which: (a) sets the scene for further large-scale studies on Smart Contracts (through tooling and a public dataset); and (b) provides guidance to software practitioners on parameters that can inflate deployment costs.
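To make the kind of metric-versus-gas correlation described above concrete, here is a minimal sketch in plain Python. The abstract does not state which correlation coefficient the authors used, so a Spearman rank correlation is assumed here purely for illustration. The function names and the sample values are hypothetical and are not drawn from the study's dataset or from the SCQAP REST API.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    # Hypothetical sample: lines of code and deployment gas for five contracts.
    lines_of_code = [40, 120, 75, 300, 210]
    deployment_gas = [310_000, 890_000, 620_000, 2_400_000, 1_500_000]
    print("Spearman rho:", spearman(lines_of_code, deployment_gas))
```

A rank-based coefficient is a reasonable default when the relationship between a code metric and gas is expected to be monotonic but not necessarily linear; a Pearson coefficient on the raw values would be the alternative if linearity were assumed.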