Organizations must decide how best to allocate their security resources. To do so, they need an accurate method for estimating how many new vulnerabilities will be reported, in a given time period, for the operating systems (OSs) and web browsers they use. Our approach clusters vulnerabilities by leveraging the text information within vulnerability records, and then simulates the mean value function of vulnerabilities while relaxing the monotonic intensity function assumption, which is prevalent among studies that use software reliability models (SRMs) and nonhomogeneous Poisson processes in modeling. We applied our approach to the vulnerabilities of four OSs (Windows, Mac, iOS, and Linux) and four web browsers (Internet Explorer, Safari, Firefox, and Chrome). Of the eight OSs and web browsers we analyzed using a power-law model drawn from a family of SRMs, the model was statistically adequate in six cases. For these six cases, in terms of estimation and forecasting capability, our results are more accurate than those of a power-law model without clustering in all cases but one.
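As a rough illustration of the power-law model mentioned above, the sketch below fits the standard power-law NHPP mean value function m(t) = a·t^b to a list of vulnerability report times using the textbook time-truncated maximum-likelihood estimates. The abstract does not state the estimation procedure or the clustering step, so the function names, the estimator, and the sample data are all assumptions made for illustration only.

```python
import math

def fit_power_law(event_times, T):
    """Fit m(t) = a * t**b to event times observed on (0, T].

    Uses the standard time-truncated MLE for a power-law NHPP
    (an assumption; the abstract does not specify the estimator).
    """
    n = len(event_times)
    log_sum = sum(math.log(T / t) for t in event_times)
    b = n / log_sum          # shape: b < 1 decreasing intensity, b > 1 increasing
    a = n / T ** b           # scale chosen so that m(T) equals the observed count
    return a, b

def mean_value(a, b, t):
    """Expected cumulative number of vulnerabilities reported by time t."""
    return a * t ** b

# Hypothetical report times (e.g., months since release), observed up to T = 8.
times = [1.0, 2.0, 4.0, 8.0]
a_hat, b_hat = fit_power_law(times, T=8.0)
forecast = mean_value(a_hat, b_hat, 12.0)  # extrapolated count at t = 12
```

A single power-law fit always has a monotonic intensity a·b·t^(b-1). Fitting one such model per cluster and summing the results is one plausible way to obtain a non-monotonic overall intensity, though whether the paper proceeds this way is not stated here.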
In this paper, we introduce an approach for predicting the cumulative number of software vulnerabilities that is, in most cases, more accurate than vulnerability discovery models (VDMs). Our approach uses a neural network model (NNM) to capture the nonlinearities associated with vulnerability disclosure. We compared the prediction capability of nine common VDMs with that of our approach. The models were applied to vulnerabilities associated with eight well-known software products (four operating systems and four web browsers) and were assessed in terms of prediction accuracy and prediction bias. For all eight software products we analyzed, the NNM outperformed the VDMs in terms of prediction accuracy, and it provided smaller values of absolute average bias in seven cases. This study shows that NNMs are promising for accurate prediction of software vulnerability disclosures.
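The abstract does not define its accuracy and bias measures. A minimal sketch, assuming the normalized average error and average bias often used in the VDM literature (the mean of the absolute or signed relative deviation of each prediction from the actual final count), might look as follows. The function names and sample numbers are hypothetical.

```python
def average_error(predicted, actual):
    """Mean absolute relative deviation of predictions from the actual count."""
    return sum(abs(p - actual) / actual for p in predicted) / len(predicted)

def average_bias(predicted, actual):
    """Mean signed relative deviation; negative means under-prediction."""
    return sum((p - actual) / actual for p in predicted) / len(predicted)

# Hypothetical predictions of the final cumulative count made at three
# successive points in time, against an actual final count of 100.
predictions = [90.0, 110.0, 100.0]
ae = average_error(predictions, 100.0)
ab = average_bias(predictions, 100.0)
```

Under these definitions a model can have a bias near zero while still having a sizable error, because over- and under-predictions cancel in the signed average. This is why the two measures are reported separately.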
Organizations must decide how best to allocate their security resources. To do so, they need an accurate method for estimating how many new vulnerabilities will be reported, in a given time period, for the operating systems (OSs) they use. Our approach clusters vulnerabilities by leveraging the text information within vulnerability records, and then simulates the mean value function of vulnerabilities while relaxing the monotonic intensity function assumption, which is prevalent among studies that use software reliability models (SRMs) and nonhomogeneous Poisson processes (NHPPs) in modeling. We applied our approach to the vulnerabilities of four OSs: Windows, Mac, iOS, and Linux. In terms of curve fitting and prediction capability, our results are more accurate than those of a power-law model without clustering, drawn from a family of SRMs, for all the OSs we analyzed.
Vulnerabilities with publicly known exploits typically form 2-7% of all vulnerabilities reported for a given software version. Because known exploited vulnerabilities are far fewer than vulnerabilities overall, it is more difficult to model and predict when a vulnerability with a known exploit will be reported. In this paper, we introduce an approach for predicting the discovery pattern of publicly known exploited vulnerabilities using all publicly known vulnerabilities reported for a given software product. Eight commonly used vulnerability discovery models (VDMs) and one neural network model (NNM) were used to evaluate the prediction capability of our approach. We compared their prediction results with the scenario in which only exploited vulnerabilities were used for prediction. Our results show that, in terms of prediction accuracy, our approach led to more accurate results for seven of the eight software products we analyzed. In the remaining case, the accuracy of our approach was worse by 1.6%.