Drive-by downloads have become the primary attack vehicle for malware distribution in recent years. With the rise of targeted attacks, vulnerabilities in cloud-based services and web-based collaboration frameworks may become the principal targets for hosting drive-by download attacks. In this paper, we study the similarity of shellcodes among different attack kits. Shellcode is the malicious code used as the payload in drive-by download attacks. Specifically, we collected 15 different drive-by download attack kits and identified the shellcodes used in each kit. As the shellcodes are transmitted to the browser as JavaScript strings, we measured the similarity between regular strings and shellcodes defined in JavaScript. We disassembled the shellcodes and computed the mean of the Cosine Similarity, Extended Jaccard Similarity and Pearson Correlation measures based on the frequencies of the opcodes. Our analysis shows that the shellcodes used as payloads across different attack kits were similar to other shellcodes and dissimilar to benign JavaScript strings. We observe that some attack kits released in different years used the same shellcodes. We compared the performance of the similarity analysis with that of an emulation-based approach and observed a 75% reduction in analysis time. Based on these results, the similarity measure of the shellcodes could be an effective static mechanism for detecting shellcode-based drive-by download attacks.
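The similarity score described above can be illustrated with a minimal Python sketch. It assumes the shellcodes have already been disassembled into lists of opcode mnemonics; the function names, the shared-vocabulary alignment, and the zero-variance handling are illustrative assumptions rather than details taken from the paper.

```python
import math
from collections import Counter


def frequency_vectors(ops_a, ops_b):
    """Turn two opcode lists into frequency vectors over a shared vocabulary."""
    count_a, count_b = Counter(ops_a), Counter(ops_b)
    vocab = sorted(set(count_a) | set(count_b))
    return [count_a[op] for op in vocab], [count_b[op] for op in vocab]


def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def extended_jaccard(x, y):
    # Extended (Tanimoto) Jaccard: x.y / (|x|^2 + |y|^2 - x.y)
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y) - dot
    return dot / denom if denom else 0.0


def pearson(x, y):
    n = len(x)
    if n == 0:
        return 0.0
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    # Assumption: treat an undefined correlation (zero variance) as 0.0.
    return cov / (std_x * std_y) if std_x and std_y else 0.0


def mean_similarity(ops_a, ops_b):
    """Mean of the three measures over opcode frequency vectors."""
    x, y = frequency_vectors(ops_a, ops_b)
    return (cosine(x, y) + extended_jaccard(x, y) + pearson(x, y)) / 3.0
```

Under these assumptions, two shellcodes with identical opcode frequencies score close to 1, while sequences with no opcodes in common score at or below 0, because the Pearson term can be negative.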
Malware distribution using drive-by download attacks has become the most prominent threat to organizations and individuals. Compromised web services and web applications hosted on the cloud act as the delivery medium for the exploits. These exploits often target vulnerabilities in web browser plugins. Implementing security controls within the browser to counter these exploits and ensure endpoint security has become a challenge. In this paper, a set of features is proposed and extracted by monitoring the communications between the browser and the plugins during the rendering of webpages. Support Vector Machines are trained using the defined features, and the performance of the trained classifier is evaluated using a dataset with both malicious and benign use cases of the plugins. The dataset included 10,239 malicious use cases and 37,369 benign use cases. To compensate for the imbalance in the distribution of the dataset, experiments were performed using weighted costs and oversampling. Our analysis shows that the Support Vector Machines trained using the proposed set of features classified with an average accuracy of about 99.4%. On integrating the proposed approach as an inline defense, an average performance overhead of 5.14% was observed.
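The two imbalance remedies mentioned above can be sketched as follows. The abstract does not say which weighting scheme or oversampling method was used, so the inverse-frequency weights and the random duplication of minority samples below are assumptions chosen for illustration, and the function names are hypothetical. Only the class counts come from the text.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: a common heuristic, not necessarily the paper's cost scheme.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_samples.extend(items + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels


# Class counts reported in the abstract; the sample values are placeholders.
labels = ["malicious"] * 10239 + ["benign"] * 37369
samples = list(range(len(labels)))
weights = balanced_class_weights(labels)
balanced_samples, balanced_labels = random_oversample(samples, labels)
```

With weighted costs, misclassifying a malicious use case is penalized more heavily than misclassifying a benign one; with oversampling, the classifier is instead trained on an equal number of examples per class. Either result would then be passed to the SVM training step, which is not shown here.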