The large volume of malware variants appearing on the Internet poses a severe threat to its security. In this work, we explore techniques that can automatically classify malware variants into their corresponding families. We present a generic framework that extracts structural information from malware programs as attributed function call graphs, in which rich malware features are encoded as attributes at the function level. Our framework further learns discriminant malware distance metrics that evaluate the similarity between the attributed function call graphs of two malware programs. To combine various types of malware attributes, our method adaptively learns the confidence level associated with the classification capability of each attribute type and then adopts an ensemble of classifiers for automated malware classification. We evaluate our approach on a number of Windows-based malware instances belonging to 11 families, and experimental results show that our automated malware classification method achieves high classification accuracy.
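The idea of weighting each attribute type by a learned confidence level can be illustrated with a minimal sketch. The abstract does not specify how the confidences are learned or how the ensemble combines its members, so the example below assumes simple confidence-weighted voting; the attribute type names, family labels, and weights are hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, confidences):
    """Combine per-attribute-type predictions by confidence-weighted voting.

    predictions: dict mapping attribute type -> predicted family label
    confidences: dict mapping attribute type -> non-negative weight
    (Assumed combination rule; not taken from the paper.)
    """
    scores = defaultdict(float)
    for attr_type, family in predictions.items():
        # Each classifier adds its confidence to the family it voted for.
        scores[family] += confidences.get(attr_type, 0.0)
    # The family with the largest accumulated confidence wins.
    return max(scores, key=scores.get)


# Hypothetical attribute types, labels, and weights, for illustration only.
predictions = {"opcodes": "FamilyA", "api_calls": "FamilyB", "strings": "FamilyB"}
confidences = {"opcodes": 0.9, "api_calls": 0.5, "strings": 0.3}
result = weighted_vote(predictions, confidences)
```

Here a single high-confidence classifier (0.9) outweighs two lower-confidence ones (0.5 + 0.3), which is the behavior an unweighted majority vote would not give.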
Photo privacy is an important problem in the digital age, where photos are commonly shared on social networking sites and mobile devices. The main challenge in photo privacy detection is how to generate discriminant features that accurately detect privacy-at-risk photos. Existing photo privacy detection works, which rely on low-level vision features, do not inform users about what private information is leaked from their photos. In this paper, we propose a new framework called Privacy-CNH that utilizes hierarchical features, including both object and convolutional features, in a deep learning model to detect privacy-at-risk photos. The generation of object features enables our model to better inform users about why a photo poses a privacy risk. The combination of convolutional and object features provides a richer model for understanding photo privacy from different aspects, thus improving photo privacy detection accuracy. Experimental results demonstrate that the proposed model outperforms the state-of-the-art work and the standard convolutional neural network (CNN) with low-level features on photo privacy detection tasks.
Recent years have witnessed a rapid adoption of mobile devices and a dramatic proliferation of mobile applications (Apps for brevity). However, the large number of mobile Apps makes it difficult for users to locate relevant Apps. Therefore, recommending Apps becomes an urgent task. Traditional recommendation approaches focus on learning the interest of a user and the functionality of an item (e.g., an App) from a set of user-item ratings, and they recommend an item to a user if the item's functionality matches the user's interest well. However, Apps could have privileges to access a user's sensitive resources (e.g., contacts, messages, and location). As a result, a user chooses an App not only because of its functionality, but also because it respects the user's privacy preference. To the best of our knowledge, this paper presents the first systematic study on incorporating both interest-functionality interactions and users' privacy preferences to perform personalized App recommendations. Specifically, we first construct a new model to capture the trade-off between functionality and user privacy preference. We then crawl a real-world dataset (16,344 users, 6,157 Apps, and 263,054 ratings) from Google Play and use it to comprehensively evaluate our model and previous methods. We find that our method consistently and substantially outperforms the state-of-the-art approaches, which implies the importance of user privacy preference in personalized App recommendations. Moreover, we explore the impact of different levels of privacy information on the performance of our method, which gives us insights into which resources are more likely to be treated as private by users and to influence users' behavior when selecting Apps.
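The trade-off between functionality match and privacy preference can be made concrete with a small sketch. The abstract does not give the form of the model, so the example assumes a simple linear score: an interest-functionality match minus a weighted privacy cost. The function names, the `trade_off` parameter, and all vectors below are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences of numbers."""
    return sum(a * b for a, b in zip(u, v))


def app_score(user_interest, app_function, user_privacy, app_permissions,
              trade_off=1.0):
    """Score an App for a user (assumed linear form, not the paper's model).

    user_interest / app_function: latent interest and functionality vectors
    user_privacy: how sensitive the user is about each resource
    app_permissions: 1 if the App requests that resource, else 0
    trade_off: weight of the privacy cost relative to the functionality match
    """
    match = dot(user_interest, app_function)
    privacy_cost = dot(user_privacy, app_permissions)
    return match - trade_off * privacy_cost


# Hypothetical user and Apps; resources are (contacts, messages, location).
user_interest = [0.8, 0.2]
user_privacy = [0.5, 0.1, 0.9]
apps = {
    "A": {"function": [1.0, 0.0], "permissions": [1, 1, 0]},
    "B": {"function": [0.9, 0.1], "permissions": [0, 0, 0]},
}
scores = {
    name: app_score(user_interest, a["function"], user_privacy, a["permissions"])
    for name, a in apps.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy setting App A matches the user's interest slightly better, but App B ranks first because it requests none of the resources the user considers private.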
String extraction and matching techniques have been widely used in generating signatures for worm detection, but how to generate effective worm signatures in an adversarial environment remains a challenging problem. For example, attackers can freely manipulate byte distributions within the attack payloads and thus inject well-crafted noisy packets to contaminate the suspicious flow pool. To address these attacks, we propose SAS, a novel Semantics Aware Statistical algorithm for automatic signature generation. When SAS processes packets in a suspicious flow pool, it uses data flow analysis techniques to remove non-critical bytes. We then apply a hidden Markov model (HMM) to the refined data to generate state-transition-graph-based signatures. To the best of our knowledge, this is the first work combining semantic analysis with statistical analysis to automatically generate worm signatures. Our experiments show that the proposed technique can accurately detect worms with concise signatures. Moreover, our results indicate that SAS is more robust to byte distribution changes and noise injection attacks than Polygraph and Hamsa.
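To illustrate how an HMM can act as a state-transition-graph signature, the sketch below scores a byte sequence with the standard forward algorithm. It is not the SAS implementation: the abstract does not describe the model's states, training, or matching rule, so the two-state model and its probabilities are hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs) under a discrete HMM using the forward algorithm.

    obs:   sequence of observed symbols (here, byte values)
    start: start[s] is the probability of starting in state s
    trans: trans[p][s] is the probability of moving from state p to state s
    emit:  emit[s] maps a symbol to its emission probability in state s
    """
    if len(obs) == 0:
        return 1.0  # the empty sequence has probability 1 by convention
    n_states = len(start)
    # Initialise with the first symbol.
    alpha = [start[s] * emit[s].get(obs[0], 0.0) for s in range(n_states)]
    # Propagate through the remaining symbols.
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states))
            * emit[s].get(symbol, 0.0)
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state signature: state 0 emits 0x90, then state 1
# emits 0xCC or 0x41 with equal probability and loops on itself.
start = [1.0, 0.0]
trans = [[0.0, 1.0], [0.0, 1.0]]
emit = [{0x90: 1.0}, {0xCC: 0.5, 0x41: 0.5}]

matching = forward_likelihood(bytes([0x90, 0xCC, 0x41]), start, trans, emit)
non_matching = forward_likelihood(bytes([0x41, 0x90]), start, trans, emit)
```

A detector built this way would flag a payload when its likelihood under the signature model exceeds some threshold; how that threshold is chosen is outside what the abstract states.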