Malware authors apply obfuscation techniques to existing malware in order to evade detection and hide its purpose. As a result, malicious programs have grown in both volume and sophistication, so effective categorization of malware based on its characteristics and behavior is required. In this paper, malicious software is visualized as grayscale images, because this representation captures minor changes while retaining the global structure, which helps to detect variations. Motivated by the visual similarity between malware samples of the same family, we propose a file-agnostic deep learning approach for malware categorization that efficiently groups malicious software into families based on a set of discriminant patterns extracted from their visualization as images. The suitability of our approach is evaluated against two benchmarks: the MalImg dataset and the BigData Innovators Gathering. Experimental comparison demonstrates its superior performance with respect to state-of-the-art techniques.
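The visualization step described above can be illustrated with a minimal sketch. It assumes the common convention of reading a binary as a stream of 8-bit values and laying them out row by row as pixel intensities. The function name `bytes_to_grayscale`, the default width, and the zero-padding of the last row are illustrative assumptions, not details taken from the paper.

```python
import math


def bytes_to_grayscale(data: bytes, width: int = 16) -> list[list[int]]:
    """Lay out raw bytes as rows of `width` pixels with intensities 0-255.

    Hypothetical helper: the row width and the zero-padding of the final
    row are assumptions made for this sketch.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Pad with zero bytes so the last row is complete.
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


def changed_pixels(a: list[list[int]], b: list[list[int]]) -> int:
    """Count positions where two equally sized images differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


if __name__ == "__main__":
    original = bytes(range(64))
    variant = bytearray(original)
    variant[10] = 255  # a one-byte modification of the sample
    img_a = bytes_to_grayscale(original, width=8)
    img_b = bytes_to_grayscale(bytes(variant), width=8)
    print(len(img_a), "rows;", changed_pixels(img_a, img_b), "pixel differs")
```

In this layout a one-byte change alters a single pixel and leaves the rest of the image untouched, which is the property the abstract relies on: local edits remain visible while the global structure is preserved.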
This study aims, first, to determine whether hotel categories worldwide can be inferred from features that are not taken into account by the institutions in charge of assigning such categories and, if so, to create a model that classifies the properties offered by P2P accommodation platforms in a way similar to the grading schemes used for hotels, thus helping to prevent opportunistic behaviour arising from information asymmetry and to reduce information overload. The characteristics of 33,000 hotels around the world and 18,000,000 reviews from Booking.com were collected automatically and, using the Support Vector Machine classification technique, we trained a model to assign a category to a given hotel. The results suggest that a hotel's category can usually be inferred from criteria (number of reviews, price, score, and users' wish lists) that are unrelated to the official criteria. Moreover, room price is the most important feature for predicting the hotel category, followed by cleanliness and location.
Twitter has become a widely used social network for discussing ideas in many domains. This has led to a growing interest in understanding which opinions are mostly accepted or rejected by social network users in different domains. At the same time, identifying the topics that produce the most controversial discussions among users can be a good way to discover divisive topics, which can be useful, e.g., for policy makers. With the aim of automatically discovering such information from Twitter discussions, we present an analysis system based on Valued Abstract Argumentation to model and reason about the accepted and rejected opinions. We consider different schemes for weighting the opinions of Twitter users, so that we can tune the relevance of opinions according to different information sources from the social network. To make the system fully automatic, we also design a relation labeling system for discovering the relation between opinions. Regarding the underlying acceptability semantics, we use ideal semantics to compute accepted and rejected opinions. We define two measures over sets of accepted and rejected opinions to quantify the most controversial discussions. To validate our system, we analyze different real Twitter discussions from the political domain. The results show that different weighting schemes produce different sets of socially accepted opinions and that the controversy measures can reveal significant differences between discussions.
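The reasoning step described above can be sketched on a toy discussion. The sketch assumes that an attack from opinion `a` on opinion `b` succeeds (becomes a defeat) only when the weight of `a` is at least the weight of `b`. That rule, the brute-force enumeration, and all names below are illustrative assumptions, not the system described in the abstract. The ideal extension is computed from its standard definition: the largest admissible set contained in every preferred extension.

```python
from itertools import combinations


def defeats(attacks, weight):
    """Keep only attacks whose source weighs at least as much as its target
    (assumed defeat rule for this sketch)."""
    return {(a, b) for (a, b) in attacks if weight[a] >= weight[b]}


def is_admissible(s, args, defeat):
    """A set is admissible if it is conflict-free and defends all its members."""
    if any((a, b) in defeat for a in s for b in s):
        return False
    for b in s:
        for a in args:
            # Every defeater of b must itself be defeated by some member of s.
            if (a, b) in defeat and not any((c, a) in defeat for c in s):
                return False
    return True


def ideal_extension(args, attacks, weight):
    """Brute-force ideal extension; only suitable for small examples."""
    args = sorted(args)
    defeat = defeats(attacks, weight)
    admissible = [
        frozenset(c)
        for r in range(len(args) + 1)
        for c in combinations(args, r)
        if is_admissible(set(c), args, defeat)
    ]
    # Preferred extensions are the inclusion-maximal admissible sets.
    preferred = [s for s in admissible if not any(s < t for t in admissible)]
    core = frozenset.intersection(*preferred)
    # The ideal extension is the largest admissible subset of that intersection.
    return max((s for s in admissible if s <= core), key=len)


if __name__ == "__main__":
    tweets = {"t1", "t2", "t3"}
    replies = {("t2", "t1"), ("t3", "t2")}  # t2 disagrees with t1, t3 with t2
    uniform = {"t1": 1, "t2": 1, "t3": 1}
    skewed = {"t1": 1, "t2": 5, "t3": 1}  # e.g. t2 has many more likes
    print(sorted(ideal_extension(tweets, replies, uniform)))
    print(sorted(ideal_extension(tweets, replies, skewed)))
```

With uniform weights the accepted opinions are `t1` and `t3`; once `t2` outweighs its critic, the accepted set becomes `t2` and `t3`. This mirrors the reported observation that different weighting schemes yield different sets of accepted opinions.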