Android application (app) stores contain a huge number of apps, which are manually classified into categories based on their descriptions. However, the predefined categories and app descriptions often do not accurately reflect the real functionalities of apps, which leads to misclassification and may cause serious security and reliability problems in the app store. Automatic app classification is therefore important for constructing a secure, reliable, integrated, and easy-to-navigate app store. In this paper, we propose an effective method called AndroClass that automatically classifies apps based on their real functionalities by using rich and comprehensive features that represent what the apps actually do. AndroClass performs three steps: feature extraction, feature refinement, and classification. In the feature extraction step, we extract 14 different features for each app by utilizing a unified tool suite. In the feature refinement step, we apply the Random Forest algorithm to refine the features. In the classification step, we combine the refined features into a single one, and AndroClass is equipped with K-Nearest Neighbor, Naive Bayes, Support Vector Machine, and Deep Neural Network classifiers to classify apps. In contrast to existing methods, all the features utilized in AndroClass are stable and clearly represent the actual functionalities of the app, AndroClass does not pose any risk to user privacy, and our method can be applied to classify unreleased or newly released apps. The results of extensive experiments with two real-world datasets and a dataset constructed by human experts demonstrate the effectiveness of AndroClass; its classification accuracy on the latter dataset is 83.5%.
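The classification step described above (combining several features into a single vector, then applying a classifier such as K-Nearest Neighbor) can be illustrated with a minimal sketch. This is not the AndroClass implementation: the feature names `permissions` and `api_calls`, the toy binary vectors, the category labels, and the helper names `combine_features` and `knn_classify` are all assumptions made for illustration, and the Random Forest refinement step is omitted.

```python
import math
from collections import Counter


def combine_features(feature_groups):
    """Concatenate per-feature vectors into one vector.

    Feature names are visited in sorted order so that every app
    produces a vector with the same layout.
    """
    combined = []
    for name in sorted(feature_groups):
        combined.extend(feature_groups[name])
    return combined


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(train, query, k=3):
    """Return the majority label among the k training vectors nearest to query.

    train is a list of (vector, label) pairs.
    """
    neighbors = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two illustrative feature groups per app.
apps = [
    ({"permissions": [0, 1], "api_calls": [1, 0, 0]}, "Game"),
    ({"permissions": [0, 1], "api_calls": [1, 0, 1]}, "Game"),
    ({"permissions": [1, 0], "api_calls": [0, 1, 1]}, "Communication"),
    ({"permissions": [1, 0], "api_calls": [0, 1, 0]}, "Communication"),
    ({"permissions": [1, 1], "api_calls": [0, 1, 1]}, "Communication"),
]
train = [(combine_features(features), label) for features, label in apps]

# An unlabeled app described with the same (hypothetical) feature groups.
query = combine_features({"permissions": [1, 0], "api_calls": [0, 1, 1]})
predicted = knn_classify(train, query, k=3)
print(predicted)
```

In this sketch the combined vector is a plain concatenation; any weighting or selection of features, such as the refinement the abstract attributes to Random Forest, would happen before this step.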
Measuring similarity between a suspicious app and the existing ones in the app
store is one of the existing mechanisms to keep app stores healthy and sustainable
against pirated apps, repackaged apps, and malware. To improve the efficiency of
similarity computation between apps, it is essential to reduce the number of apps to be
compared with the suspicious one; app classification is one of the methods to achieve
this by detecting a possible category for the suspicious app and measuring the similarity
between the suspicious app and only the apps in that category. In this paper, we
propose a technique for app classification by applying a support vector machine (SVM). We
extract API calls as a feature for classification from the apps' bytecode, consider two
candidate features, classes and methods related to API calls, and carefully analyze which
of them is more beneficial for app classification. In order to evaluate our technique, we
conduct extensive experiments with a real-world dataset.
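The motivation above, comparing a suspicious app only against apps in its predicted category, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: the category is taken as already predicted (for example by the SVM classifier), Jaccard similarity over sets of API method names stands in for whatever similarity measure an app store actually uses, and the store contents and the helper names `jaccard` and `rank_candidates` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(suspicious_calls, predicted_category, store):
    """Score the suspicious app only against store apps in the predicted category.

    store maps an app name to (category, set of API method names).
    Returns (name, similarity) pairs, most similar first.
    """
    ranked = [
        (name, jaccard(suspicious_calls, calls))
        for name, (category, calls) in store.items()
        if category == predicted_category
    ]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical store: API method names are used only as illustrative features.
store = {
    "maps_a": ("Navigation", {"LocationManager.getLastKnownLocation",
                              "Geocoder.getFromLocation"}),
    "maps_b": ("Navigation", {"LocationManager.getLastKnownLocation",
                              "SensorManager.getDefaultSensor"}),
    "sms_a": ("Communication", {"SmsManager.sendTextMessage",
                                "TelephonyManager.getNetworkOperator"}),
}

suspicious = {"LocationManager.getLastKnownLocation",
              "Geocoder.getFromLocation",
              "SmsManager.sendTextMessage"}

# Assumed output of the classifier for the suspicious app.
predicted_category = "Navigation"

ranked = rank_candidates(suspicious, predicted_category, store)
for name, score in ranked:
    print(name, round(score, 3))
```

Here only two of the three store apps are compared; in a real store the saving would scale with the number of apps outside the predicted category, at the cost of missing matches when the predicted category is wrong.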