Jun Zhang scite author profile

Machine learning based solutions have been successfully employed for automatic detection of malware in Android applications. However, as is known, machine learning models lack robustness to adversarial examples, which are crafted by adding minor, yet carefully chosen, perturbations to the normal inputs. So far, adversarial examples can only deceive Android malware detectors that rely on syntactic features (e.g., requested permissions, specific API calls), and the perturbations can only be implemented by simply modifying the Android manifest. Because recent Android malware detectors rely more on semantic features from Dalvik bytecode than on the manifest, existing attacking/defending methods are no longer effective, owing to the difficulty of adding perturbations to Dalvik bytecode without affecting its original functionality. In this paper, we introduce a new highly effective attack that generates adversarial examples of Android malware and evades detection by the current models. To this end, we propose a method of applying optimal perturbations to an Android APK using a substitute model (i.e., a Deep Neural Network). Based on the transferability concept, the perturbations that successfully deceive the substitute model are likely to deceive the original models as well (e.g., Support Vector Machine in Drebin or Random Forest in MaMaDroid). We develop an automated tool to generate the adversarial examples without human intervention to apply the attacks. In contrast to existing works, the adversarial examples crafted by our method can also deceive recent machine learning based detectors that rely on semantic features such as the control-flow graph. The perturbations can also be implemented directly on the APK's Dalvik bytecode rather than the Android manifest to evade recent detectors. We evaluated the proposed manipulation methods for adversarial examples by using the same datasets that Drebin and MaMaDroid (5879 malware examples) used. Our results show that the malware detection rates decreased from 96% to 1% in MaMaDroid, and from 97% to 1% in Drebin, with just a small distortion generated by our adversarial examples manipulation method.


Cross-Project Transfer Representation Learning for Vulnerable Function Discovery

Lin

Zhang

Luo

et al. 2018

IEEE Trans. Ind. Inf.

155

120


Network Traffic Classification Using Correlation Information

Zhang

Xiang

Wang

et al. 2013

IEEE Trans. Parallel Distrib. Syst.

288

114


Traffic classification has wide applications in network management, from security monitoring to quality of service measurements. Recent research tends to apply machine learning techniques to flow statistical feature based classification methods. The nearest neighbor (NN)-based method has exhibited superior classification performance. It also has several important advantages, such as requiring no training procedure, carrying no risk of overfitting parameters, and naturally being able to handle a huge number of classes. However, the performance of the NN classifier can be severely affected if the size of the training data is small. In this paper, we propose a novel nonparametric approach for traffic classification, which can improve the classification performance effectively by incorporating correlated information into the classification process. We analyze the new classification approach and its performance benefit from both theoretical and empirical perspectives. A large number of experiments are carried out on two real-world traffic data sets to validate the proposed approach. The results show that the traffic classification performance can be improved significantly even under the extremely difficult circumstance of very few training samples.
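
The idea of combining nearest-neighbor classification with correlation among flows can be illustrated with a minimal sketch. The abstract does not say how correlated flows are identified or how their information is combined, so the grouping of flows and the majority vote below are illustrative assumptions, not the authors' method. The feature vectors, labels, group ids, and the helper names `nearest_label` and `classify_groups` are all hypothetical.

```python
from collections import Counter
import math


def nearest_label(sample, training):
    """Return the label of the training point closest to `sample` (1-NN)."""
    return min(training, key=lambda item: math.dist(sample, item[0]))[1]


def classify_groups(groups, training):
    """Assign one label per group of correlated flows.

    Assumption: flows in the same group are believed to come from the same
    application, so the per-flow 1-NN labels are combined by majority vote.
    """
    result = {}
    for group_id, flows in groups.items():
        votes = Counter(nearest_label(flow, training) for flow in flows)
        result[group_id] = votes.most_common(1)[0][0]
    return result


# Toy training set with very few samples: (feature vector, label).
# The two features are hypothetical flow statistics.
training = [((100.0, 0.1), "web"), ((1500.0, 0.9), "video")]

# Hypothetical groups of correlated flows to classify.
groups = {
    "g1": [(120.0, 0.2), (90.0, 0.1), (1400.0, 0.8)],
    "g2": [(1450.0, 0.95), (1600.0, 0.85)],
}

predictions = classify_groups(groups, training)
```

In this toy input, one flow in `g1` would be labelled differently from its neighbors when classified alone, and the group vote overrides it. That is the intuition for why correlated information may help when training samples are scarce.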


Software Vulnerability Detection Using Deep Neural Networks: A Survey

et al. 2020


Software Vulnerability Discovery via Learning Multi-Domain Knowledge Bases

Lin

Zhang

Luo

et al. 2021

IEEE Trans. Dependable and Secure Comput.


6 million spam tweets: A large ground truth for timely Twitter spam detection

Chen

Zhang

Chen

et al. 2015


Twitter has changed the way people communicate and get news in their daily lives in recent years. Meanwhile, due to its popularity, Twitter has also become a main target for spamming activities. In order to stop spammers, Twitter is using Google SafeBrowsing to detect and block spam links. Although blacklists can block malicious URLs embedded in tweets, their lag time hinders the ability to protect users in real time. Thus, researchers have begun to apply different machine learning algorithms to detect Twitter spam. However, there is no comprehensive evaluation of each algorithm's performance for real-time Twitter spam detection due to the lack of a large ground truth. To carry out a thorough evaluation, we collected a large dataset of over 600 million public tweets. We further labelled around 6.5 million spam tweets and extracted 12 lightweight features, which can be used for online detection. In addition, we have conducted a number of experiments on six machine learning algorithms under various conditions to better understand their effectiveness and weaknesses for timely Twitter spam detection. We will make our labelled dataset available to researchers who are interested in validating or extending our work.


Secure buyer–seller watermarking protocol

Zhang

Kou

Fan

2006

IEE Proc. Inf. Secur.


In the existing watermarking protocols, a trusted third party (TTP) is introduced to guarantee that a protocol is fair to both the seller and buyer in a digital content transaction. However, the TTP decreases the security and affects the protocol implementation. To address this issue, in this article a secure buyer-seller watermarking protocol without the assistance of a TTP is proposed in which there are only two participants, a seller and a buyer. Based on the idea of sharing a secret, a watermark embedded in digital content to trace piracy is composed of two pieces of secret information, one produced by the seller and one by the buyer. Since neither knows the exact watermark, the buyer cannot remove the watermark from watermarked digital content, and at the same time the seller cannot fabricate piracy to frame an innocent buyer. In other words, the proposed protocol can trace piracy and protect the customer's rights. In addition, because no third party is introduced into the proposed protocol, the problem of a seller (or a buyer) colluding with a third party to cheat the buyer (or the seller), namely, the conspiracy problem, can be avoided.



Jun Zhang

Robust Network Traffic Classification

Android HIV: A Study of Repackaging Malware for Evading Machine-Learning Detection
