The distributed denial of service (DDoS) attack is one of the most severe threats to the current Internet and causes huge losses to society. Defending against DDoS is challenging because DDoS traffic can appear similar to legitimate traffic. Router throttling is an accessible approach to defending against DDoS attacks. Some existing router throttling methods dynamically adjust a given threshold value to keep the server load safe. However, these methods are not ideal because they exploit only information from the current time step, so their perception of time-series variations is poor. The DDoS problem can be cast as a Markov decision process (MDP). The multi-agent router throttling (MART) method, based on a hierarchical communication mechanism, has been proposed to address this problem. However, its agents are independent of one another and do not communicate, so it is hard for them to collaborate to learn an ideal policy for defending against DDoS. To solve this multi-agent partially observable MDP problem, we propose a centralized reinforcement learning router throttling method based on a centralized communication mechanism: each router sends its own traffic reading to a central router, which then decides the throttling rate for each router. We also make the simulated DDoS environment more realistic and modify the MART reward function to make it more coherent. To decrease communication costs, we add a deep deterministic policy gradient network to each router that decides whether or not to send information to the central agent. The experiments validate that our proposed smart router throttling method outperforms existing methods for DDoS intrusion response.

INDEX TERMS Distributed denial of service, router throttling, Markov decision process, multi-agent router throttling, hierarchical communication, centralized communication, communication costs.
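The centralized mechanism this abstract describes can be sketched as follows: each edge router reports its traffic reading to a central agent, which assigns per-router throttling rates so that the aggregate load forwarded to the server stays within a safe capacity. The simple proportional policy and all names below are illustrative assumptions, not the paper's learned RL policy.

```python
def central_throttle(readings, capacity):
    """Return per-router keep-rates in [0, 1] given traffic readings.

    Stand-in for the central agent's decision: if the aggregate load
    exceeds the server capacity, scale every router's traffic uniformly
    so the total forwarded load exactly matches the capacity.
    """
    total = sum(readings)
    if total <= capacity:
        return [1.0] * len(readings)  # no throttling needed
    rate = capacity / total           # proportional scaling factor
    return [rate] * len(readings)


# Hypothetical traffic readings (e.g. Mbps) reported by three edge routers.
readings = [30.0, 50.0, 20.0]
rates = central_throttle(readings, capacity=80.0)
forwarded = sum(r * k for r, k in zip(readings, rates))
```

A learned policy would replace the uniform scaling with per-router rates chosen by the central network, but the interface (readings in, rates out) is the same.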
The explosive growth of malware variants poses a continuously and deeply evolving challenge to information security. Traditional malware detection methods require a great deal of manpower. Machine learning has come to play an important role in malware classification and detection, but it is easily spoofed by malware that disguises itself as benign software through self-protection techniques, which leads to poor performance for existing machine-learning-based techniques. In this paper, we analyze the local maliciousness of malware and implement an anti-interference detection framework based on API fragments, which uses an LSTM model to classify API fragments and employs ensemble learning to determine the final result for the entire API sequence. We present experimental results on the Ali-Tianchi contest API datasets. Comparison with common methods shows that our method based on local maliciousness performs better, achieving a higher accuracy of 0.9734.
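The "local maliciousness" idea above can be made concrete with a minimal sketch: split an API call sequence into fixed-length fragments, score each fragment with a per-fragment classifier (an LSTM in the paper, mocked here with a keyword rule), and ensemble the fragment scores into a verdict for the whole sequence. The fragment size, the mock scorer, its suspicious-API list, and the voting rule are all illustrative assumptions.

```python
def fragments(seq, size=3):
    """Split an API call sequence into consecutive fragments of `size` calls."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]

def mock_fragment_score(frag):
    # Stand-in for the LSTM: fraction of suspicious API calls in the fragment.
    suspicious = {"WriteProcessMemory", "CreateRemoteThread"}
    return sum(api in suspicious for api in frag) / len(frag)

def classify_sequence(seq, threshold=0.5):
    # Ensemble rule: flag the sequence if ANY fragment looks malicious,
    # reflecting the premise that maliciousness is local to a few fragments.
    scores = [mock_fragment_score(f) for f in fragments(seq)]
    return max(scores) >= threshold, scores


# Hypothetical API trace: benign I/O followed by a code-injection pattern.
seq = ["ReadFile", "WriteFile", "Sleep",
       "WriteProcessMemory", "CreateRemoteThread", "ResumeThread"]
is_malware, scores = classify_sequence(seq)
```

The point of the max-over-fragments ensemble is anti-interference: padding the sequence with benign calls lowers the average score but cannot dilute the score of the malicious fragment itself.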
Attacker identification from network traffic is a common practice in cyberspace security management. However, network administrators cannot cover all security equipment due to cyberspace management cost constraints, giving attackers the chance to escape the surveillance of network security administrators through legitimate actions and to perform attacks in both the physical and digital domains. We therefore propose a hidden attack sequence detection method based on reinforcement learning to deal with this challenge, modeling the network administrator as an intelligent agent that learns its action policy from interaction with the cyberspace environment. Using Deep Deterministic Policy Gradient (DDPG), the agent can not only discover hidden attackers lurking in legitimate action sequences but also reduce cyberspace management cost. Furthermore, we propose a dynamic reward DDPG method to improve defense performance: the reward depends on the hidden attack sequence's steps and the agent's check steps, in contrast to the fixed reward in common methods. The method is verified in a simulated cyberspace environment. The experimental results demonstrate that hidden attack sequences exist in cyberspace and that the proposed method can discover them. The dynamic reward DDPG shows superior performance in detecting hidden attackers, with a detection rate of 97.46%, improving the ability to discover hidden attackers and reducing cyberspace management cost by 6% compared to DDPG.
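The dynamic-reward idea above can be sketched as a reward function: instead of a fixed bonus for catching a hidden attacker, scale the bonus by how early in the attack sequence the agent's check succeeds, and charge a small per-check cost to model cyberspace management cost. The exact shaping below is an illustrative assumption, not the paper's formula.

```python
def dynamic_reward(attack_len, caught_at, checks_used,
                   base=10.0, check_cost=0.5):
    """Reward for detecting an attacker at step `caught_at` of an
    `attack_len`-step hidden sequence, after `checks_used` checks.
    `caught_at=None` means the attack completed undetected."""
    if caught_at is None:
        return -base - check_cost * checks_used
    earliness = (attack_len - caught_at) / attack_len  # 1.0 = caught at step 0
    # Earlier detection earns a larger bonus; each check costs a little,
    # pushing the policy toward both early detection and fewer checks.
    return base * (1.0 + earliness) - check_cost * checks_used


r_early = dynamic_reward(attack_len=10, caught_at=2, checks_used=3)
r_missed = dynamic_reward(attack_len=10, caught_at=None, checks_used=3)
```

Under a fixed reward, catching an attacker at step 2 and at step 9 would be worth the same; this shaping makes the gradient favor earlier interception while the check cost term models the management budget.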
A DDoS attack is one of the most serious threats to the current Internet. Router throttling is a popular method of responding to DDoS attacks. Coordinated team learning (CTL) adopts tile coding for continuous state representation and policy learning; it suits this distributed challenge but lacks robustness. Our first contribution is to adopt a deep network as the function approximator for continuous state representation, since deep reinforcement learning has proven robust across many different Atari games with little modification of the learning architecture. Furthermore, current multiagent router throttling methods consider only traffic-reading information, so in a homogeneous team scenario all agents can share the parameters of a single deep network; in heterogeneous team scenarios, however, if all agents still share one deep network, the learned policy may not be sufficiently good. Our second contribution is to add team structure information so that all agents can still share one deep network. Deep reinforcement learning is, however, a considerably time-consuming task. Transfer learning is an appropriate remedy, because a policy learned in a simple scenario can be transferred to different and even more complex scenarios. For transfer learning on the DDoS control problem, we propose a progressive transfer learning approach, which is our third contribution; it lets us learn a better policy with less time consumption and apply our method in more complex environments. The experimental results validate that our three contributions achieve better performance than existing methods.

INDEX TERMS Distributed denial of service, router throttling, deep network, team structure information, multiagent reinforcement learning, progressive transfer learning.
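The abstract contrasts tile coding (used by CTL) with a deep network for continuous state representation. A minimal one-dimensional tile coder is sketched below to make that contrast concrete: several overlapping, offset tilings turn a continuous traffic reading into a small set of active tile indices (a sparse binary feature vector). The tiling counts and offsets are illustrative choices, not CTL's actual configuration.

```python
def tile_code(x, lo, hi, n_tilings=4, n_tiles=8):
    """Map a continuous value x in [lo, hi) to one active tile per tiling."""
    width = (hi - lo) / n_tiles
    active = []
    for t in range(n_tilings):
        offset = t * width / n_tilings       # shift each tiling slightly
        idx = int((x - lo + offset) / width)
        idx = min(idx, n_tiles - 1)          # clamp values near the upper edge
        active.append(t * n_tiles + idx)     # flat index across all tilings
    return active


# A traffic reading of 0.0 activates the first tile of each of the 4 tilings.
features = tile_code(0.0, lo=0.0, hi=8.0)
```

A deep network replaces this fixed, hand-tuned discretization with learned features, which is the robustness argument the first contribution makes.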