Moving Target Defense (MTD) is an emerging proactive defense technology that reduces the risk of vulnerabilities being exploited by attackers. As a crucial component of MTD, route mutation (RM) faces two fundamental problems when defending against sophisticated Distributed Denial of Service (DDoS) attacks: 1) it cannot select optimal mutations because it learns too little about attack behavior; 2) it cannot adapt its mutation parameters to time-varying network conditions. In this paper, we propose a context-aware Q-learning algorithm for RM (CQ-RM) that learns attack strategies to optimize the selection of mutated routes. We first integrate four representative attack strategies into a unified mathematical model and formalize multiple network constraints. Then, taking these constraints into consideration, we model the RM process as a Markov decision process (MDP). To find the optimal policy of the MDP, we develop a context estimation mechanism and propose the CQ-RM scheme, which adaptively adjusts the learning rate and the mutation period. The convergence of CQ-RM to the optimum is proved theoretically. Finally, extensive experimental results highlight the effectiveness of our method compared with representative solutions.
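As a rough illustration of the kind of update that Q-learning over an MDP relies on, the sketch below shows a generic tabular Q-learning step with a learning rate that changes over time. It is not the CQ-RM algorithm: the state and action encoding (both index candidate routes), the reward value, and the visit-count schedule `visit_based_alpha` are hypothetical stand-ins chosen for this example, and the paper's context estimation mechanism and mutation-period adaptation are not modeled.

```python
from typing import List

# q[state][action]: state = currently active route, action = next route.
# This encoding is an assumption made for the example only.
QTable = List[List[float]]


def visit_based_alpha(visit_count: int) -> float:
    """Hypothetical stand-in schedule: the learning rate shrinks as 1 / visits."""
    return 1.0 / max(1, visit_count)


def q_update(q: QTable, state: int, action: int, reward: float,
             next_state: int, alpha: float, gamma: float = 0.9) -> float:
    """Standard tabular Q-learning step.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def greedy_route(q: QTable, state: int) -> int:
    """Return the index of the route with the highest Q-value in this state."""
    row = q[state]
    return max(range(len(row)), key=row.__getitem__)


if __name__ == "__main__":
    # Two candidate routes; the reward 1.0 is an assumed payoff for a mutation
    # that avoided the attacked links.
    q = [[0.0, 0.0], [0.0, 2.0]]
    q_update(q, state=0, action=1, reward=1.0, next_state=1,
             alpha=visit_based_alpha(2))
    print(q[0], greedy_route(q, 0))
```

In a scheme such as the one described above, the fixed schedule would presumably be replaced by a learning rate driven by the estimated network context; the sketch only shows where such a rate enters the update.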