2018 IEEE Third International Conference on Data Science in Cyberspace (DSC)
DOI: 10.1109/dsc.2018.00126
Adversarial Examples Construction Towards White-Box Q Table Variation in DQN Pathfinding Training

Cited by 23 publications (14 citation statements). References 4 publications.
“…Other research using DQN for optimal control has also shown successful results in simulation. For instance, [31] used DQN to find the optimal path of a robotic agent in a simple 2D environment with a limited number of states and no uncertainties (a 15 × 15 grid). DQN was also used for path planning of a ground robot in seekavoid_arena_01, a virtual environment on the DeepMind Lab platform containing some visual obstacles [32].…”
Section: Discussionmentioning
confidence: 99%
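The tabular pathfinding setting described above (a small grid with a limited number of states and no uncertainties) can be sketched with plain Q-learning. This is an illustrative sketch only, not the setup of [31]: the start/goal placement, rewards, and hyperparameters below are assumptions, and a DQN would replace the explicit Q-table with a neural network approximating it.

```python
import random

SIZE = 15
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GOAL = (SIZE - 1, SIZE - 1)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

# Q-table: one row of action-values per grid cell.
Q = {(r, c): [0.0] * len(ACTIONS) for r in range(SIZE) for c in range(SIZE)}

def step(state, a):
    """Deterministic grid transition; moves off the grid are clamped."""
    dr, dc = ACTIONS[a]
    nxt = (min(max(state[0] + dr, 0), SIZE - 1),
           min(max(state[1] + dc, 0), SIZE - 1))
    reward = 1.0 if nxt == GOAL else -0.01  # small step cost, goal bonus
    return nxt, reward, nxt == GOAL

def train(episodes=2000):
    for _ in range(episodes):
        s, done = (0, 0), False
        while not done:
            # epsilon-greedy action selection
            if random.random() < EPSILON:
                a = random.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: Q[s][i])
            nxt, r, done = step(s, a)
            # Standard Q-learning update.
            Q[s][a] += ALPHA * (r + GAMMA * max(Q[nxt]) - Q[s][a])
            s = nxt

def greedy_path(limit=4 * SIZE * SIZE):
    """Follow the greedy policy from the start; stop at the goal or limit."""
    s, path = (0, 0), [(0, 0)]
    for _ in range(limit):
        if s == GOAL:
            break
        a = max(range(len(ACTIONS)), key=lambda i: Q[s][i])
        s, _, _ = step(s, a)
        path.append(s)
    return path

random.seed(0)
train()
path = greedy_path()
print(path[-1] == GOAL)
```

After training, the greedy policy reads the optimal route directly out of the Q-table; it is exactly this table (or the network approximating it) that a white-box adversary can inspect and perturb.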
“…Other studies of adversarial attacks on the specific application of DRL for path-finding have also been conducted by (Xiang et al. 2018) and (Bai et al. 2018), with the result that the RL agent either fails to find a path to the goal or plans a more costly path.…”
Section: Adversarial Attacks On Rl Agentmentioning
confidence: 99%
“…Based on the SPA algorithm introduced above, Bai et al. (2018) first used DQN to find the optimal path and analyzed the rules of DQN pathfinding. They then proposed a method that can effectively find vulnerable points towards White-Box Q-table variation in DQN pathfinding training.…”
Section: White-box Based Adversarial Attack On Dqn (Wba)mentioning
confidence: 99%
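As a rough illustration of what "vulnerable points" in a Q-table might look like, the sketch below is not the WBA algorithm of Bai et al. (2018); the margin-based scoring rule and the `perturb` helper are hypothetical. It scores each state by the gap between its top two action values: where that gap is tiny, a minimal white-box edit flips the greedy action and diverts the learned path.

```python
def action_margin(q_row):
    """Gap between the best and second-best action value in one state."""
    top_two = sorted(q_row, reverse=True)[:2]
    return top_two[0] - top_two[1]

def most_vulnerable_states(Q, k=3):
    """States where the smallest perturbation flips the greedy action."""
    return sorted(Q, key=lambda s: action_margin(Q[s]))[:k]

def perturb(Q, state, eps=1e-3):
    """Flip the greedy action at `state` by boosting the runner-up."""
    row = Q[state][:]
    best = max(range(len(row)), key=lambda i: row[i])
    runner = max((i for i in range(len(row)) if i != best),
                 key=lambda i: row[i])
    row[runner] = row[best] + eps
    Q[state] = row

# Toy Q-table: three states, three actions each.
toy = {'a': [1.0, 0.9, 0.0], 'b': [2.0, 0.1, 0.0], 'c': [0.5, 0.49, 0.0]}
print(most_vulnerable_states(toy, k=2))  # 'c' has the thinnest margin
perturb(toy, 'c')
print(max(range(3), key=lambda i: toy['c'][i]))  # greedy action flipped
```

Full access to the trained Q-table is what makes this a white-box setting: the adversary can rank every state by fragility instead of probing the policy from outside.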
“…FGSM (Goodfellow et al. 2014a), SPA (Xiang et al. 2018), WBA (Bai et al. 2018), and CDG (Chen et al. 2018b) are White-box attacks, which have access to the details of the training algorithm and the corresponding parameters of the target model. Meanwhile, PIA (Behzadan and Munir 2017), STA (Lin et al. 2017), EA (Lin et al. 2017), and AVI (Liu et al. 2017) are Black-box attacks, in which the adversary has no knowledge of the training algorithm or the corresponding parameters of the model. For the threat model discussed in these works, the authors assumed that the adversary has access to the training environment but does not know the random initializations of the target policy, and additionally does not know what the learning algorithm is.…”
Section: Summary For Adversarial Attack In Reinforcement Learningmentioning
confidence: 99%