Evaluating the worst-case performance of a reinforcement learning (RL) agent under the strongest/optimal adversarial perturbations on state observations (within some constraints) is crucial for understanding the robustness of RL agents. However, finding the optimal adversary is challenging, both in whether the optimal attack can be found and in how efficiently it can be found. Existing works on adversarial RL either use heuristics-based methods that may not find the strongest adversary, or directly train an RL-based adversary by treating the agent as a part of the environment, which can find the optimal adversary but may become intractable in a large state space. In this paper, we propose a novel attacking algorithm which has an RL-based "director" searching for the optimal policy perturbation, and an "actor" crafting state perturbations following the directions from the director (i.e., the actor executes targeted attacks). Our proposed algorithm, PA-AD, is theoretically optimal against an RL agent and significantly improves the efficiency compared with prior RL-based works in environments with large or pixel state spaces. Empirical results show that our proposed PA-AD universally outperforms state-of-the-art attacking methods in a wide range of environments. Our method can be easily applied to any RL algorithm to evaluate and improve its robustness.
Preprint. Under review.
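The director-actor decomposition described above can be illustrated with a minimal sketch. This is not the PA-AD implementation: the victim policy is a toy linear-softmax model, the "director" is a fixed heuristic standing in for the learned RL director, and the "actor" runs projected gradient steps (with numerical gradients) inside an L-infinity ball; all names and interfaces here are hypothetical.

```python
import numpy as np

def victim_policy(state, W):
    """Hypothetical linear-softmax victim policy: W @ state -> action probs."""
    logits = W @ state
    e = np.exp(logits - logits.max())
    return e / e.sum()

def director(state):
    """Stand-in for the RL-based director: proposes a target action
    distribution (a policy perturbation) for the victim. In PA-AD this
    is learned; here it is a fixed heuristic for illustration."""
    return np.array([0.0, 0.0, 1.0])  # push the victim toward action 2

def actor(state, W, target, eps=0.5, steps=50, lr=0.1):
    """Actor: craft a bounded state perturbation (L-inf ball of radius eps)
    that moves the victim's action distribution toward the director's
    target, via projected sign-gradient steps on a cross-entropy
    objective (gradients estimated numerically for simplicity)."""
    delta = np.zeros_like(state)
    for _ in range(steps):
        grad = np.zeros_like(state)
        # cross-entropy between the director's target and the victim's policy
        f = lambda s: -np.sum(target * np.log(victim_policy(s, W) + 1e-12))
        for i in range(len(state)):
            d = np.zeros_like(state)
            d[i] = 1e-4
            grad[i] = (f(state + delta + d) - f(state + delta - d)) / 2e-4
        delta = np.clip(delta - lr * np.sign(grad), -eps, eps)  # project to budget
    return state + delta
```

A targeted attack toward the director's proposed distribution is generally much cheaper than searching the raw state space directly, which is the efficiency gain the abstract refers to.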
In many reinforcement learning (RL) applications, the observation space is specified by human developers and restricted by physical realizations, and may thus be subject to dramatic changes over time (e.g. an increased number of observable features). However, when the observation space changes, the previous policy will likely fail due to the mismatch of input features, and another policy must be trained from scratch, which is inefficient in terms of computation and sample complexity. Following theoretical insights, we propose a novel algorithm which extracts the latent-space dynamics in the source task, and transfers the dynamics model to the target task to use as a model-based regularizer. Our algorithm works for drastic changes of observation space (e.g. from vector-based observation to image-based observation), without any inter-task mapping or any prior knowledge of the target task. Empirical results show that our algorithm significantly improves the efficiency and stability of learning in the target task.
* The work was done while the author was an intern at Unity Technologies.
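The idea of a transferred latent-dynamics model acting as a regularizer can be sketched as follows. This is a simplified illustration under assumed interfaces, not the paper's algorithm: `enc` is the target-task encoder being trained, `dyn` is the frozen latent dynamics model taken from the source task, and the penalty is a plain mean-squared consistency term.

```python
import numpy as np

def model_based_regularizer(enc, dyn, obs, act, next_obs):
    """Penalize the target-task encoder for disagreeing with the frozen
    latent dynamics transferred from the source task.
    enc: target-task encoder, obs -> latent (being trained)
    dyn: frozen latent dynamics from the source task, (z, a) -> z'"""
    z, z_next = enc(obs), enc(next_obs)
    z_pred = dyn(z, act)                      # where the source dynamics say we should land
    return float(np.mean((z_pred - z_next) ** 2))
```

In a toy setting where the target observations are a linear re-rendering of the same latent state, an encoder that inverts the rendering drives this penalty to zero, while a misaligned encoder is penalized, which is the sense in which the source dynamics guide representation learning in the target task.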
Crystalline lithium fluoride (LiF) has been intensively pursued as a potential alternative solid electrolyte (SE) owing to its excellent chemical and electrochemical oxidation stability and good deformability. However, due to its low ionic conductivity, LiF remains challenging for practical SE applications. Herein, a Li-Zr-F composite-based SE prepared by liquid-mediated synthesis is proposed and studied. Methanol (CH<sub>3</sub>OH) was mainly evaluated as a liquid-mediated precursor for synthesizing Li-Zr-F composites at LiF:ZrF<sub>4</sub> stoichiometric proportions of 2:1 and 2:0.8, with a subsequent annealing process at 25°C/150°C, 50°C/150°C, and 70°C/150°C, respectively. X-ray diffraction results revealed that the Li-Zr-F composites crystallized in three main phase formations: Li<sub>2</sub>ZrF<sub>6</sub> ( ), Li<sub>2</sub>ZrF<sub>6</sub> ( ), and Li<sub>4</sub>ZrF<sub>8</sub> ( ) octahedron structures. In addition, the effect of the cation stacking sublattice synthesized via the methanol mediator on the ionic conduction of the Li-Zr-F composites was investigated using electrochemical impedance spectroscopy (EIS). Through Zr<sup>4+</sup> substitution, the Li<sub>2</sub>ZrF<sub>6</sub> ( )-based SE exhibited the highest ionic conductivity, which increased to 2.40 × 10<sup>-8</sup> S/cm and 3.89 × 10<sup>-8</sup> S/cm at the LiF:ZrF<sub>4</sub> stoichiometric proportion of 2:0.8 and a drying temperature of 50°C/150°C, respectively. An activation energy of 0.21 eV was achieved for a battery with the Li<sub>2</sub>ZrF<sub>6</sub> ( )-based SE, whereas LiF exhibited up to 0.78 eV, leading to a low kinetic rate for ion diffusion. These results implied that the Li<sub>2</sub>ZrF<sub>6</sub> ( )-based SE was successfully synthesized under the optimal condition of CH<sub>3</sub>OH at 50°C/150°C, which could improve the ionic conductivity of LiF.
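The gap between the two reported activation energies (0.21 eV for the Li<sub>2</sub>ZrF<sub>6</sub>-based SE versus 0.78 eV for LiF) can be put in perspective with a quick Arrhenius estimate. This is a back-of-the-envelope sketch assuming simple Arrhenius behavior with equal pre-exponential factors, which real electrolytes need not satisfy.

```python
import math

K_B_EV = 8.617e-5  # Boltzmann constant in eV/K

def arrhenius_rate_ratio(ea_low, ea_high, temp_k=298.15):
    """Ratio of thermally activated hopping rates exp(-Ea / kT) for two
    activation energies, assuming equal prefactors (a simplification)."""
    return math.exp((ea_high - ea_low) / (K_B_EV * temp_k))

# At room temperature, a 0.21 eV barrier vs LiF's 0.78 eV barrier:
ratio = arrhenius_rate_ratio(0.21, 0.78)
```

Under these assumptions the lower barrier corresponds to a hopping rate several billion times faster at room temperature, consistent with the abstract's point that the high barrier of pristine LiF implies a low kinetic rate for ion diffusion.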
Lower alcohols (C1−C7) are closely related to daily life, and some of them are harmful to human health. For example, the methanol in liquor is harmful to...
Communication is important in many multi-agent reinforcement learning (MARL) problems for agents to share information and make good decisions. However, when deploying trained communicative agents in a real-world application where noise and potential attackers exist, the safety of communication-based policies becomes a severe issue that is underexplored. Specifically, if communication messages are manipulated by malicious attackers, agents relying on untrustworthy communication may take unsafe actions that lead to catastrophic consequences. Therefore, it is crucial to ensure that agents will not be misled by corrupted communication, while still benefiting from benign communication. In this work, we consider an environment with N agents, where the attacker may arbitrarily change the communication from any C < (N−1)/2 agents to a victim agent. For this strong threat model, we propose a certifiable defense by constructing a message-ensemble policy that aggregates multiple randomly ablated message sets. Theoretical analysis shows that this message-ensemble policy can utilize benign communication while being certifiably robust to adversarial communication, regardless of the attacking algorithm. Experiments in multiple environments verify that our defense significantly improves the robustness of trained policies against various types of attacks.
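The message-ensemble construction can be sketched in a few lines. This is an illustrative Monte Carlo version, not the certified procedure from the paper (which reasons over ablated subsets exhaustively to obtain guarantees); `base_policy` and its interface are assumed for the example.

```python
import random
from collections import Counter

def ensemble_action(base_policy, state, messages, k, n_samples=200, seed=0):
    """Message-ensemble policy (sketch): run the base policy on many
    randomly ablated message subsets of size k, then majority-vote the
    resulting discrete actions. When fewer than half of the messages are
    corrupted, most subsets are benign-dominated and outvote the attack.
    base_policy(state, msgs) -> discrete action; hypothetical interface."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_samples):
        subset = sorted(rng.sample(range(len(messages)), k))
        votes[base_policy(state, [messages[i] for i in subset])] += 1
    return votes.most_common(1)[0][0]
```

For instance, with seven messages of which two are adversarial, a subset of size three contains an adversarial majority only when both corrupted messages are drawn together, which happens in a small fraction of subsets, so the vote recovers the benign action.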
Recent studies reveal that a well-trained deep reinforcement learning (RL) policy can be particularly vulnerable to adversarial perturbations on input observations. Therefore, it is crucial to train RL agents that are robust against any attacks with a bounded budget. Existing robust training methods in deep RL either treat correlated steps separately, ignoring the robustness of long-term rewards, or train the agents and RL-based attacker together, doubling the computational burden and sample complexity of the training process. In this work, we propose a strong and efficient robust training framework for RL, named Worst-case-aware Robust RL (WocaR-RL), that directly estimates and optimizes the worst-case reward of a policy under bounded ℓp attacks without requiring extra samples for learning an attacker. Experiments on multiple environments show that WocaR-RL achieves state-of-the-art performance under various strong attacks, and obtains significantly higher training efficiency than prior state-of-the-art robust training methods. The code of this work is available at https://github.com/umd-huang-lab/WocaR-RL.
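The core idea of directly optimizing an estimated worst-case value, rather than training a separate attacker, can be sketched as follows. This is a simplified stand-in for WocaR-RL's worst-attack bound, not the actual method: `perturbable[a]` is an assumed precomputed set of actions the attacker could force instead of action `a` under its budget, and the loss weighting `kappa` is illustrative.

```python
def worst_case_value(q_values, policy_probs, perturbable):
    """Estimate the worst-case expected Q-value when a bounded attacker
    can shift the agent from its chosen action a onto any action in
    perturbable[a]. The attacker picks the worst reachable action."""
    worst = 0.0
    for a, p in enumerate(policy_probs):
        worst += p * min(q_values[b] for b in perturbable[a])
    return worst

def wocar_style_loss(td_loss, q_values, policy_probs, perturbable, kappa=0.5):
    """Combine the ordinary TD loss with a worst-case value bonus, so
    training trades off clean performance against the estimated
    worst-case return without sampling an attacker (sketch)."""
    return td_loss - kappa * worst_case_value(q_values, policy_probs, perturbable)
```

Because the worst-case term is computed from the agent's own value estimates, no extra environment samples are needed for an attacker, which is the efficiency advantage the abstract highlights over alternating attacker-agent training.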