Bohao Qu scite author profile

In wireless sensor networks, node failures, energy depletion, and other factors lead to holes, which can cause network failure. To make the network more efficient, this paper proposes a repair method based on a hybrid network model: activate a number of inactive nodes and call in mobile nodes to patch the hole. Two strategies are proposed: (1) Wake up inactive nodes to reduce the hole area, using a greedy algorithm based on convex hull area reduction. (2) Call in mobile nodes to fill the gaps in the hole, with each mobile node covering as many intersection arcs of the hole as possible. Together these form the Hybrid Sensor Network Hole Repair Algorithm (HSNHRA). Simulation results and a comparative analysis show the effectiveness of the proposed scheme: holes are completely repaired, and both coverage and node utilization improve.
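The wake-up strategy can be illustrated with a minimal sketch. It does not reproduce the paper's convex hull area reduction algorithm, whose details the abstract does not give. It swaps in a plain greedy maximum-coverage heuristic over sampled hole points, and the names `covered` and `greedy_wake_up`, the disc sensing model, and the sample coordinates are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]


def covered(points: List[Point], center: Point, radius: float) -> Set[int]:
    """Indices of sampled hole points within sensing radius of a node (assumed disc model)."""
    return {i for i, p in enumerate(points) if math.dist(p, center) <= radius}


def greedy_wake_up(
    hole_points: List[Point], dormant: Dict[str, Point], radius: float
) -> Tuple[List[str], Set[int]]:
    """Wake the dormant node that covers the most uncovered points, until no node helps.

    Illustrative stand-in for the paper's strategy (1), not its actual algorithm.
    """
    uncovered = set(range(len(hole_points)))
    cover = {name: covered(hole_points, pos, radius) for name, pos in dormant.items()}
    woken: List[str] = []
    while uncovered:
        best = max(cover, key=lambda n: len(cover[n] & uncovered), default=None)
        if best is None or not (cover[best] & uncovered):
            break  # no remaining dormant node reduces the hole
        woken.append(best)
        uncovered -= cover[best]
        del cover[best]
    return woken, uncovered


# Hypothetical data: five sampled points inside a hole and three dormant nodes.
hole_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
dormant = {"a": (0.5, 0.0), "b": (2.5, 0.0), "c": (1.5, 0.0)}
woken, uncovered = greedy_wake_up(hole_points, dormant, radius=1.0)
```

In this toy setup, the point left uncovered after the wake-up phase is the kind of gap that strategy (2) would hand to a mobile node.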


Policy Dispersion in Non-Markovian Environment

Qu, Cao, Yang et al. 2023. Preprint


The Markov Decision Process (MDP) provides a mathematical framework for formulating the learning processes of agents in reinforcement learning. The MDP is limited by the Markovian assumption that a reward depends only on the immediate state and action. However, a reward sometimes depends on the history of states and actions, in which case the decision process takes place in a non-Markovian environment. In such environments, agents receive rewards sparsely, through temporally extended behaviors, and the learned policies may be similar. As a result, agents with similar policies tend to overfit the given task and cannot adapt quickly to perturbations of the environment. To resolve this problem, this paper learns diverse policies from the history of state-action pairs in a non-Markovian environment, using a policy dispersion scheme designed to seek diverse policy representations. Specifically, we first adopt a transformer-based method to learn policy embeddings. Then, we stack the policy embeddings to construct a dispersion matrix that induces a set of diverse policies. Finally, we prove that if the dispersion matrix is positive definite, the dispersed embeddings effectively enlarge the disagreements across policies, yielding a diverse expression of the original policy embedding distribution. Experimental results show that this dispersion scheme obtains more expressive, diverse policies, which in turn deliver more robust performance than recent learning baselines across various learning environments.
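The difference between a Markovian and a non-Markovian reward can be made concrete with a small sketch. The states, actions, and function names below are hypothetical and do not come from the paper. They only illustrate a reward that depends on the current state-action pair versus one that depends on the history of state-action pairs.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (state, action)


def markovian_reward(state: str, action: str) -> float:
    """Reward depends only on the immediate state and action."""
    return 1.0 if (state == "goal" and action == "stay") else 0.0


def non_markovian_reward(history: List[Step]) -> float:
    """Reward depends on the history: it is paid at the goal only if a key was picked up earlier."""
    if not history:
        return 0.0
    state, action = history[-1]
    picked_key = any(s == "key_room" and a == "pick" for s, a in history[:-1])
    return 1.0 if (state == "goal" and action == "stay" and picked_key) else 0.0


# Two hypothetical trajectories that end in the same state-action pair.
with_key = [("start", "move"), ("key_room", "pick"), ("goal", "stay")]
without_key = [("start", "move"), ("hall", "move"), ("goal", "stay")]

r_with = non_markovian_reward(with_key)
r_without = non_markovian_reward(without_key)
r_markov = markovian_reward(*with_key[-1])
```

Both trajectories end with the same final step, so the Markovian reward cannot tell them apart, while the history-dependent reward is sparse and pays only for the temporally extended behavior of fetching the key first.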


A Multi-Agent Deep Reinforcement Learning Method for Fully Noisy Observations

Wang, Zhang, Qu et al. 2023. Preprint



Bohao Qu

Multi-agent hierarchical policy gradient for Air Combat Tactics emergence via self-play

Beyond-Visual-Range Air Combat Tactics Auto-Generation by Reinforcement Learning

Hole Repair Algorithm in Hybrid Sensor Networks
