Monte Carlo Tree Search (MCTS) has demonstrated excellent performance on many planning problems. However, in many practical applications, especially in adversarial environments, the state space and branching factor are huge and the planning horizon is long. In flat, non-hierarchical MCTS, it is computationally expensive to cover a sufficient number of rewarded states that lie far from the root. Flat MCTS is therefore inefficient for planning problems with a long planning horizon, a huge state space, and a large branching factor. In this work, we propose a novel hierarchical MCTS-based online planning method, named HMCTS-OP, to tackle this issue. HMCTS-OP integrates MAXQ-based task hierarchies and hierarchical MCTS algorithms into the online planning framework. Specifically, the MAXQ-based task hierarchies reduce the search space and guide the search process, which significantly reduces the computational complexity. This reduction in turn enables MCTS to search deeper and find a better action within a limited time. We evaluate HMCTS-OP on online planning in an asymmetric adversarial environment. The experimental results show that HMCTS-OP outperforms other online planning methods in this domain.
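To make the cost of flat MCTS concrete, the sketch below shows the UCB1 rule that many MCTS implementations use to choose which child of a node to explore next. The abstract does not say which selection rule HMCTS-OP uses, so this is an illustrative assumption and not the authors' method; the function name `ucb1_select` and the `(visit_count, total_reward)` layout are hypothetical.

```python
import math


def ucb1_select(children, exploration=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 score.

    children: list of (visit_count, total_reward) pairs.
    This data layout is a hypothetical choice made for illustration.
    """
    total_visits = sum(n for n, _ in children)
    best_index, best_score = -1, float("-inf")
    for i, (n, w) in enumerate(children):
        if n == 0:
            # Visit every child once before comparing scores.
            return i
        # Mean reward plus an exploration bonus that shrinks with visits.
        score = w / n + exploration * math.sqrt(math.log(total_visits) / n)
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

Because this rule is applied at every node along a simulated path, a large branching factor multiplies the number of children that must be visited before distant rewarded states are reached, which is the inefficiency the abstract attributes to flat search.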
Hierarchical skill learning is an important research direction in human intelligence. However, many real-world problems have sparse rewards and long time horizons, which pose challenges for hierarchical skill learning and lead to poor performance from naive exploration. In this work, we propose an algorithmic framework called surprise-based hierarchical exploration for model and skill learning (Surprise-HEL). The framework leverages surprise-based intrinsic motivation to improve sampling efficiency and drive exploration. It also combines surprise-based intrinsic motivation with hierarchical exploration to speed up model learning and skill learning. Moreover, the framework incorporates reward-independent incremental learning rules and alternates between model learning and policy updates to handle the changing intrinsic rewards and the changing models. Together, these components enable the framework to learn models and hierarchical skills incrementally and developmentally. We tested Surprise-HEL on a common benchmark domain: Household Robot Pickup and Place. The evaluation results show that the Surprise-HEL framework can significantly improve the agent’s efficiency in model and skill learning in a typical complex domain.
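One common way to formalize surprise is as the negative log-likelihood of an observed transition under the agent's learned model. The abstract does not define the surprise measure used in Surprise-HEL, so the sketch below is an assumption for illustration only; the `CountModel` class, its Laplace smoothing, and the tabular state representation are all hypothetical.

```python
import math
from collections import defaultdict


class CountModel:
    """Tabular transition model with Laplace smoothing (illustrative only)."""

    def __init__(self, num_states, smoothing=1.0):
        self.num_states = num_states
        self.smoothing = smoothing
        # counts[(state, action)][next_state] -> number of observations
        self.counts = defaultdict(lambda: defaultdict(int))

    def prob(self, state, action, next_state):
        """Smoothed estimate of P(next_state | state, action)."""
        row = self.counts[(state, action)]
        total = sum(row.values())
        seen = row.get(next_state, 0)
        return (seen + self.smoothing) / (total + self.smoothing * self.num_states)

    def surprise(self, state, action, next_state):
        """Intrinsic reward: negative log-likelihood of the transition."""
        return -math.log(self.prob(state, action, next_state))

    def update(self, state, action, next_state):
        """Record one observed transition."""
        self.counts[(state, action)][next_state] += 1
```

Under this formulation the intrinsic reward for a transition falls as the model sees it more often, so the reward signal changes during learning, which is the kind of non-stationarity the abstract says the framework must handle.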
Multiplex networks have attracted increasing attention because they can model the coupling of network nodes between layers more accurately. Because of the interaction of nodes between layers, the effect of an attack on a multiplex network is not simply a linear superposition of its effects on the single-layer networks, and the disintegration of multiplex networks has become both a research hotspot and a difficult problem. Traditional multiplex network disintegration methods generally adopt approximate and heuristic strategies. However, both kinds of method have a number of drawbacks and fail to meet requirements for effectiveness and timeliness. In this paper, we develop a novel deep learning framework, called MINER (Multiplex network disintegration strategy Inference based on deep NEtwork Representation learning), which transforms disintegration strategy inference for multiplex networks into an encoding and decoding process based on deep network representation learning. In the encoding process, an attention mechanism encodes the coupling relationship of corresponding nodes between layers; in the decoding process, reinforcement learning is adopted to evaluate the disintegration action. Experiments indicate that the trained MINER model can be directly transferred and applied to the disintegration of multiplex networks of different scales. We also extend it to scenarios with node attack cost constraints, where it likewise achieves excellent performance. This framework provides a new way to understand and employ multiplex networks.