2022
DOI: 10.1007/s10915-022-01876-x
A Sojourn-Based Approach to Semi-Markov Reinforcement Learning

Abstract: In this paper we introduce a new approach to discrete-time semi-Markov decision processes based on the sojourn time process. Different characterizations of discrete-time semi-Markov processes are exploited, and decision processes are constructed from them. With this new approach, the agent is allowed to choose different actions depending also on the sojourn time of the process in the current state. A numerical method based on Q-learning algorithms for finite-horizon reinforcement learning and stochastic …
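The idea described in the abstract — letting the agent's policy depend on how long the process has already stayed in its current state — can be illustrated with a small sketch. This is not the authors' implementation; it is a minimal tabular Q-learning example on a toy two-state chain in which the Q-table is keyed by the pair (state, sojourn time), and all dynamics, rewards, and hyperparameters are invented for illustration.

```python
import random
from collections import defaultdict

# Toy sketch (assumed dynamics, not from the paper): tabular Q-learning where
# the agent's effective state is (state, sojourn time), i.e. the state is
# augmented with the number of steps already spent in it.

STATES = [0, 1]
ACTIONS = ["stay", "move"]
HORIZON = 20
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

def step(state, sojourn, action):
    """Toy semi-Markov dynamics: the chance of leaving grows with sojourn time."""
    leave_prob = min(1.0, 0.2 + 0.1 * sojourn) if action == "move" else 0.05
    if random.random() < leave_prob:
        next_state, next_sojourn = 1 - state, 0  # jump: sojourn clock resets
    else:
        next_state, next_sojourn = state, sojourn + 1
    reward = 1.0 if next_state == 1 else 0.0
    return next_state, next_sojourn, reward

Q = defaultdict(float)  # keyed by (state, sojourn, action)

def greedy(state, sojourn):
    return max(ACTIONS, key=lambda a: Q[(state, sojourn, a)])

for episode in range(2000):
    s, u = 0, 0  # state and sojourn time
    for t in range(HORIZON):
        a = random.choice(ACTIONS) if random.random() < EPS else greedy(s, u)
        s2, u2, r = step(s, u, a)
        best_next = max(Q[(s2, u2, b)] for b in ACTIONS)
        Q[(s, u, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, u, a)])
        s, u = s2, u2

# Because Q is indexed by sojourn time, the learned greedy action in the same
# state may differ depending on how long the process has stayed there.
print(greedy(0, 0), greedy(0, 5))
```

The key point is only that the Q-table's index set includes the sojourn time, so the policy is free to vary with it, which is the behaviour the abstract attributes to the sojourn-based construction.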

Cited by 3 publications (1 citation statement)
References 29 publications (71 reference statements)
“…The idea of this method is to decompose a whole task into multi-level subtasks by introducing mechanisms such as state space decomposition (Takahashi, 2001), state abstraction (Abel, 2019), and temporal abstraction (Bacon and Precup, 2018), so that each subtask can be solved in a small-scale state space, thus speeding up the solution of the whole task. To model these abstraction mechanisms, researchers introduced the semi-Markov decision process (SMDP) (Ascione and Cuomo, 2022) model to handle actions that span multiple time steps. The state space decomposition approach decomposes the state space into different subsets.…”
Section: Related Work
confidence: 99%
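The citing passage invokes the standard SMDP treatment of temporally extended actions ("options"): when such an action runs for tau primitive steps and accumulates a discounted reward along the way, the one-step Q-learning update generalizes to Q(s, o) ← Q(s, o) + α(R + γ^τ max_o' Q(s', o') − Q(s, o)). A minimal sketch of that update, with all names and values hypothetical:

```python
# Sketch of the SMDP Q-learning update for a temporally extended action
# (all identifiers and numbers here are illustrative, not from the paper):
#   Q(s, o) <- Q(s, o) + alpha * (R + gamma**tau * max_o' Q(s', o') - Q(s, o))
# where tau is the option's duration and R its discounted accumulated reward.

GAMMA, ALPHA = 0.95, 0.1

def smdp_q_update(Q, s, o, rewards, s_next, options):
    """Apply one SMDP Q-update after option o ran for len(rewards) steps."""
    tau = len(rewards)
    R = sum(GAMMA**k * r for k, r in enumerate(rewards))  # discounted option return
    best_next = max(Q.get((s_next, o2), 0.0) for o2 in options)
    key = (s, o)
    Q[key] = Q.get(key, 0.0) + ALPHA * (R + GAMMA**tau * best_next - Q.get(key, 0.0))
    return Q

Q = {}
Q = smdp_q_update(Q, s="A", o="go_to_door", rewards=[0.0, 0.0, 1.0],
                  s_next="B", options=["go_to_door", "wait"])
print(Q[("A", "go_to_door")])
```

The discount γ^τ on the bootstrap term is what distinguishes the SMDP update from the ordinary MDP one: the longer the option runs, the more its successor value is discounted.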