Deep Q-Learning Based Optimization of VLC Systems With Dynamic Time-Division Multiplexing

Siddiqi, Umair F.; Sait, Sadiq M.; Uysal, Murat

doi:10.1109/access.2020.3005885

Cited by 6 publications

(7 citation statements)

References 20 publications


“…The above optimization problem is a non-convex optimization problem for N > 2 [14]. A non-convex optimization problem has many local minima and it is hard to determine the globally optimal solution [14], [24], [2]. Different methods such as evolutionary algorithm (EA) [25], [26], [27], [28], monotonic optimization [2], DRL [2], etc., can be applied to solve them.…”

Section: System Model and Optimization Problem (mentioning)

confidence: 99%

“…A state-representation should capture the attributes of the system that are relevant to the decision-making [32]. Before discussing the state-representation, we would like to define two functions f_L(∆_i) and f_U(∆_i), as follows: … We can determine the data-rates of users (R_i) using (2). Please note that the users are sorted in the ascending order of the square of their channel gains, i.e., |h…”

Section: State-representation (mentioning)

confidence: 99%

“…Q-learning is a popular temporal difference (TD)-based model-free method of RL [2]. It uses a table to store the Q-values of all possible pairs of states and actions.…”

Section: Introduction (mentioning)

confidence: 99%
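
The table-based scheme described in the statement above can be sketched as follows. This is only a minimal illustration of generic tabular Q-learning, not the method of the cited or citing paper. The toy chain environment, the function name q_learning, and all parameter values (alpha, gamma, epsilon, the episode count) are assumptions chosen for the example.

```python
import random

def q_learning(n_states=5, n_actions=2, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain.

    Action 1 moves right, action 0 moves left. Reaching the last
    state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    # The Q-table holds one value per (state, action) pair.
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap on steps per episode
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                best = max(q[s])
                a = rng.choice([i for i, v in enumerate(q[s]) if v == best])
            # Assumed toy dynamics: deterministic move along the chain.
            s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            done = s_next == n_states - 1
            r = 1.0 if done else 0.0
            # Temporal-difference update toward the bootstrapped target.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                break
    return q

q_table = q_learning()
for state, values in enumerate(q_table):
    print(state, [round(v, 3) for v in values])
```

The table grows with the product of the number of states and actions, which is the usual motivation for replacing it with a neural network approximator in deep Q-learning.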

“…In the recent past, the deep reinforcement learning (DRL) approach has successfully been employed to solve optimization problems in many fields of engineering, such as non-convex and non-deterministic polynomial-time (NP)-hard optimization problems in wireless communications [5], [6], [7], [8], [2]. The non-orthogonal multiple access (NOMA) technique is an innovative multiple-access method proposed for 5G and beyond networks [9], [10], [11], [12].…”

Section: Introduction (mentioning)

confidence: 99%


Deep Reinforcement Based Power Allocation for the Max-Min Optimization in Non-Orthogonal Multiple Access

2020

Self Cite


“…With the development of artificial intelligence, algorithms like deep Q-network have been widely used for decision making in various practical problems [18][19][20]. The use of reinforcement learning in path planning is increasing and has provided different goal-oriented path planning for various types of vehicles due to its strong performance and high applicability in decision making of path selection.…”

Section: Introduction (mentioning)

confidence: 99%

Refined Path Planning for Emergency Rescue Vehicles on Congested Urban Arterial Roads via Reinforcement Learning Approach

Yan, Wang, Yang, et al. 2021

Journal of Advanced Transportation


Fast road emergency response can minimize the losses caused by traffic accidents. However, emergency rescue on urban arterial roads faces a high probability of accident-induced congestion, which complicates the planning of the rescue path. This paper proposes a refined path planning method for emergency rescue vehicles on congested urban arterial roads during traffic accidents. First, a rescue path planning environment for emergency vehicles on congested urban arterial roads is established based on the Markov decision process; it focuses on the architecture of arterial roads and takes traffic efficiency and vehicle queue length into consideration in path planning. Then, the prioritized experience replay deep Q-network (PERDQN) reinforcement learning algorithm is used for path planning under different traffic control schemes. The proposed method is tested on a section of East Youyi Road in Xi’an, Shaanxi Province, China. The results show that, compared with the traditional shortest path method, the rescue route planned by PERDQN reduces the arrival time to the accident site by 67.1% and shortens the queue length upstream of the accident point by 16.3%. This shows that the proposed method is capable of planning the rescue path for emergency vehicles on congested urban arterial roads, shortening the arrival time, and reducing the vehicle queue length caused by accidents.


Reinforcement learning-based resource allocation for dynamic aggregated WiFi/VLC HetNet

Luo, Bai, Zhang, et al. 2024

Optics Communications
