Proceedings of the International Conference on Neural Computation Theory and Applications 2014
DOI: 10.5220/0005082902340243

Q-Routing in Cognitive Packet Network Routing Protocol for MANETs

Abstract: Mobile Ad hoc Networks (MANETs) are self-organized networks characterized by topologies that change dynamically in time and space. This creates an unstable environment in which classical routing approaches cannot achieve high performance, so adaptive routing is necessary to handle the randomly changing network topology. This research uses a Reinforcement Learning approach with Q-Routing to introduce our MANET routing algorithm, Stability-Aware Cognitive Packet Network (CPN). This new algorithm extends the work on CPN to…

Cited by 4 publications (4 citation statements; citing papers published in 2020 and 2023) · References 11 publications (11 reference statements)
“…We can then re-estimate the values of $Q^{\pi}(s_t, a_t)$ through either (13) or (14), and continue to improve the policy using (15). When the process converges, we reach an optimal policy $\pi^{*}$ (and its corresponding value function $Q^{\pi^{*}}(s_t, a_t)$).…”
Section: Conventional Q-learning
Confidence: 99%
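
Equations (13)–(15) are internal to the citing paper and are not reproduced in this excerpt. As a hedged sketch only, under the conventional Q-learning formulation these three steps typically take the following standard textbook forms (assumed, not quoted from the citing paper):

    % Standard forms assumed here; the citing paper's actual equations
    % (13)-(15) are not visible in the excerpt and may differ in notation.
    % (13) Monte-Carlo evaluation: average of sampled discounted returns
    Q^{\pi}(s_t, a_t) \approx \frac{1}{N} \sum_{i=1}^{N} G_t^{(i)},
    \qquad G_t = \sum_{k \ge 0} \gamma^{k} r_{t+k}
    % (14) Temporal-difference evaluation: bootstrap from the current estimate
    Q(s_t, a_t) \leftarrow Q(s_t, a_t)
      + \alpha \left( r_t + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right)
    % (15) Policy improvement: act greedily with respect to Q
    \pi(s) \leftarrow \arg\max_{a} Q(s, a)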
“…When estimating $Q$ (which is commonly referred to as policy evaluation), there are two common approaches, each with its own benefits and shortcomings: 1) Monte-Carlo estimation, corresponding to (13), which provides an unbiased estimate but has high variance; 2) Temporal-Difference estimation, corresponding to (14), which has low variance but is biased due to its self-referential (bootstrapping) nature.…”
Section: Conventional Q-learning
Confidence: 99%
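
To illustrate this bias–variance trade-off concretely, the following Python sketch (a hypothetical toy example, not code from any of the cited works; it uses state values rather than Q-values for brevity) estimates the value of the start state of a two-step chain MDP with both estimators:

    import random

    # Toy two-step chain: s0 -> s1 -> terminal; the final transition pays a
    # noisy reward so that the estimators have variance to contend with.
    # All names and parameters here are illustrative assumptions.
    GAMMA = 0.9   # discount factor
    ALPHA = 0.1   # TD learning rate

    def rollout():
        """One episode from s0: per-step rewards [r(s0->s1), r(s1->term)]."""
        return [0.0, random.gauss(1.0, 0.5)]

    # Monte-Carlo estimation (the role played by "(13)" above): average the
    # full discounted returns. Unbiased, but inherits the return's variance.
    mc_estimate, n = 0.0, 0
    for _ in range(2000):
        rewards = rollout()
        G = sum(GAMMA**k * r for k, r in enumerate(rewards))  # return
        n += 1
        mc_estimate += (G - mc_estimate) / n  # incremental running mean

    # Temporal-difference estimation (the role played by "(14)"): bootstrap
    # from the next state's current estimate. Low variance per update, but
    # biased while the bootstrapped estimate is still wrong.
    V = {0: 0.0, 1: 0.0}  # value estimates for s0 and s1 (terminal is 0)
    for _ in range(2000):
        r0, r1 = rollout()
        V[1] += ALPHA * (r1 + GAMMA * 0.0 - V[1])   # s1 -> terminal
        V[0] += ALPHA * (r0 + GAMMA * V[1] - V[0])  # s0 bootstraps on V[1]

    print(f"MC estimate of V(s0): {mc_estimate:.3f}")
    print(f"TD estimate of V(s0): {V[0]:.3f}  (true value = {GAMMA:.3f})")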
“…Thus, routing can be readily modeled as a Markov decision process [7] and naturally fits into the realm of reinforcement learning. In this direction, many previous works [8,9,10,11,12,13,14,15,16] have employed the classical Q-learning [17] algorithm to train agents to find the optimal route. In these works, a distinct agent is associated with each transmission node.…”
Section: Introduction
Confidence: 99%
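
A minimal sketch of this per-node scheme, in the spirit of classical Q-routing (Boyan and Littman's formulation, where each node keeps estimated delivery times to each destination via each neighbor). The topology, node names, and delay parameters below are illustrative assumptions, not taken from the cited works:

    from collections import defaultdict

    # Hypothetical four-node network; each node learns Q[(dest, next_hop)]
    # = estimated delivery time to dest when forwarding via next_hop.
    NEIGHBORS = {
        "A": ["B", "C"],
        "B": ["A", "D"],
        "C": ["A", "D"],
        "D": ["B", "C"],
    }
    ALPHA = 0.5  # learning rate

    Q = {x: defaultdict(float) for x in NEIGHBORS}  # one table per agent

    def best_next_hop(x, dest):
        """Greedy routing decision at node x: lowest estimated time."""
        return min(NEIGHBORS[x], key=lambda y: Q[x][(dest, y)])

    def q_routing_update(x, y, dest, queue_delay, link_delay):
        """After x forwards a packet for dest to neighbor y, y reports its
        own best remaining-time estimate; x nudges its estimate toward the
        sum of the local delays and that report."""
        remaining = (0.0 if y == dest
                     else min(Q[y][(dest, z)] for z in NEIGHBORS[y]))
        target = queue_delay + link_delay + remaining
        Q[x][(dest, y)] += ALPHA * (target - Q[x][(dest, y)])

    # Example: node A forwards a packet destined for D via its best hop.
    hop = best_next_hop("A", "D")
    q_routing_update("A", hop, "D", queue_delay=0.3, link_delay=1.0)
    print(f"A routes to D via {hop}; Q = {Q['A'][('D', hop)]:.2f}")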