2018
DOI: 10.1109/tac.2017.2747409

Stochastic Online Shortest Path Routing: The Value of Feedback

Cited by 48 publications (51 citation statements)
References 22 publications
“…This method uses one path for exploitation and possibly another path for exploration. Talebi et al. [17] proposed to model the path planning process as a Markov decision process (MDP), which coincides with our model in this paper, and proposed the KL-Hop-by-Hop Routing (KL-HHR) algorithm. However, their method has to be combined with a line search and the Bellman-Ford algorithm to choose the next node on a path.…”
Section: A. Related Work
Mentioning, confidence: 62%
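The excerpt above describes choosing the next hop by running Bellman-Ford over learned edge-delay estimates. Below is a minimal illustrative sketch of that idea, assuming Bernoulli-type per-edge delays and a KL-UCB-style lower confidence bound computed by bisection; the graph representation, the counters, and every function name here are assumptions for illustration, not the KL-HHR algorithm of [17].

```python
# Sketch: optimistic per-edge delay estimates feed a Bellman-Ford pass
# that returns the next hop toward the destination. Illustrative only.
import math

def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def optimistic_delay(mean, count, t):
    """Smallest q <= mean with count*KL(mean, q) <= log(t): an optimistic
    (lower-confidence) estimate of an edge's mean delay."""
    if count == 0:
        return 0.0                      # unexplored edges look maximally good
    rhs = math.log(max(t, 2)) / count
    lo, hi = 0.0, mean
    for _ in range(50):                 # bisection on the convex KL constraint
        mid = (lo + hi) / 2
        if bernoulli_kl(mean, mid) > rhs:
            lo = mid
        else:
            hi = mid
    return hi

def next_hop(edges, stats, src, dst, t):
    """Bellman-Ford on optimistic edge delays; returns the first hop of the
    optimistic shortest path from src to dst.

    edges: list of (u, v); stats: {(u, v): (empirical_mean_delay, count)}.
    """
    nodes = {u for u, v in edges} | {v for u, v in edges}
    dist = {n: math.inf for n in nodes}
    succ = {n: None for n in nodes}
    dist[dst] = 0.0
    for _ in range(len(nodes) - 1):     # relax all edges toward the destination
        for u, v in edges:
            mean, count = stats[(u, v)]
            w = optimistic_delay(mean, count, t)
            if dist[v] + w < dist[u]:
                dist[u] = dist[v] + w
                succ[u] = v
    return succ[src]
```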
“…In this section, we compare the performance of the proposed algorithms with the KL-Hop-by-Hop Routing (KL-HHR) algorithm [17], the combinatorial upper confidence bound (CUCB) algorithm [33], and the Thompson sampling (TS) algorithm [34] applied to this problem. The details of the KL-HHR algorithm can be found in [17]. The CUCB algorithm chooses a whole path for the nth packet according to the following strategy [33]…”
Section: Experimental Results and Analysis
Mentioning, confidence: 99%
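The exact selection rule from [33] is truncated in the excerpt above, so the following is only a generic CUCB-style sketch: each edge cost receives an optimistic lower-confidence adjustment, and an exact shortest-path oracle (Dijkstra here) returns the whole path for the n-th packet. The bonus term, the data structures, and all names are assumptions, not the strategy quoted in the citing paper.

```python
# Sketch: CUCB-style path selection with adjusted edge costs and a
# shortest-path oracle. Illustrative only.
import heapq
import math

def cucb_path(adj, stats, src, dst, n):
    """Pick a full src->dst path for the n-th packet.

    adj:   {u: [(v, edge_id), ...]} adjacency list
    stats: {edge_id: (empirical_mean_cost, play_count)}
    """
    def adjusted_cost(edge_id):
        mean, count = stats[edge_id]
        if count == 0:
            return 0.0                           # force exploration of unseen edges
        bonus = math.sqrt(3 * math.log(max(n, 2)) / (2 * count))
        return max(mean - bonus, 0.0)            # optimistic (lower) cost estimate

    # Dijkstra on the adjusted costs (valid because they are non-negative).
    dist = {src: 0.0}
    parent = {}
    heap = [(0.0, src)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == dst:
            break
        for v, eid in adj.get(u, []):
            nd = d + adjusted_cost(eid)
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    if dst != src and dst not in parent:
        return None                              # no path found
    # Reconstruct the chosen path.
    path, node = [dst], dst
    while node != src:
        node = parent[node]
        path.append(node)
    return list(reversed(path))
```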
“…Even more broadly, the multi-agent version of classical bandit applications is a natural extension [38]. For example, in the online shortest path routing problem [34,39], another classic example of a bandit application, a multi-agent setting could capture the case in which the underlying network is large and each agent is responsible for routing within a sub-graph of the network. Last, it is also plausible to have asynchronous learning among the different agents, in the sense that each agent has its own action rate for decision making.…”
Section: Motivating Application
Mentioning, confidence: 99%
“…Over the past several years, much research has been done to reduce the negative impact of latency on the transmission and communication activities of the fog-IoT model. Talebi et al. [1] studied how to find the shortest path for online routing in a multi-hop network. Wang et al. [2] proposed a method that finds the optimal service placement of micro clouds, thereby reducing the average cost over time.…”
Section: Related Work
Mentioning, confidence: 99%