2022
DOI: 10.1364/jocn.460629

Techniques for applying reinforcement learning to routing and wavelength assignment problems in optical fiber communication networks

Abstract: We propose a novel application of reinforcement learning (RL) with invalid action masking and a novel training methodology for routing and wavelength assignment (RWA) in fixed-grid optical networks and demonstrate the generalizability of the learned policy to a realistic traffic matrix unseen during training. Through the introduction of invalid action masking and a new training method, the applicability of RL to RWA in fixed-grid networks is extended from considering connection requests between nodes to servic…

Cited by 21 publications (18 citation statements)
References 28 publications
“…Argotech in the Czech Republic will provide photonic packaging and assembly services. Resolute Photonics in Ireland and Senko Advanced Components in the UK will develop the standardised pluggable subsystems integrating these different photonic integrated circuit technologies to allow them to be deployed in a flexible and reconfigurable manner in a hyperscale data centre system, in order to implement the advanced WDM architecture developed by University College London in the UK [4]. Finally, SME E4 in Italy will advise on data centre design, while Huawei in France will provide overall future network requirements.…”
Section: Dynamos Project (mentioning)
confidence: 99%
“…To counter this problem, ML-based RA approaches have been widely studied [65][66][67][68][69][70][71][72][73][74][75][76][77]. In particular, reinforcement learning (RL) is suitable for the RA task because an RL agent tries to maximize the total rewards instead of focusing only on the temporary reward.…”
Section: Resource Allocation Based on ML (mentioning)
confidence: 99%
“…Even though penalties discourage the agent from choosing invalid actions, it is well-known that in DRL sparse (i.e., infrequent), large rewards are detrimental to convergence. As invalid actions are a priori known by the orchestrator, we leverage invalid action masking in policy gradient algorithms, which has been found to yield significantly better performance and sample efficiency than invalid action penalties [25]-[27]. Action masking works by setting the log-probabilities of invalid actions to −∞ before sampling an action, according to a state-dependent action mask.…”
Section: B. Formulating Orchestration in Fog Computing as an MDP (mentioning)
confidence: 99%
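
The masking step described in the last statement can be illustrated with a minimal sketch. It is not the implementation of the cited paper or of the citing work: the function names `masked_log_probs` and `sample_action`, the example logits, and the example mask are all hypothetical, and NumPy is assumed. In an RWA setting the mask might, for instance, flag which path and wavelength combinations are currently free, but that mapping is an assumption made here for illustration only.

```python
import numpy as np


def masked_log_probs(logits, valid_mask):
    """Log-softmax over the logits with invalid actions forced to -inf.

    `logits` and `valid_mask` are hypothetical inputs: raw policy scores
    and a boolean array that is True for every action that is valid in
    the current state.
    """
    logits = np.asarray(logits, dtype=float)
    valid_mask = np.asarray(valid_mask, dtype=bool)
    if not valid_mask.any():
        raise ValueError("at least one action must be valid")

    # Invalid actions get a score of -inf, hence probability exactly 0.
    masked = np.where(valid_mask, logits, -np.inf)

    # Numerically stable log-softmax: shift by the largest valid logit.
    shifted = masked - masked[valid_mask].max()
    log_normaliser = np.log(np.exp(shifted).sum())
    return shifted - log_normaliser


def sample_action(logits, valid_mask, rng):
    """Sample an action index from the masked policy distribution."""
    probs = np.exp(masked_log_probs(logits, valid_mask))
    return int(rng.choice(len(probs), p=probs))


# Illustrative values only: four actions, of which 1 and 3 are invalid.
example_logits = [1.0, 2.0, 3.0, 4.0]
example_mask = [True, False, True, False]
rng = np.random.default_rng(0)
samples = [sample_action(example_logits, example_mask, rng) for _ in range(200)]
```

Because the masked entries carry zero probability, the sampler can never return them, which is what removes the need for a penalty reward on invalid actions.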