2021
DOI: 10.1007/978-3-030-87897-9_21
Applying and Comparing Policy Gradient Methods to Multi-echelon Supply Chains with Uncertain Demands and Lead Times

Cited by 4 publications (2 citation statements) · References 6 publications
“…Currently, most mainstream RL methods are based on this architecture. Alves and Silva [19] used a shared policy in a supply chain collaboration environment to compare the performance of different single-agent RL algorithms, such as Deep Deterministic Policy Gradient (DDPG) [24], Soft Actor-Critic (SAC) [25], and Proximal Policy Optimization (PPO) [26]; their results showed that PPO performed best. In that study, all homogeneous agents used the same policy for inventory management.…”
Section: Inventory Management Methods With Actor-Critic RL
confidence: 99%
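
The quoted statement describes comparing DDPG, SAC, and PPO under a single shared policy in a supply chain setting. As a rough illustration only, not the cited paper's environment or code, the sketch below trains the three stable-baselines3 algorithms on a toy single-echelon inventory environment with Poisson demand; the class name, cost parameters, and horizon are all assumptions.

```python
# Rough illustration only (not the cited paper's code): comparing DDPG, SAC,
# and PPO, each trained as a single shared policy, on a toy single-echelon
# inventory environment with uncertain (Poisson) demand. Class name, costs,
# and horizon are assumptions made for this sketch.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import DDPG, PPO, SAC

class ToyInventoryEnv(gym.Env):
    """Order a quantity each period; pay holding and stock-out costs."""
    def __init__(self, holding_cost=0.5, stockout_cost=5.0, horizon=52):
        super().__init__()
        self.h, self.p, self.horizon = holding_cost, stockout_cost, horizon
        self.action_space = spaces.Box(0.0, 20.0, shape=(1,), dtype=np.float32)
        self.observation_space = spaces.Box(0.0, 200.0, shape=(1,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.stock, self.t = 50.0, 0
        return np.array([self.stock], dtype=np.float32), {}

    def step(self, action):
        demand = float(self.np_random.poisson(10))         # uncertain demand
        level = self.stock + float(action[0]) - demand
        cost = self.h * max(level, 0.0) + self.p * max(-level, 0.0)
        self.stock = min(max(level, 0.0), 200.0)            # lost sales, capped storage
        self.t += 1
        obs = np.array([self.stock], dtype=np.float32)
        return obs, -cost, self.t >= self.horizon, False, {}

# One shared policy per algorithm, trained and kept for later evaluation.
models = {}
for name, algo in {"DDPG": DDPG, "SAC": SAC, "PPO": PPO}.items():
    model = algo("MlpPolicy", ToyInventoryEnv(), verbose=0)
    model.learn(total_timesteps=20_000)
    models[name] = model
```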
“…Because different participants in the supply chain play different roles and require different types and quantities of materials, their inventory management methods and goals also differ. If a single-agent RL algorithm [18] or a homogeneous MARL algorithm [19] is used to address this problem, the model's efficiency will be limited to some extent.…”
Section: Methods Overview
confidence: 99%
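
The statement above argues that agents with different roles and material requirements are poorly served by one homogeneous policy. A minimal sketch of that contrast, reusing the hypothetical ToyInventoryEnv from the previous sketch and assuming role-specific holding and stock-out costs, follows; the role names and cost values are illustrative assumptions, not taken from the cited works.

```python
# Illustrative assumption (not from the cited works): per-role policies as an
# alternative to one homogeneous shared policy, reusing the hypothetical
# ToyInventoryEnv defined in the previous sketch. Role names and the
# holding / stock-out cost pairs are invented for this example.
from stable_baselines3 import PPO

role_costs = {                      # (holding cost, stock-out cost) per role
    "supplier":    (0.2, 2.0),
    "distributor": (0.5, 5.0),
    "retailer":    (1.0, 10.0),
}

# Homogeneous setting: every agent would reuse this single shared policy,
# so role-specific cost structures cannot be specialised for.
shared_policy = PPO("MlpPolicy", ToyInventoryEnv(), verbose=0)
shared_policy.learn(total_timesteps=20_000)

# Heterogeneous setting: one independently trained policy per role, each
# facing its own cost structure.
per_role_policies = {
    role: PPO("MlpPolicy", ToyInventoryEnv(*costs), verbose=0)
    for role, costs in role_costs.items()
}
for policy in per_role_policies.values():
    policy.learn(total_timesteps=20_000)
```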