A Policy Gradient Based Reinforcement Learning Method for Supply Chain Management

Hachaïchi, Yassine; Chemingui, Yassine; Affes, M.

doi:10.1109/ic_aset49463.2020.9318258

Cited by 7 publications

(4 citation statements)

References 10 publications


“…Similarly to BC, DRL has experienced exponential growth in the last decade [20,21], and there is currently strong interest in exploring the use of DRL to improve SC network performance (Table 1) [22][23][24][25][26]; however, [22] is the only work reported in the literature that introduces a distributed collaborative dynamic access control scheme utilizing DRL, and redefining network security architecture by combining anomaly detection, dynamic updates to user trust profiles, and collaborative adjustments for mitigation policies, to the best of authors knowledge. This scheme addresses the escalating challenge of insider threats in network security.…”

Section: Related Work (mentioning)

confidence: 99%

“…Traffic Allocation MADDPG based optimize traffic allocation policy for adaptive and automatic collaborative management, considering network security, network environment, and user requirements. 2020 [24] PPO Order Placement Development of a reinforcement learning agent for optimal order placement and inventory replenishment in SC management.…”

Section: Related Work (mentioning)

confidence: 99%


Blockchain-Based Zero Trust Supply Chain Security Integrated with Deep Reinforcement Learning

Ismail, Moudoud, Dawoud et al. 2024

Preprint


The modern supply chain (SC) is growing in terms of data, devices, users, and stakeholders, which introduced new security challenges and threats, especially with the reliance on centralized servers or cloud platforms. In addition, increased trust among system participants exposes the SC to a higher risk of vulnerabilities which require strong security measures. This article proposes a hybrid security framework for SC systems, BC-DRLzSC, that integrates Blockchain (BC) and Deep Reinforcement Learning (DRL) designed to operate in a zero trust (ZT) environment. In particular, we propose a decentralized BC-based approach integrated with smart contracts to manage system participant registration and authentication and to control access to system resources. BC-DRLzSC adopts a ZT architecture to reinforce SC security, which can be achieved with an advocate to verify each entity’s trustworthiness before granting or retaining access to system resources. Incorporating the ZT architecture, with BC and DRL, can potentially and significantly bolster SC system security. DRL is employed to develop a proactive attack detection model that continuously monitors the incoming traffic from authenticated nodes within the network and predicts any malicious actions. Finally, we evaluate the performance of our proposed DRL solution using the NSL-KDD dataset.


“…Intuition-based approaches are replaced by supply chain computerized solutions such as inventory management, warehousing, allocation, and replenishment. Hachaïchi et al (2020) aim at building a reinforcement learning agent capable of placing optimal orders for the sake of constructing a replenishment plan for next period. Current supply chain efficiency management methods cannot effectively control the risk caused by inefficient supply chain management.…”

Section: Literature Review (mentioning)

confidence: 99%

Research on Optimization Strategies for Closed-Loop Supply Chain Management Based on Deep Learning Technology

Gao 2024

International Journal of Information Systems and Supply Chain Management


This study explores the integration of deep learning (DL) technology and the guided simulated annealing algorithm (GSAA) to optimize closed-loop supply chains (CLSC) for sustainable development. By applying DL for predictive analysis and GSAA for optimization, the research aims to enhance CLSC operational efficiency and environmental sustainability. The methodology combines a review of the CLSC framework with practical applications of DL and GSAA, aiming to reduce waste, maximize resource utilization, and minimize environmental impact. An experimental comparison of this approach against traditional optimization strategies demonstrates the proposed method's superior effectiveness and efficiency. The findings reveal that the DL-GSAA optimization significantly improves CLSC sustainability and efficiency, with GSAA showing promising convergence properties. This study underscores the importance of advanced technological solutions in achieving sustainable supply chain management, offering practical insights for businesses and supply chain managers.


“…Experiments show that using DQN for one node achieves better results than using the base-stock policy for all nodes. Hachaïchi et al (2020) use PPO and DDPG to solve an inventory replenishment problem in a two-echelon supply chain. There is one distribution center and three stores, with local capacitated stocks.…”

Section: Related Work (mentioning)

confidence: 99%

Multi-echelon Supply Chains with Uncertain Seasonal Demands and Lead Times Using Deep Reinforcement Learning

Alves, Mateus 2022

Preprint


We address the problem of production planning and distribution in multi-echelon supply chains. We consider uncertain demands and lead times which makes the problem stochastic and non-linear. A Markov Decision Process formulation and a Non-linear Programming model are presented. As a sequential decision-making problem, Deep Reinforcement Learning (RL) is a possible solution approach. This type of technique has gained a lot of attention from Artificial Intelligence and Optimization communities in recent years. Considering the good results obtained with Deep RL approaches in different areas there is a growing interest in applying them in problems from the Operations Research field. We have used a Deep RL technique, namely Proximal Policy Optimization (PPO2), to solve the problem considering uncertain, regular and seasonal demands and constant or stochastic lead times. Experiments are carried out in different scenarios to better assess the suitability of the algorithm. An agent based on a linearized model is used as a baseline. Experimental results indicate that PPO2 is a competitive and adequate tool for this type of problem. PPO2 agent is better than baseline in all scenarios with stochastic lead times (7.3-11.2%), regardless of whether demands are seasonal or not. In scenarios with constant lead times, the PPO2 agent is better when uncertain demands are non-seasonal (2.2-4.7%). The results show that the greater the uncertainty of the scenario, the greater the viability of this type of approach.
