2021
DOI: 10.1007/s00521-021-06129-w

Scalable multi-product inventory control with lead time constraints using reinforcement learning

Cited by 16 publications (9 citation statements)
References 53 publications
“…However, the complex structure of current SCs, especially global ones with many stages and nodes, the number of variables in the modeled problem, and its intrinsically stochastic nature mean that modeling real cases with reinforcement learning alone, without the assistance of other methods, remains a considerable challenge. Only through the gradual incorporation of DRL [69], which combines reinforcement learning with deep learning (another ML methodology that uses artificial neural networks to map a set of inputs into a set of outputs and can handle complex, high-dimensional raw input data [91]), has it become possible to study SCs of some complexity, e.g.: (i) the multistage SC problem of Alves and Mateus [67], validated on a four-stage SC with two nodes per stage, local inventories, lead time, a single product, and demand uncertainty; (ii) the capacitated SC problem of Peng et al. [68], validated on a three-stage SC with one node in the first stage, two in the second, and three in the last, capacitated production, independent, stochastic, and seasonal demand, and a single product; (iii) the case of Meisheri et al. [92] who, although they restrict validation of their retail inventory replenishment to the last SC layers, i.e., warehouse and retailer, consider product variety, with instances of 100 and 220 products (substantially increasing the combinatorial burden), and incorporate lead time, limited storage capacity, cross-product constraints, and weight and volume transportation restrictions. Computational limitations in this regard manifest as the size of the problem to be solved, in terms of the size of the input dataset and, especially, the size of the modeled problem's observation space.…”
Section: Discussion
confidence: 99%
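To make the point about observation-space size concrete, the back-of-the-envelope sketch below shows how quickly a flat state encoding grows with stages, nodes, products, and lead times. All figures are hypothetical, loosely inspired by the scenario sizes quoted in the statement above, and do not reproduce any of the cited formulations.

# Rough size of the observation space for a flat state encoding: each product at
# each node contributes its on-hand level plus one slot per in-transit period
# (bounded by the lead time). All numbers below are illustrative assumptions.
n_stages, nodes_per_stage = 4, 2
n_products, max_lead_time = 220, 7

obs_dim = n_stages * nodes_per_stage * n_products * (1 + max_lead_time)
print(obs_dim)  # 14080 state variables, before adding demand forecasts etc.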
“…In the retail industry, having multiple products with uncertain demands and different lead times makes determining the optimal inventory replenishment policy highly challenging (Meisheri et al. 2021). Meisheri et al. (2021) addressed these challenges in a multi-period, multi-product system using DRL.…”
Section: Inventory Management
confidence: 99%
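As an illustration of the kind of system described in that statement, the sketch below sets up a minimal multi-period, multi-product inventory environment with per-product lead times, stochastic demand, and a shared storage capacity. The class name, cost parameters, and Poisson demand are assumptions made for illustration only, not the formulation used by Meisheri et al. (2021).

import numpy as np

class MultiProductInventoryEnv:
    """Minimal multi-period, multi-product inventory environment with
    per-product lead times and stochastic demand (illustrative sketch)."""

    def __init__(self, n_products=3, lead_times=(1, 2, 4), capacity=100.0,
                 holding_cost=0.1, stockout_cost=1.0, seed=0):
        self.n = n_products
        self.lead_times = np.asarray(lead_times)
        self.capacity = capacity          # shared storage (cross-product constraint)
        self.h, self.p = holding_cost, stockout_cost
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        self.on_hand = np.full(self.n, 10.0)
        # pipeline[i, k] = quantity of product i arriving k periods from now
        self.pipeline = np.zeros((self.n, self.lead_times.max() + 1))
        return self._obs()

    def _obs(self):
        # State: on-hand stock plus the full in-transit pipeline per product
        return np.concatenate([self.on_hand, self.pipeline.ravel()])

    def step(self, order_qty):
        order_qty = np.asarray(order_qty, dtype=float)
        # Orders placed now arrive after each product's lead time
        self.pipeline[np.arange(self.n), self.lead_times] += order_qty
        # Receive today's arrivals, then advance the pipeline by one period
        self.on_hand += self.pipeline[:, 0]
        self.pipeline = np.roll(self.pipeline, -1, axis=1)
        self.pipeline[:, -1] = 0.0
        # Stochastic demand; unmet demand is lost
        demand = self.rng.poisson(5.0, size=self.n)
        sales = np.minimum(self.on_hand, demand)
        lost = demand - sales
        self.on_hand -= sales
        # Penalize exceeding the shared capacity (simplified treatment)
        overflow = max(self.on_hand.sum() - self.capacity, 0.0)
        reward = -(self.h * self.on_hand.sum() + self.p * lost.sum() + overflow)
        return self._obs(), reward, False, {}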
“…Two DRL techniques applied in the reviewed papers are Deep Q-Network (DQN), proposed by Mnih et al. (2015), and Proximal Policy Optimization (PPO), introduced by Schulman et al. (2017). Meisheri et al. (2021) employed both DQN and PPO to determine optimal replenishment decisions for retail businesses under uncertain demand, with multiple products having different lead times and cross-product constraints. Their results showed better performance for DQN.…”
Section: Deep Reinforcement Learning (DRL) and Its Applications in Th...
confidence: 99%
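A minimal sketch of the DQN side of such a comparison is given below, assuming a linear Q-function over the environment state and a discretized set of order levels (DQN requires a finite action set, whereas PPO can act on continuous order quantities). The names, action levels, and hyperparameters are assumptions for illustration, not those used in the cited works.

import numpy as np

# Hypothetical discretized order levels per product (fractions of a base quantity)
ACTION_LEVELS = np.array([0.0, 0.5, 1.0, 2.0])
GAMMA, ALPHA, EPS = 0.99, 1e-3, 0.1

def q_values(w, state):
    """Linear Q-function: one weight row per discrete action level."""
    return w @ state  # shape: (n_actions,)

def dqn_step(w, w_target, state, next_state, action_idx, reward, rng):
    """One temporal-difference update toward the DQN target
    y = r + gamma * max_a' Q_target(s', a'), plus epsilon-greedy selection."""
    target = reward + GAMMA * q_values(w_target, next_state).max()
    td_error = target - q_values(w, state)[action_idx]
    w[action_idx] += ALPHA * td_error * state      # gradient step for linear Q
    if rng.random() < EPS:
        next_action = int(rng.integers(len(ACTION_LEVELS)))
    else:
        next_action = int(np.argmax(q_values(w, next_state)))
    return w, next_action

# Example initialization for an 8-dimensional state (illustrative only)
state_dim, n_actions = 8, len(ACTION_LEVELS)
rng = np.random.default_rng(0)
w = np.zeros((n_actions, state_dim)); w_target = w.copy()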
“…DRL methods were also considered for inventory control of multiple products. To handle such problems effectively, [16] utilized a multi-agent reinforcement learning (MARL) method to replenish the products without giving unfair treatment to certain products.…”
Section: B. Reinforcement Learning in Inventory Management
confidence: 99%
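One simple way to avoid starving individual products when independent per-product learners share a constraint is a proportional allocation step applied after each agent proposes its order. The sketch below is a hypothetical illustration of that coordination idea, not the MARL scheme of [16].

import numpy as np

def proportional_fair_allocation(requested, capacity):
    """If the agents' combined requests exceed the shared capacity, scale every
    product's order by the same factor so no single product is starved outright."""
    total = requested.sum()
    if total <= capacity:
        return requested
    return requested * (capacity / total)

# Hypothetical usage with one learner per product: each agent proposes an order
# from its own policy, then the coordination step reconciles the shared constraint.
rng = np.random.default_rng(0)
requested = rng.uniform(0, 20, size=5)      # per-product proposed order quantities
allocated = proportional_fair_allocation(requested, capacity=50.0)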