2021
DOI: 10.2507/ijsimm20-2-co7
A Deep Reinforcement Learning Based Solution for Flexible Job Shop Scheduling Problem

Abstract: The flexible job shop scheduling problem (FJSP) is a classic problem in combinatorial optimization and a very common form of organization in real production environments. Traditional approaches to the FJSP are ill-suited to complex and changeable production environments. Based on 3D disjunctive graph dispatching, this work proposes an end-to-end deep reinforcement learning (DRL) framework. In this framework, a modified pointer network, which consists of an encoder and a decoder, is adopted to encode the op…


Cited by 32 publications (8 citation statements)
References 13 publications
“…where A_t is the advantage function, computed as the difference between the discounted sum of rewards and the baseline estimate at the state as shown in (21), r_t(θ) is the ratio of the new policy to the old policy as described in (22), and ε is the clipping ratio.…”
Section: DRL Algorithm for Policy Training
confidence: 99%
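The clipped surrogate objective referred to in the statement above (advantage A_t, probability ratio r_t(θ), clipping ratio ε) can be sketched as follows. This is a minimal illustrative sketch of the standard PPO clipped loss, not the cited paper's implementation; the function name and toy values are assumptions:

```python
import numpy as np

def ppo_clipped_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Clipped surrogate objective in the PPO style.

    r_t(theta) = exp(new_logp - old_logp) is the new-to-old policy ratio;
    clipping it to [1 - eps, 1 + eps] keeps each update close to the old
    policy. Returned as a loss to minimise, hence the negation.
    """
    ratio = np.exp(new_logp - old_logp)                     # r_t(theta)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Pessimistic bound: take the elementwise minimum of the two terms.
    return -np.mean(np.minimum(unclipped, clipped))
```

With an unchanged policy (ratio 1) the loss reduces to minus the mean advantage; when the ratio drifts outside [1 − ε, 1 + ε], the clipped term caps the update.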
“…Recently, the use of DRL methods for solving scheduling problems has been gaining attention, and promising results have been obtained for the job shop scheduling problem (JSSP) [21], the flexible job shop scheduling problem (FJSP) [22], and the open shop scheduling problem (OSSP) [23]. In Ref.…”
Section: Review of Literature
confidence: 99%
“…In today's complex and varied production processes, dynamic events such as machine breakdowns or changes in the processing times and machine order of jobs inevitably need to be considered; DRL has achieved remarkable results on various combinatorial optimization problems such as the traveling salesman problem (TSP) [12], the vehicle routing problem (VRP) [13], and the JSP [14]. Existing research has shown that using reinforcement learning to solve the DJSP has at least four advantages: 1) RL does not require a complete mathematical model or large labeled datasets of the scheduling environment, but can learn from interaction with the environment and store the learned knowledge to achieve "offline learning and online application" [15]. 2) Unlike existing approaches, which have to reschedule jobs when faced with dynamic events, RL can adjust its learning strategy automatically and achieve adaptive scheduling [16].…”
Section: Introduction
confidence: 99%
“…Machines with intelligent agents evaluate the priorities of jobs and distribute them through negotiation. Han et al. [27] proposed an end-to-end DRL framework based on 3D disjunctive graph dispatching. They improved the pointer network and trained the policy with 20 static features and 24 dynamic features that describe the full picture of the scheduling problem.…”
Section: Introduction
confidence: 99%
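The pointer-network decoder mentioned above selects one input position per decoding step by scoring each encoded element against the current decoder state. A minimal sketch of that selection mechanism, using additive attention with randomly initialized stand-in weights (all names, dimensions, and parameters here are illustrative assumptions, not the cited model's trained parameters or feature set):

```python
import numpy as np

def pointer_scores(decoder_state, encoder_outputs, W_q, W_k, v):
    """Additive-attention 'pointer' over a set of encoder outputs.

    Each encoded element e_i is scored as v . tanh(W_q s + W_k e_i),
    where s is the decoder state; the softmax over scores is the
    distribution from which the next position (e.g. an operation to
    dispatch) would be selected.
    """
    q = W_q @ decoder_state                                  # project state
    scores = np.array([v @ np.tanh(q + W_k @ e) for e in encoder_outputs])
    exp = np.exp(scores - scores.max())                      # stable softmax
    return exp / exp.sum()

# Toy example: 5 candidate operations, 8-dimensional embeddings.
rng = np.random.default_rng(0)
d = 8
probs = pointer_scores(rng.standard_normal(d),
                       rng.standard_normal((5, d)),
                       rng.standard_normal((d, d)),
                       rng.standard_normal((d, d)),
                       rng.standard_normal(d))
```

In a full model, `decoder_state` would be updated after each selection and the chosen position masked out, so repeated decoding steps yield a dispatching sequence.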