Dynamic scheduling of semiconductor manufacturing systems (SMSs) is becoming more complicated and challenging due to internal uncertainties and external demand changes. This paper therefore addresses the integrated release control and production scheduling problem under uncertain processing times and urgent orders, and proposes a method based on a convolutional neural network and asynchronous advantage actor–critic (CNN-A3C) that involves a training phase and a deployment phase. In the training phase, actor–critic networks are trained to predict the evaluation of scheduling decisions and to output the optimal scheduling decision. In the deployment phase, the trained networks periodically generate the most appropriate release control and scheduling decisions for the current production status. Furthermore, we improve four key elements of the deep reinforcement learning (DRL) algorithm (state space, action space, reward function, and network structure) by designing four mechanisms: a sliding-window-based two-dimensional state perception mechanism, an adaptive reward function that considers multiple objectives and automatically adjusts to dynamic events, a continuous action space based on composite dispatching rules (CDR) and release strategies, and actor–critic networks based on convolutional neural networks (CNNs). To verify its feasibility and effectiveness, the proposed dynamic scheduling method is implemented on a simplified SMS. The simulation results show that the proposed method outperforms the unimproved A3C-based method and the common dispatching rules under the new uncertain scenarios.
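To make the idea of a continuous action space over composite dispatching rules more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the agent's action is a weight vector over three common basic rules (shortest processing time, earliest due date, first in first out), and that the lot with the highest weighted score is dispatched next. The `Lot` fields, the score normalizations, and the function names `composite_priority` and `select_lot` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Lot:
    """Hypothetical lot record; field names are illustrative assumptions."""
    name: str
    processing_time: float
    due_date: float
    arrival_time: float


def composite_priority(lot, weights, now):
    """Weighted sum of three basic-rule scores, each scaled into (0, 1]."""
    w_spt, w_edd, w_fifo = weights
    # Shortest processing time: shorter jobs score higher.
    spt = 1.0 / (1.0 + lot.processing_time)
    # Earliest due date: less remaining slack scores higher.
    edd = 1.0 / (1.0 + max(lot.due_date - now, 0.0))
    # First in, first out: longer waiting time scores higher.
    fifo = 1.0 - 1.0 / (1.0 + max(now - lot.arrival_time, 0.0))
    return w_spt * spt + w_edd * edd + w_fifo * fifo


def select_lot(queue, weights, now):
    """Dispatch the lot with the highest composite priority."""
    return max(queue, key=lambda lot: composite_priority(lot, weights, now))


queue = [
    Lot("A", processing_time=5.0, due_date=100.0, arrival_time=0.0),
    Lot("B", processing_time=2.0, due_date=50.0, arrival_time=3.0),
    Lot("C", processing_time=8.0, due_date=20.0, arrival_time=1.0),
]

# In a DRL setting the weights would come from the actor network;
# here they are fixed by hand to show how they shift the decision.
chosen = select_lot(queue, weights=(0.2, 0.7, 0.1), now=10.0)
```

Under this assumption, a weight vector concentrated on one component reproduces the corresponding basic rule, while intermediate weights blend them, which is what lets a continuous action interpolate between dispatching behaviors.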
Fault-tolerant control policies that automatically restart PLC-based aPS during fault recovery can increase system availability. This paper provides a proof of concept that such policies can be synthesized with DRL. The authors specifically focus on systems with multiple end-effectors that are actuated in only one or two axes, commonly used for assembly and logistics tasks. Because multi-end-effector systems have a large number of actuators and offer limited possibilities to track workpieces in a single coordinate system, control policies for them are especially challenging to learn. This paper demonstrates that a hierarchical multi-agent DRL approach together with a separate coordinate prediction module per agent can overcome these challenges. The evaluation of the suggested approach on the simulation of a small laboratory demonstrator shows that it is capable of restarting the system and completing open tasks as part of fault recovery.