In this study, we propose a reinforcement learning (RL) approach for minimizing the number of work overload situations in the mixed model sequencing (MMS) problem with stochastic processing times. The learning environment simulates stochastic processing times and penalizes work overloads with negative rewards. To account for the stochastic component of the problem, we implement a state representation that specifies whether work overloads will occur if the processing times are equal to their respective 25%, 50%, and 75% probability quantiles. Thereby, the RL agent is guided toward minimizing the number of overload situations while being provided with statistical information about how fluctuations in processing times affect the solution quality. To the best of our knowledge, this study is the first to consider the stochastic problem variation with a minimization of overload situations.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.