2021
DOI: 10.1109/access.2020.3046784

Reinforcement Learning With Composite Rewards for Production Scheduling in a Smart Factory

Abstract: His research interest includes machine learning, internet of things, and scheduling of manufacturing systems.

Cited by 36 publications (12 citation statements)
References 30 publications
“…In equation (3), Percentage_of_errors can be obtained by calculating the percentage of the difference of the predicted value and the answer value, divided by the answer value as shown in equation (4).…”
Section: Results (mentioning)
confidence: 99%
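
The calculation described in this statement can be sketched as follows. This is a minimal illustration, not the citing paper's equation (4): the function name and the `absolute` flag are hypothetical, and the quoted text does not say whether the difference is taken as an absolute value, so that choice is an assumption.

```python
def percentage_of_errors(predicted, answer, absolute=True):
    """Difference between predicted and answer value, as a percentage of the answer value.

    Assumption: the quoted statement does not specify signed vs. absolute
    difference; `absolute=True` reports the magnitude only.
    """
    if answer == 0:
        raise ValueError("answer value must be non-zero")
    # Multiply before dividing so simple inputs stay exact in floating point.
    pct = 100.0 * (predicted - answer) / answer
    return abs(pct) if absolute else pct


if __name__ == "__main__":
    # A prediction of 110 against an answer of 100 is off by 10 percent.
    print(percentage_of_errors(110, 100))
```

With `absolute=False` the sign is kept, so an under-prediction yields a negative percentage.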
“…Recently, the manufacturing field's interest in smart factories has increased in response to these changes. Various techniques have been applied in an attempt to reduce the time and cost invested in facility management [4][5][6]. Earlier, facility management relied mainly on the experience and intuition of the person in charge.…”
Section: Introduction (mentioning)
confidence: 99%
“…For the dynamic multi-objective job shop scheduling problem with just-in-time constraint, Hong and Prabhu [70] modelled the problem as SMDP and introduced a novel scheduling algorithm by using Q-learning. The performance of the algorithm was significantly better than other scheduling rules.…”
Section: RL for Other Scheduling Problems (mentioning)
confidence: 99%
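
For orientation, a generic tabular Q-learning update for a semi-Markov decision process (SMDP) can be sketched as below. This is the textbook form in which the discount is raised to the sojourn time `tau`; it is not the algorithm of Hong and Prabhu, whose details are not given in the statement. All names, states, actions, and parameter values here are hypothetical.

```python
from collections import defaultdict


def smdp_q_update(q, state, action, reward, next_state, next_actions,
                  tau, alpha=0.1, gamma=0.95):
    """One tabular Q-learning step for an SMDP transition lasting `tau` time steps.

    Assumption: `reward` is the (already accumulated) reward for the whole
    transition, and the future value is discounted by gamma ** tau.
    """
    # Greedy value of the next state; 0.0 if no actions are available (terminal).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + (gamma ** tau) * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


if __name__ == "__main__":
    q = defaultdict(float)  # unseen state-action pairs start at 0.0
    actions = ["dispatch_job_A", "dispatch_job_B"]  # hypothetical dispatching actions
    smdp_q_update(q, "s0", "dispatch_job_A", reward=1.0,
                  next_state="s1", next_actions=actions, tau=2)
    print(q[("s0", "dispatch_job_A")])
```

With `tau=1` the update reduces to ordinary one-step Q-learning.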
“…We can highlight state space design [12,25,33,107,144,179,193,208,217,220,222,224,227,266,267] and action space design [109,220,246,268], reward construction [14,76,110,199,220,226,246,269-273], and exploration strategy planning [86,274] which can be determinants from the whole application point of view. [11,13,17,20,21,24,38,43,61,62,66,69,82,89,93], Allocation, assignment, resource management [20,22,…”
Section: Complexity (mentioning)
confidence: 99%
“…Referred publications
Markov decision process: [12,23,24,37,64,70,75,84,96,100,101,104,127,130,133,138,144,153,165,167,170,177,188,191,199,203,207,211,212,214,217,220,231,252,256-259,263,264,272,274,281,291,309,313,320,340,343,346,369-376]
Multi-armed bandit: [61,66,102,198,351,377,378]
Dynamic programming: [16,19,27,52,68,70,84,90,93,…”
Section: Approach (mentioning)
confidence: 99%