Demystifying reinforcement learning approaches for production scheduling

dc.contributor.advisor: Meyer, Anne
dc.contributor.author: Rinciog, Alexandru
dc.contributor.referee: Liebig, Thomas
dc.date.accepted: 2023-11-09
dc.date.accessioned: 2024-02-07T06:54:19Z
dc.date.available: 2024-02-07T06:54:19Z
dc.date.issued: 2023
dc.description.abstract: Recent years have seen a sharp rise in interest in Reinforcement Learning (RL) approaches for production scheduling, because RL is seen as an advantageous compromise between the two most common scheduling solution approaches, namely priority rules and exact methods (a generic priority-rule sketch follows this record for reference). However, there are many variations of both production scheduling problems and RL solutions. Additionally, the RL production scheduling literature lacks standardization, which shrouds the field in mysticism. The burden of showing exactly when RL outshines other approaches still lies with the research community. To pave the way towards this goal, we make four contributions to the scientific community, aiding in the demystification of RL. First, we develop a standardization framework for RL scheduling approaches by means of a comprehensive literature review. Second, we design and implement FabricatioRL, an open-source benchmarking simulation framework for production scheduling that covers a vast array of scheduling problems and ensures experiment reproducibility. Third, we create a set of baseline scheduling algorithms that share some of RL's advantages. This set of RL-competitive algorithms consists of CP3, a Constraint Programming (CP) meta-heuristic we developed, and two simulation-based approaches: a novel approach we call Simulation Search, and Monte Carlo Tree Search (MCTS). Fourth and finally, we use FabricatioRL to build benchmarking instances for two popular stochastic production scheduling problems and run fully reproducible experiments on them, pitting Double Deep Q-Networks (DDQN) and AlphaGo Zero (AZ) against the chosen baselines and priority rules. Our results show that AZ manages to marginally outperform priority rules and DDQN, but fails to outperform our competitive baselines. [en]
dc.identifier.uri: http://hdl.handle.net/2003/42306
dc.identifier.uri: http://dx.doi.org/10.17877/DE290R-24143
dc.language.iso: en [de]
dc.subject: Dynamic scheduling [en]
dc.subject: Reinforcement learning simulation [en]
dc.subject: AlphaZero [en]
dc.subject: Constraint programming [en]
dc.subject: DQN [en]
dc.subject: MCTS [en]
dc.subject: Priority rules [en]
dc.subject.ddc: 620
dc.subject.rswk: Produktionsplanung [de]
dc.subject.rswk: Bestärkendes Lernen <Künstliche Intelligenz> [de]
dc.subject.rswk: Constraint-Programmierung [de]
dc.subject.rswk: Monte-Carlo-Simulation [de]
dc.subject.rswk: Produktionssteuerung [de]
dc.title: Demystifying reinforcement learning approaches for production scheduling [en]
dc.type: Text [de]
dc.type.publicationtype: PhDThesis [de]
dcterms.accessRights: open access
eldorado.secondarypublication: false [de]
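
For readers unfamiliar with the priority rules mentioned in the abstract, the following is a minimal, generic sketch of one such rule, Shortest Processing Time (SPT) dispatching. It is an illustration only, not code from the thesis or from FabricatioRL; the Operation class and spt_dispatch function are hypothetical names introduced here.

    # Minimal sketch of a priority-rule dispatcher (illustration only).
    # Whenever a machine becomes free, the Shortest Processing Time (SPT)
    # rule picks, among the operations queued at that machine, the one
    # with the smallest processing time.
    from dataclasses import dataclass

    @dataclass
    class Operation:  # hypothetical container, not a FabricatioRL type
        job_id: int
        machine_id: int
        processing_time: float

    def spt_dispatch(queue: list[Operation]) -> Operation:
        """Return the queued operation with the shortest processing time."""
        return min(queue, key=lambda op: op.processing_time)

    # Usage: three operations waiting at the same machine; SPT picks job 1.
    queue = [
        Operation(job_id=0, machine_id=2, processing_time=5.0),
        Operation(job_id=1, machine_id=2, processing_time=3.0),
        Operation(job_id=2, machine_id=2, processing_time=8.0),
    ]
    print(spt_dispatch(queue).job_id)  # -> 1

Such rules are cheap and reactive but myopic, which is precisely the gap the RL agents and the CP- and simulation-based baselines discussed in the abstract aim to close.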

Files

Original bundle
Name: Dissertation_Rinciog.pdf
Size: 23.75 MB
Format: Adobe Portable Document Format
Description: DNB
License bundle
Name: license.txt
Size: 4.85 KB
Format: Item-specific license agreed upon submission