Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Meyer, Anne | -
dc.contributor.author | Rinciog, Alexandru | -
dc.date.accessioned | 2024-02-07T06:54:19Z | -
dc.date.available | 2024-02-07T06:54:19Z | -
dc.date.issued | 2023 | -
dc.identifier.uri | http://hdl.handle.net/2003/42306 | -
dc.identifier.uri | http://dx.doi.org/10.17877/DE290R-24143 | -
dc.description.abstract | Recent years have seen a sharp rise in interest in Reinforcement Learning (RL) approaches for production scheduling. This is because RL is seen as an advantageous compromise between the two most common scheduling solution approaches, namely priority rules and exact methods. However, there are many variations of both production scheduling problems and RL solutions. Additionally, the RL production scheduling literature lacks standardization, which leaves the field shrouded in mysticism. The burden of showcasing the exact situations in which RL outshines other approaches still lies with the research community. To pave the way towards this goal, we make the following four contributions to the scientific community, aiding in the process of RL demystification. First, we develop a standardization framework for RL scheduling approaches, using a comprehensive literature review as a conduit. Second, we design and implement FabricatioRL, an open-source benchmarking simulation framework for production scheduling that covers a vast array of scheduling problems and ensures experiment reproducibility. Third, we create a set of baseline scheduling algorithms sharing some of the RL advantages. This set of RL-competitive algorithms consists of a Constraint Programming (CP) meta-heuristic of our own design, CP3, and two simulation-based approaches, namely a novel approach we call Simulation Search, and Monte Carlo Tree Search. Fourth and finally, we use FabricatioRL to build two benchmarking instances, one for each of two popular stochastic production scheduling problems, and run fully reproducible experiments on them, pitting Double Deep Q-Networks (DDQN) and AlphaGo Zero (AZ) against the chosen baselines and priority rules. Our results show that AZ marginally outperforms priority rules and DDQN, but fails to outperform our competitive baselines. | en
dc.language.iso | en | de
dc.subject | Dynamic scheduling | en
dc.subject | Reinforcement learning simulation | en
dc.subject | AlphaZero | en
dc.subject | Constraint programming | en
dc.subject | DQN | en
dc.subject | MCTS | en
dc.subject | Priority rules | en
dc.subject.ddc | 620 | -
dc.title | Demystifying reinforcement learning approaches for production scheduling | en
dc.type | Text | de
dc.contributor.referee | Liebig, Thomas | -
dc.date.accepted | 2023-11-09 | -
dc.type.publicationtype | PhDThesis | de
dc.subject.rswk | Produktionsplanung | de
dc.subject.rswk | Bestärkendes Lernen <Künstliche Intelligenz> | de
dc.subject.rswk | Constraint-Programmierung | de
dc.subject.rswk | Monte-Carlo-Simulation | de
dc.subject.rswk | Produktionssteuerung | de
dcterms.accessRights | open access | -
eldorado.secondarypublication | false | de
Appears in Collections: Fakultät Maschinenbau

Files in This Item:
File | Description | Size | Format
Dissertation_Rinciog.pdf | DNB | 24.32 MB | Adobe PDF


This item is protected by original copyright