
Memory-based deep reinforcement learning in endless imperfect information games

dc.contributor.advisor: Rudolph, Günter
dc.contributor.author: Pleines, Marco
dc.contributor.referee: Preuss, Mike
dc.date.accepted: 2023-12-11
dc.date.accessioned: 2024-03-15T10:07:51Z
dc.date.available: 2024-03-15T10:07:51Z
dc.date.issued: 2023
dc.description.abstract: Memory capabilities in Deep Reinforcement Learning (DRL) agents have become increasingly crucial, especially in tasks characterized by partial observability or imperfect information. However, the field faces two significant challenges: the absence of a universally accepted benchmark and limited access to open-source baseline implementations. We present "Memory Gym", a novel benchmark suite encompassing both finite and endless versions of the Mortar Mayhem, Mystery Path, and Searing Spotlights environments. The finite tasks emphasize strong dependencies on memory and memory interactions, while the endless tasks, inspired by the game "I packed my bag", act as an automatic curriculum, progressively challenging an agent's retention and recall capabilities. To complement this benchmark, we provide two comprehensible and open-source baselines anchored on the widely adopted Proximal Policy Optimization algorithm. The first employs a recurrent mechanism through a Gated Recurrent Unit (GRU) cell, while the second adopts an attention-based approach using Transformer-XL (TrXL) for episodic memory with a sliding window. Given the dearth of readily available transformer-based DRL implementations, our TrXL baseline offers significant value. Our results reveal an intriguing performance dynamic: TrXL is often superior in finite tasks, but in the endless environments, GRU unexpectedly marks a comeback. This discrepancy prompts further investigation into TrXL's potential limitations, including whether its initial query misses temporal cues, the impact of stale hidden states, and the intricacies of positional encoding.
dc.identifier.uri: http://hdl.handle.net/2003/42391
dc.identifier.uri: http://dx.doi.org/10.17877/DE290R-24227
dc.language.iso: en
dc.subject: Memory-based agents
dc.subject: Deep reinforcement learning
dc.subject: Benchmarking
dc.subject: Transformer-XL
dc.subject: Gated recurrent unit
dc.subject.ddc: 004
dc.subject.rswk: Agent <Informatik>
dc.subject.rswk: Deep learning
dc.subject.rswk: Benchmark
dc.title: Memory-based deep reinforcement learning in endless imperfect information games
dc.type: Text
dc.type.publicationtype: PhDThesis
dcterms.accessRights: open access
eldorado.dnb.deposit: true
eldorado.secondarypublication: false
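The abstract contrasts a GRU-based recurrent baseline with a Transformer-XL baseline that attends over an episodic memory limited to a sliding window of recent steps. The bookkeeping behind such a window can be sketched as follows; this is a minimal illustration under stated assumptions, not the thesis implementation, and the names `SlidingMemoryWindow`, `window_size`, and `context` are hypothetical:

```python
from collections import deque

class SlidingMemoryWindow:
    """Minimal sketch of a sliding episodic memory: only the most recent
    `window_size` step embeddings are retained and exposed as the
    attention context for the current step."""

    def __init__(self, window_size: int):
        # deque with maxlen drops the oldest entry automatically
        # once the window is full.
        self.buffer = deque(maxlen=window_size)

    def add(self, embedding) -> None:
        # Append the embedding of the current time step.
        self.buffer.append(embedding)

    def context(self) -> list:
        # At most `window_size` past embeddings, oldest first.
        return list(self.buffer)

mem = SlidingMemoryWindow(window_size=3)
for step in range(5):
    mem.add(f"obs_{step}")
print(mem.context())  # → ['obs_2', 'obs_3', 'obs_4']
```

In the actual TrXL baseline the window would hold learned step embeddings rather than strings, and the attention layers would consume the whole window at once; the deque here only illustrates the eviction behavior of the window.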

Files

Original bundle
Name: Dissertation_Pleines.pdf
Size: 3.89 MB
Format: Adobe Portable Document Format
Description: DNB

License bundle
Name: license.txt
Size: 4.85 KB
Format: Item-specific license agreed upon at submission