Supervised and Reinforcement Learning from Observations in Reconnaissance Blind Chess

Publication: Chapter in book/report/conference proceedings; conference contribution; peer-reviewed

Abstract

In this work, we adapt a training approach inspired by the original AlphaGo system to play the imperfect-information game of Reconnaissance Blind Chess (RBC). Using only the observations instead of a full description of the game state, we first train a supervised agent on publicly available game records. Next, we improve the agent's performance through self-play with the on-policy reinforcement learning algorithm Proximal Policy Optimization (PPO). To avoid problems caused by the partial observability of game states, we do not use any search and rely only on the policy network to generate moves when playing. With this approach, we achieve an Elo rating of 1330 on the RBC leaderboard, which places our agent at position 27 at the time of writing. We see that self-play significantly improves performance and that the agent plays acceptably well without search and without making assumptions about the true game state.
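The self-play stage named in the abstract relies on Proximal Policy Optimization. As a rough illustration only, the clipped surrogate objective that PPO is generally known for can be sketched as below. The abstract gives no hyperparameters or implementation details, so the function name `ppo_clip_objective`, its arguments, and the clip range of 0.2 are assumptions made for this sketch, not the authors' code.

```python
import math


def ppo_clip_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logps / old_logps: log-probabilities of the taken actions under the
    current policy and under the policy that collected the data.
    advantages: advantage estimates for those actions.
    clip_eps: clip range; 0.2 is an assumed, commonly used value.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the current and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When the two policies agree, the ratio is 1 and the objective reduces to the mean advantage; once the ratio leaves the clip range in the direction that would increase the objective, the clipped term takes over and the incentive to move further disappears.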
Original language: English
Title: Proceedings of the IEEE Conference on Games (CoG)
Place of publication: Beijing, China
Publisher: IEEE
Pages: 608-611
Number of pages: 4
ISBN (electronic): 9781665459891
DOIs
Publication status: Published - 2022

Publication series

Name: IEEE Conference on Computational Intelligence and Games, CIG
Volume: 2022-August
ISSN (print): 2325-4270
ISSN (electronic): 2325-4289

Fields of science

  • 102001 Artificial Intelligence
  • 102019 Machine Learning
  • 509014 Game research

JKU focus areas

  • Digital Transformation