Multi-dimensional deep memory Atari-Go players for parameter exploring policy gradients

Mandy Grüttner, Frank Sehnke, Tom Schaul, Jürgen Schmidhuber

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

Developing superior artificial board-game players is a widely-studied area of Artificial Intelligence. Among the most challenging games is the Asian game of Go, which, despite its deceptively simple rules, has eluded the development of artificial expert players. In this paper we attempt to tackle this challenge through a combination of two recent developments in Machine Learning. We employ Multi-Dimensional Recurrent Neural Networks with Long Short-Term Memory cells to handle the multi-dimensional data of the board game in a very natural way. In order to improve the convergence rate, as well as the ultimate performance, we train those networks using Policy Gradients with Parameter-based Exploration, a recently developed Reinforcement Learning algorithm which has been found to have numerous advantages over Evolution Strategies. Our empirical results confirm the promise of this approach, and we discuss how it can be scaled up to expert-level Go players.
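The core training algorithm named in the abstract, Policy Gradients with Parameter-based Exploration (PGPE), perturbs the network's parameters directly rather than its actions: parameters are drawn from a Gaussian whose mean and per-parameter standard deviation are adapted from episode rewards. The sketch below is a minimal illustration of that idea with symmetric sampling on a toy quadratic reward standing in for an Atari-Go episode; it is not the authors' implementation, and all names, hyperparameters, and the clamping of sigma are illustrative assumptions.

```python
import numpy as np

def pgpe(fitness, dim, iterations=300, alpha_mu=0.1, alpha_sigma=0.05, seed=0):
    """Minimal PGPE sketch with symmetric sampling (toy illustration only)."""
    rng = np.random.default_rng(seed)
    mu = np.zeros(dim)       # mean of the parameter distribution
    sigma = np.ones(dim)     # per-parameter exploration std-dev
    baseline = 0.0           # moving-average reward baseline
    for _ in range(iterations):
        eps = rng.normal(0.0, sigma)     # parameter-space perturbation
        r_plus = fitness(mu + eps)       # evaluate the two symmetric samples
        r_minus = fitness(mu - eps)
        # gradient estimate for the mean: move toward the better perturbation
        mu += alpha_mu * eps * (r_plus - r_minus) / 2.0
        # gradient estimate for the std-dev: grow exploration where
        # larger-than-typical perturbations beat the baseline
        r_avg = (r_plus + r_minus) / 2.0
        sigma += alpha_sigma * (r_avg - baseline) * (eps**2 - sigma**2) / sigma
        sigma = np.clip(sigma, 1e-2, 1.0)  # keep exploration in a sane range
        baseline = 0.9 * baseline + 0.1 * r_avg
    return mu

# Toy stand-in for an episode reward: peaks when the parameters hit `target`.
target = np.array([1.0, -2.0, 0.5])
mu = pgpe(lambda theta: -np.sum((theta - target) ** 2), dim=3)
```

In the paper's setting, `fitness` would instead run the MDRNN with the sampled weights through a full game of Atari-Go and return the outcome; because exploration lives in parameter space, each episode is played with a single consistent policy, one of the advantages PGPE has over action-space perturbation.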
Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 114-123
Number of pages: 10
DOIs
State: Published - Nov 8 2010
Externally published: Yes

Bibliographical note

Generated from Scopus record by KAUST IRTS on 2022-09-14

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

