Abstract
Traditional Reinforcement Learning methods are insufficient for AGIs who must be able to learn to deal with Partially Observable Markov Decision Processes. We investigate a novel method for dealing with this problem: standard RL techniques using as input the hidden layer output of a Sequential Constant-Size Compressor (SCSC). The SCSC takes the form of a sequential Recurrent Auto-Associative Memory, trained through standard back-propagation. Results illustrate the feasibility of this approach - this system learns to deal with high-dimensional visual observations (up to 640 pixels) in partially observable environments where there are long time lags (up to 12 steps) between relevant sensory information and necessary action. © 2011 Springer-Verlag Berlin Heidelberg.
Original language | English (US) |
---|---|
Title of host publication | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Pages | 31-40 |
Number of pages | 10 |
DOIs | |
State | Published - Aug 11 2011 |
Externally published | Yes |
Bibliographical note
Generated from Scopus record by KAUST IRTS on 2022-09-14ASJC Scopus subject areas
- Theoretical Computer Science
- General Computer Science