It is difficult to apply traditional reinforcement learning algorithms to robots because of large, continuous domains, partial observability, and a limited number of learning experiences. This paper addresses these problems by combining (1) reinforcement learning with memory, implemented using an LSTM recurrent neural network whose inputs are discrete events extracted from raw inputs, and (2) online exploration with offline policy learning. An experiment with a real robot demonstrates the methodology's feasibility.
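The core idea of the abstract, an LSTM whose hidden state serves as memory over a stream of discrete events, can be illustrated with a minimal sketch. This is not the paper's implementation: the class name, dimensions, weights, and the linear greedy readout below are all hypothetical, and the cell is deliberately tiny and untrained, only showing how discrete events become one-hot inputs and how the recurrent state retains information about earlier events.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class LSTMCell:
    """Minimal LSTM over one-hot 'event' inputs (illustrative sketch only)."""

    def __init__(self, n_events, n_hidden):
        self.n_events, self.n_hidden = n_events, n_hidden
        dim = n_events + n_hidden
        # One small random weight matrix per gate:
        # input (i), forget (f), output (o), candidate (c).
        self.W = {g: [[random.uniform(-0.1, 0.1) for _ in range(dim)]
                      for _ in range(n_hidden)] for g in "ifoc"}
        self.h = [0.0] * n_hidden   # hidden state (the "memory")
        self.c = [0.0] * n_hidden   # cell state

    def step(self, event_id):
        """Feed one discrete event; returns the updated hidden state."""
        x = [0.0] * self.n_events
        x[event_id] = 1.0           # discrete event -> one-hot vector
        z = x + self.h              # concatenate input with previous state

        def gate(g, f):
            return [f(sum(w * v for w, v in zip(row, z))) for row in self.W[g]]

        i, f_, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
        cand = gate("c", math.tanh)
        self.c = [ft * ct + it * gt
                  for ft, ct, it, gt in zip(f_, self.c, i, cand)]
        self.h = [ot * math.tanh(ct) for ot, ct in zip(o, self.c)]
        return self.h

    def act(self, q_weights):
        """Greedy action from a linear Q readout over the memory state
        (a stand-in for a learned policy, not the paper's method)."""
        qs = [sum(w * h for w, h in zip(row, self.h)) for row in q_weights]
        return max(range(len(qs)), key=qs.__getitem__)
```

Because the hidden state summarizes the whole event history, two sequences that differ only in an early event leave the cell in different states, which is exactly the property a memory-based policy needs under partial observability.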
Original language: English (US)
Title of host publication: IEEE International Conference on Intelligent Robots and Systems
Number of pages: 6
State: Published - Dec 26 2003