The two-dimensional organization of behavior

Mark Ring, Tom Schaul, Juergen Schmidhuber

Research output: Chapter in Book/Report/Conference proceedingConference contribution

11 Scopus citations

Abstract

This paper addresses the problem of continual learning [1] in a new way, combining multi-modular reinforcement learning with inspiration from the motor cortex to produce a unique perspective on hierarchical behavior. Most reinforcement-learning agents represent policies monolithically using a single table or function approximator. In those cases where the policies are split among a few different modules, these modules are related to each other only in that they work together to produce the agent's overall policy. In contrast, the brain appears to organize motor behavior in a two-dimensional map, where nearby locations represent similar behaviors. This representation allows the brain to build hierarchies of motor behavior that correspond not to hierarchies of subroutines but to regions of the map such that larger regions correspond to more general behaviors. Inspired by the benefits of the brain's representation, the system presented here is a first step and the first attempt toward the two-dimensional organization of learned policies according to behavioral similarity. We demonstrate a fully autonomous multi-modular system designed for the constant accumulation of ever more sophisticated skills (the continual-learning problem). The system can split up a complex task among a large number of simple modules such that nearby modules correspond to similar policies. The eventual goal is to develop and use the resulting organization hierarchically, accessing behaviors by their location and extent in the map. © 2011 IEEE.
Original languageEnglish (US)
Title of host publication2011 IEEE International Conference on Development and Learning, ICDL 2011
DOIs
StatePublished - Nov 1 2011
Externally publishedYes

Bibliographical note

Generated from Scopus record by KAUST IRTS on 2022-09-14

Fingerprint

Dive into the research topics of 'The two-dimensional organization of behavior'. Together they form a unique fingerprint.

Cite this