Recurrent world models facilitate policy evolution

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

195 Scopus citations

Abstract

A generative recurrent neural network is quickly trained in an unsupervised manner to model popular reinforcement learning environments through compressed spatio-temporal representations. The world model's extracted features are fed into compact and simple policies trained by evolution, achieving state-of-the-art results in various environments. We also train our agent entirely inside an environment generated by its own internal world model, and transfer this policy back into the actual environment. An interactive version of this paper is available at https://worldmodels.github.io.
Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems
Publisher: Neural Information Processing Systems Foundation
Pages: 2450-2462
Number of pages: 13
State: Published - Jan 1 2018
Externally published: Yes
