Deep Recurrent Belief Propagation Network for POMDPs

Yuhui Wang, Xiaoyang Tan

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    8 Scopus citations

    Abstract

    In many real-world sequential decision-making tasks, especially continuous control tasks such as robotic control, observations are rarely perfect: sensory data can be incomplete, noisy, or even dynamically polluted because of unexpected malfunctions or the intrinsically low quality of the sensors. Previous methods handle these issues in the framework of partially observable Markov decision processes (POMDPs) and are either deterministic, relying on feature memorization, or stochastic, relying on belief inference. In this paper, we present a new method that lies between these two families and combines the strengths of both. In particular, the proposed method, named Deep Recurrent Belief Propagation Network (DRBPN), uses a hybrid belief-updating procedure: an RNN-type feature extraction step followed by an analytical belief inference step. This significantly reduces the computational cost while faithfully capturing the complex dynamics and maintaining the uncertainty necessary for generalization. The effectiveness of the proposed method is verified on a collection of benchmark tasks, showing that our approach outperforms several state-of-the-art methods under various challenging scenarios.

    Original language: English (US)
    Title of host publication: 35th AAAI Conference on Artificial Intelligence, AAAI 2021
    Publisher: Association for the Advancement of Artificial Intelligence
    Pages: 10236-10244
    Number of pages: 9
    ISBN (Electronic): 9781713835974
    DOIs
    State: Published - 2021
    Event: 35th AAAI Conference on Artificial Intelligence, AAAI 2021 - Virtual, Online
    Duration: Feb 2 2021 to Feb 9 2021

    Publication series

    Name: 35th AAAI Conference on Artificial Intelligence, AAAI 2021
    Volume: 11B

    Conference

    Conference: 35th AAAI Conference on Artificial Intelligence, AAAI 2021
    City: Virtual, Online
    Period: 02/2/21 to 02/9/21

    Bibliographical note

    Funding Information:
    This work is partially supported by the National Science Foundation of China (61732006, 61976115), the AI+ Project of NUAA (XZA20005, 56XZA18009), a research project (315025305), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX19 0195). We would also like to thank the anonymous reviewers for offering thoughtful comments and helpful advice on earlier versions of this work.

    Publisher Copyright:
    Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

    ASJC Scopus subject areas

    • Artificial Intelligence
