Abstract
Goal-conditioned reinforcement learning (GCRL) trains agents to reach desired goals, which is challenging because configurations vary from task to task. Current GCRL methods are sample-inefficient and require substantial training data. Existing self-imitation-based GCRL approaches improve sample efficiency, but they scale poorly to large-scale tasks. In this paper, we propose integrating self-imitation learning with goal-conditioned RL methods in a single, compatible framework. Specifically, we introduce a novel target action value function that aggregates self-imitation learning and goal-conditioned reinforcement learning, so that the two policy training mechanisms work together to accomplish specific tasks. Moreover, we show theoretically that our approach can learn a policy superior to those learned by self-imitation learning or goal-conditioned reinforcement learning alone. Experimental results on various challenging robotic control tasks demonstrate the stability and effectiveness of our method compared to existing approaches.
Original language | English (US) |
---|---|
Article number | 109845 |
Journal | Pattern Recognition |
Volume | 144 |
State | Published - Dec 2023 |
Bibliographical note
Funding Information: This work is partially supported by the National Science Foundation of China (61976115, 61732006), the National Key R&D Program of China (2021ZD0113203), and the Pre-Research Foundation of (50912040302).
Publisher Copyright:
© 2023 Elsevier Ltd
Keywords
- Behavior cloning
- Deterministic policy gradient
- Goal-conditioned reinforcement learning
- Self-imitation learning
ASJC Scopus subject areas
- Software
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence