In this paper, we propose a three-dimensional autonomous UAV navigation framework based on the Deep Deterministic Policy Gradient (DDPG) learning approach. The objective is to employ a self-trained UAV as an airborne Internet of Things (IoT) unit that navigates around obstacles and reaches a destination point, where it can communicate with a ground sensor node at a sufficiently high data rate. We develop a customized reward function that aims to minimize the distance separating the UAV from its destination while penalizing collisions. A dynamic energy threshold is also set to redirect the UAV towards the charging station in case of battery depletion. We numerically simulate the behavior of the UAV as it learns the environmental obstacles and autonomously chooses trajectories in selected scenarios. Finally, we show that our learning approach achieves performance close to that of the graph-based Dijkstra's algorithm.
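As a rough illustration of the reward shaping and energy threshold described above, the following Python sketch shows one plausible form. The abstract does not state the exact reward or threshold expressions, so everything here is an assumption made for illustration: the names `reward` and `select_target`, the constant `COLLISION_PENALTY`, the parameters `energy_per_meter` and `margin`, and the linear energy model are all hypothetical.

```python
import math

# Assumed constant: the abstract does not give the penalty value.
COLLISION_PENALTY = 10.0


def select_target(position, destination, charger, battery,
                  energy_per_meter=0.01, margin=1.2):
    """Pick the current goal using a dynamic energy threshold (assumed form).

    The threshold grows with the distance to the charging station, so the
    UAV is redirected earlier when it is far from the charger.
    """
    threshold = margin * energy_per_meter * math.dist(position, charger)
    return charger if battery <= threshold else destination


def reward(prev_position, position, target, collided):
    """Reward progress towards the target and penalize collisions (assumed form)."""
    if collided:
        return -COLLISION_PENALTY
    # Positive when the step reduced the distance to the target.
    return math.dist(prev_position, target) - math.dist(position, target)


if __name__ == "__main__":
    start, step = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
    goal, charger = (10.0, 0.0, 0.0), (0.0, 0.0, 5.0)
    target = select_target(step, goal, charger, battery=1.0)
    print(target, reward(start, step, target, collided=False))
```

In a DDPG training loop, a function like `reward` would be evaluated after each continuous action, with `select_target` deciding whether the destination or the charging station is the current goal.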
Title of host publication: IEEE World Forum on Internet of Things, WF-IoT 2020 - Symposium Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Published: Jun 1 2020