This article proposes an optimal distributed control protocol for multivehicle systems with an unknown switching communication graph. The optimal distributed control problem is formulated as a differential graphical game, and the Pareto optimum of the multiplayer game is sought using viability theory and reinforcement learning techniques. Viability theory characterizes the controllability of a wide range of constrained nonlinear systems, and its two pillars are the viability kernel and the capture basin. The capture basin is the set of all initial states from which there exist control strategies that drive the state to the target in finite time while keeping it inside a given set until the target is reached. In this regard, the feasible learning region is characterized for the reinforcement learner, and the approximation of the capture basin provides the learner with prior knowledge. Unlike existing works, which employ viability theory to solve control problems with only one agent and differential games with only two players, this article uses viability theory to solve multiagent control problems and multiplayer differential games. The distributed control law is composed of two parts: 1) the approximation of the capture basin and 2) reinforcement learning, which are computed offline and online, respectively. The convergence of the parameter estimation errors in reinforcement learning is proved, and the convergence of the control policy to the Pareto optimum of the differential graphical game is discussed. Guaranteed approximation results for the capture basin and simulation results for the differential graphical game are provided for multivehicle systems under the proposed distributed control policy.
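The capture basin definition above can be made concrete with a small sketch. The example below is not the article's method: it assumes a hypothetical discrete-time, finite-state toy system (integer positions on a line, controls that move one step left or right or stay put), whereas the article treats constrained nonlinear multivehicle dynamics with guaranteed approximations. The names `capture_basin`, `step`, `constraint`, and `target` are illustrative only.

```python
def capture_basin(states, controls, step, constraint, target):
    """Backward fixed-point iteration for a finite-state capture basin.

    Returns the states from which some control sequence reaches `target`
    in finitely many steps while staying inside `constraint` until then.
    Assumes `target` is a subset of `constraint`.
    """
    basin = set(target)
    changed = True
    while changed:
        changed = False
        for x in states:
            # Skip states already captured or outside the constraint set.
            if x in basin or x not in constraint:
                continue
            # x is captured if at least one control leads into the basin.
            if any(step(x, u) in basin for u in controls):
                basin.add(x)
                changed = True
    return basin


# Toy setup (assumed for illustration): positions 0..10, position 5 forbidden.
states = range(11)
constraint = set(states) - {5}
target = {9}
controls = (-1, 0, 1)


def step(x, u):
    # Hypothetical dynamics: move by u each time step.
    return x + u


basin = capture_basin(states, controls, step, constraint, target)
print(sorted(basin))  # [6, 7, 8, 9, 10]
```

Positions 0 through 4 are excluded because every path from them to the target must cross the forbidden position 5. In the spirit of the abstract, a set like `basin` is the kind of prior knowledge that could restrict a learner to states from which the target remains reachable.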
|Original language|English (US)|
|Number of pages|14|
|Journal|IEEE Transactions on Cybernetics|
|State|Published - 2021|
Bibliographical note
KAUST Repository Item: Exported on 2021-04-05
Acknowledgements: This work was supported in part by the Zhejiang Laboratory 2019 under Grant NB0AB06; and in part by the National Natural Science Foundation of China under Grant 61860206008, Grant 61773081, Grant 61933012, Grant 61833013, Grant 61991403, and Grant 61991400. This article was recommended by Associate Editor D. Zhao.