This paper studies security strategies in two-player zero-sum repeated Bayesian games with a finite horizon. In such games, each player has a private type that is drawn independently according to a publicly known prior distribution. The players' types remain fixed throughout the game, which is played for a finite number of stages. At every stage, the players simultaneously choose their actions, which are publicly observed. The one-stage payoff of player 1 (equivalently, the penalty to player 2) depends on both players' types and actions, and is not directly observed by either player. Player 1 aims to maximize the total payoff over the game, while player 2 aims to minimize it. This paper provides each player with two ways to compute a security strategy, i.e., a strategy that is optimal in the worst case. First, a security strategy that depends directly on both players' action histories is derived by refining the sequence form. Because the space of action histories grows exponentially with the time horizon, this paper further presents a security strategy that depends on a fixed-size sufficient statistic of the player. The sufficient statistic is shown to consist of the belief on one's own type, the regret on the other player's type, and the stage, and is independent of the other player's strategy.
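As an illustration of the belief component of the sufficient statistic, the following minimal sketch updates the public belief on player 1's type after one observed action. It assumes the standard Bayes rule applied to a type-dependent mixed strategy; the function name `update_belief`, the data layout, and the numbers are hypothetical and are not taken from the paper.

```python
def update_belief(belief, strategy, action):
    """Bayes update of the public belief over player 1's type (illustrative).

    belief[k]      : prior probability that player 1 has type k
    strategy[k][a] : probability that type k plays action a at this stage
    action         : index of the action that was publicly observed
    """
    # Joint probability of (type k, observed action) for every type k.
    joint = [p * strategy[k][action] for k, p in enumerate(belief)]
    total = sum(joint)  # marginal probability of the observed action
    if total == 0.0:
        # The observed action has probability zero under the strategy, so the
        # posterior is undefined; this sketch simply keeps the prior.
        return list(belief)
    return [j / total for j in joint]


# Hypothetical example: two types, two actions, uniform prior.
prior = [0.5, 0.5]
strategy = [[0.8, 0.2],   # action distribution of type 0
            [0.4, 0.6]]   # action distribution of type 1
posterior = update_belief(prior, strategy, action=0)
```

In this example, observing action 0 shifts the belief toward type 0, which plays that action more often. The update uses only the player's own strategy and the public action, which is consistent with the abstract's statement that the sufficient statistic is independent of the other player's strategy.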
|Title of host publication|2017 American Control Conference (ACC)
|Publisher|Institute of Electrical and Electronics Engineers (IEEE)
|Publication status|Published - Jul 10 2017
Bibliographical note: KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: The authors acknowledge the financial support of ARO project #W911NF-09-1-0553 and the AFOSR/MURI project #FA9550-10-1-0573.