The objective of this work is to provide a qualitative description of the transient properties of stochastic learning dynamics such as adaptive play, log-linear learning, and Metropolis learning. The solution concept used for these learning dynamics in potential games is stochastic stability, which is based on the stationary distribution of the reversible Markov chain representing the learning process. However, the time to converge to a stochastically stable state is exponential in the inverse of the noise, which limits the usefulness of stochastic stability as a solution concept for these dynamics. We propose a complete solution concept that qualitatively describes the state of the system at all times. The idea is borrowed from the control systems literature, where the solution of a linear or nonlinear system has two parts: the transient response and the steady-state response. Stochastic stability provides the steady-state response of stochastic learning rules. In this work, we study their transient response. Starting from an initial condition, we identify subsets of the state space, called cycles, that have small hitting times and long exit times. Over long time scales, we describe how the distribution over joint action profiles transitions from one cycle to another until it reaches the globally optimal state.
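The gap between steady-state and transient behavior can be illustrated with a small numerical sketch. The example below is not taken from this work: the 2x2 potential `PHI`, the noise value, and all function names are hypothetical choices made for illustration. It builds the transition matrix of log-linear learning for a two-player potential game (one uniformly chosen player revises per step), checks that the Gibbs distribution proportional to exp(potential / noise) is stationary, and shows that a chain started at the suboptimal equilibrium is still concentrated there after 100 steps.

```python
import itertools
import math

# Hypothetical potential of a 2x2 coordination game (illustrative values only).
# (1, 1) is the global maximizer; (0, 0) is a suboptimal local maximizer.
PHI = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 2.0}
STATES = list(itertools.product((0, 1), repeat=2))


def gibbs(tau):
    """Gibbs distribution over joint actions, proportional to exp(PHI / tau)."""
    weights = [math.exp(PHI[s] / tau) for s in STATES]
    total = sum(weights)
    return [w / total for w in weights]


def transition_matrix(tau):
    """Log-linear learning: a uniformly chosen player resamples its action
    with probability proportional to exp(PHI / tau), the other action fixed."""
    n = len(STATES)
    P = [[0.0] * n for _ in range(n)]
    for i, s in enumerate(STATES):
        for player in (0, 1):
            # Joint actions reachable when only `player` revises.
            candidates = []
            for action in (0, 1):
                t = list(s)
                t[player] = action
                candidates.append(tuple(t))
            z = sum(math.exp(PHI[c] / tau) for c in candidates)
            for c in candidates:
                P[i][STATES.index(c)] += 0.5 * math.exp(PHI[c] / tau) / z
    return P


def step(dist, P):
    """One step of the chain applied to a row distribution: dist @ P."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def evolve(dist, P, steps):
    """Distribution over joint actions after `steps` revisions."""
    for _ in range(steps):
        dist = step(dist, P)
    return dist


if __name__ == "__main__":
    tau = 0.1  # assumed noise level
    P = transition_matrix(tau)
    print("stationary:", gibbs(tau))
    print("after 100 steps from (0, 0):", evolve([1.0, 0.0, 0.0, 0.0], P, 100))
```

At this noise level the stationary distribution puts almost all of its mass on (1, 1), while the distribution after 100 steps is still concentrated on (0, 0). In this toy game the suboptimal equilibrium plays the role of a set that is entered quickly and exited slowly.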
Bibliographical note: KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: Research supported by funding from KAUST.