Nonconvergence to saddle boundary points under perturbed reinforcement learning

Georgios C. Chasparis*, Jeff S. Shamma, Anders Rantzer

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

For several reinforcement learning models in strategic-form games, convergence to action profiles that are not Nash equilibria may occur with positive probability under certain conditions on the payoff function. In this paper, we explore how an alternative reinforcement learning model, in which the strategy of each agent is perturbed by a strategy-dependent perturbation (or mutation) function, may exclude convergence to non-Nash pure strategy profiles. This approach extends prior analysis of reinforcement learning in games that addresses the issue of convergence to saddle boundary points. It further provides a framework under which the effect of mutations can be analyzed in the context of reinforcement learning.
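The scheme described in the abstract can be illustrated with a minimal sketch: each agent updates a mixed strategy by standard payoff-based reinforcement, but acts through a perturbed strategy that mixes in the uniform distribution with a weight that grows near pure-strategy vertices. The specific perturbation function, game, update rule, and all parameter values below are illustrative assumptions, not the paper's exact model.

```python
import random

def perturbed_strategy(x, lam=0.05):
    """Mix the learned strategy x with the uniform distribution.

    Illustrative strategy-dependent perturbation: the weight eps grows
    as x approaches a vertex of the simplex, discouraging convergence
    to pure (boundary) profiles. The paper's perturbation function is
    strategy-dependent but not necessarily this one.
    """
    n = len(x)
    dist_to_vertex = 1.0 - max(x)        # 0 at a pure-strategy vertex
    eps = lam * (1.0 - dist_to_vertex)   # larger perturbation near a vertex
    return [(1.0 - eps) * xi + eps / n for xi in x]

def choose(strategy):
    """Sample an action index from a mixed strategy."""
    r, acc = random.random(), 0.0
    for a, p in enumerate(strategy):
        acc += p
        if r < acc:
            return a
    return len(strategy) - 1

def run(payoffs, steps=5000, step_size=0.01, seed=0):
    """Two-agent perturbed reinforcement learning (illustrative).

    payoffs[a1][a2] -> (u1, u2), with payoffs assumed in [0, 1].
    Uses an Arthur/Boergers-Sarin-style update: the played action's
    probability is reinforced in proportion to the received payoff.
    """
    random.seed(seed)
    x = [[0.5, 0.5], [0.5, 0.5]]         # each agent's mixed strategy
    for _ in range(steps):
        sigma = [perturbed_strategy(xi) for xi in x]
        a = [choose(s) for s in sigma]   # actions drawn from perturbed strategies
        u = payoffs[a[0]][a[1]]
        for i in range(2):
            for k in range(2):
                target = 1.0 if k == a[i] else 0.0
                x[i][k] += step_size * u[i] * (target - x[i][k])
    return x

# A Matching Pennies-like game has no pure Nash equilibrium, so the
# perturbed dynamics should not settle on a pure action profile.
matching_pennies = [[(1, 0), (0, 1)], [(0, 1), (1, 0)]]
final = run(matching_pennies)
```

Since the update redistributes probability mass within the simplex (the per-step changes sum to zero for each agent), the strategies remain valid interior mixed strategies throughout, which is the qualitative behavior the perturbation is meant to preserve.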

Original language: English (US)
Pages (from-to): 667-699
Number of pages: 33
Journal: International Journal of Game Theory
Volume: 44
Issue number: 3
DOIs
State: Published - Aug 31 2015

Bibliographical note

Publisher Copyright:
© 2014, Springer-Verlag Berlin Heidelberg.

Keywords

  • Learning in games
  • Reinforcement learning
  • Replicator dynamics

ASJC Scopus subject areas

  • Economics and Econometrics
  • Mathematics (miscellaneous)
  • Statistics and Probability
  • Social Sciences (miscellaneous)
  • Statistics, Probability and Uncertainty
