Fast Online Q(λ)

Marco Wiering, Jürgen Schmidhuber

Research output: Contribution to journal › Article › peer-review

59 Scopus citations

Abstract

Q(λ)-learning uses TD(λ)-methods to accelerate Q-learning. The update complexity of previous online Q(λ) implementations based on lookup tables is bounded by the size of the state/action space. Our faster algorithm's update complexity is bounded by the number of actions. The method is based on the observation that Q-value updates may be postponed until they are needed.
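
The idea of postponing updates can be illustrated with a small sketch. The code below is not the authors' pseudocode; it is a minimal tabular illustration that assumes accumulating eligibility traces, a greedy value estimate V(s) = max_a Q(s, a), and arbitrary constants GAMMA, LAM and ALPHA. All class, method and variable names (NaiveQLambda, LazyQLambda, _sync, phi, big_delta, compare) are hypothetical. The naive class touches every state/action pair on each step; the lazy class keeps two global accumulators and only brings the entries for the current and next state up to date, so the work per step grows with the number of actions. The sketch omits any safeguard against phi underflowing on long runs, which a practical implementation would need.

```python
import random

GAMMA, LAM, ALPHA = 0.9, 0.8, 0.1  # illustrative constants, not from the paper


def _table(n_states, n_actions):
    return [[0.0] * n_actions for _ in range(n_states)]


class NaiveQLambda:
    """Reference version: every step updates all state/action pairs."""

    def __init__(self, n_states, n_actions):
        self.q = _table(n_states, n_actions)
        self.trace = _table(n_states, n_actions)

    def step(self, s, a, r, s_next):
        target = r + GAMMA * max(self.q[s_next])
        e_prime = target - self.q[s][a]  # error for the pair just taken
        e = target - max(self.q[s])      # error propagated along the traces
        for x, row in enumerate(self.trace):
            for u in range(len(row)):
                row[u] *= GAMMA * LAM
                self.q[x][u] += ALPHA * e * row[u]
        self.q[s][a] += ALPHA * e_prime
        self.trace[s][a] += 1.0


class LazyQLambda:
    """Postponed version: per-step work is proportional to the action count."""

    def __init__(self, n_states, n_actions):
        self.q = _table(n_states, n_actions)
        self.scaled_trace = _table(n_states, n_actions)  # trace divided by phi
        self.seen = _table(n_states, n_actions)  # big_delta at the last sync
        self.phi = 1.0        # (GAMMA * LAM) ** t
        self.big_delta = 0.0  # running sum of e_t * phi_t

    def _sync(self, s, a):
        # Apply all updates that were postponed since this entry was last read.
        pending = self.big_delta - self.seen[s][a]
        self.q[s][a] += ALPHA * pending * self.scaled_trace[s][a]
        self.seen[s][a] = self.big_delta

    def step(self, s, a, r, s_next):
        for x in (s, s_next):  # only the entries needed right now
            for u in range(len(self.q[x])):
                self._sync(x, u)
        target = r + GAMMA * max(self.q[s_next])
        e_prime = target - self.q[s][a]
        e = target - max(self.q[s])
        self.phi *= GAMMA * LAM
        self.big_delta += e * self.phi
        self._sync(s, a)  # must happen before the trace of (s, a) changes
        self.q[s][a] += ALPHA * e_prime
        self.scaled_trace[s][a] += 1.0 / self.phi

    def value(self, s, a):
        self._sync(s, a)
        return self.q[s][a]


def compare(n_states=6, n_actions=3, n_steps=30, seed=0):
    """Run both versions on the same random transitions; return the largest gap."""
    rng = random.Random(seed)
    naive = NaiveQLambda(n_states, n_actions)
    lazy = LazyQLambda(n_states, n_actions)
    s = rng.randrange(n_states)
    for _ in range(n_steps):
        a = rng.randrange(n_actions)
        s_next = rng.randrange(n_states)
        r = rng.uniform(-1.0, 1.0)
        naive.step(s, a, r, s_next)
        lazy.step(s, a, r, s_next)
        s = s_next
    return max(
        abs(naive.q[x][u] - lazy.value(x, u))
        for x in range(n_states)
        for u in range(n_actions)
    )
```

Calling compare() feeds identical random transitions to both classes and returns the largest absolute difference between their Q-tables, which should be at the level of floating-point rounding for short runs. The design choice worth noting is that each entry stores its trace divided by phi, so the trace never has to be decayed explicitly between visits.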

Original language: English (US)
Pages (from-to): 105-115
Number of pages: 11
Journal: Machine Learning
Volume: 33
Issue number: 1
State: Published - Jan 1 1998
Externally published: Yes

Bibliographical note

Generated from Scopus record by KAUST IRTS on 2022-09-14

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software