Theoretical Considerations of Potential-Based Reward Shaping for Multi-Agent Systems

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Potential-based reward shaping has previously been proven
to both be equivalent to Q-table initialisation and guarantee
policy invariance in single-agent reinforcement learning.
The method has since been used in multi-agent reinforcement
learning without consideration of whether the theoretical
equivalence and guarantees hold. This paper extends
the existing proofs to similar results in multi-agent systems,
providing the theoretical background to explain the success
of previous empirical studies. Specifically, it is proven
that the equivalence to Q-table initialisation remains and
the Nash equilibria of the underlying stochastic game are
not modified. Furthermore, we demonstrate empirically that
potential-based reward shaping affects exploration and,
consequently, can alter the joint policy converged upon.
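
For context, a brief sketch using the standard single-agent definitions from the reward-shaping literature (Ng et al., 1999; Wiewiora, 2003); this notation is assumed for illustration and is not taken from this record. Potential-based reward shaping augments the environment reward r(s, a, s') with a shaping term

    F(s, s') = γΦ(s') - Φ(s)

where Φ is a potential function over states and γ is the discount factor. The equivalence to Q-table initialisation referenced in the abstract is the result that an agent learning with F from initial values Q₀ behaves identically to an unshaped agent initialised with

    Q'₀(s, a) = Q₀(s, a) + Φ(s).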
Original language: English
Title of host publication: The 10th International Conference on Autonomous Agents and Multiagent Systems
Publisher: ACM
Pages: 225-232
ISBN (Electronic): 0-9826571-5-3
ISBN (Print): 978-0-9826571-5-7
Publication status: Published - May 2011
Event: Tenth International Conference on Autonomous Agents and Multi-Agent Systems - Taipei, Taiwan
Duration: 2 May 2011 → …

Conference

Conference: Tenth International Conference on Autonomous Agents and Multi-Agent Systems
Country/Territory: Taiwan
City: Taipei
Period: 2/05/11 → …