Dynamic Potential-Based Reward Shaping

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Publication details

Title of host publication: Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems
Date: Published - Jun 2012
Pages: 433-440
Number of pages: 8
Publisher: IFAAMAS
Original language: English
ISBN (Print): 978-0-9817381-3-0

Abstract

Potential-based reward shaping can significantly improve the time needed to learn an optimal policy and, in multi-agent systems, the performance of the final joint policy. It has been proven not to alter the optimal policy of an agent learning alone, or the Nash equilibria of multiple agents learning together.

However, a limitation of existing proofs is the assumption that the potential of a state does not change dynamically during learning. This assumption is often broken, especially if the reward-shaping function is generated automatically.

In this paper we prove and demonstrate a method of extending potential-based reward shaping to allow dynamic shaping whilst maintaining the guarantees of policy invariance in the single-agent case and consistent Nash equilibria in the multi-agent case.
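The dynamic extension described above evaluates the potential of each state at the time it is visited, giving a shaping reward of the form F = γΦ(s', t') − Φ(s, t) in place of the static F = γΦ(s') − Φ(s). The following Python sketch illustrates the idea with Q-learning on a toy chain MDP; the environment, the annealed distance-to-goal potential, and all hyperparameters are illustrative assumptions for this example, not details taken from the paper.

```python
import random


def shaping_reward(phi_next, phi_curr, gamma):
    # Dynamic potential-based shaping: F = gamma * Phi(s', t') - Phi(s, t),
    # where each potential is evaluated at the time its state is visited.
    return gamma * phi_next - phi_curr


# Toy 5-state chain MDP: actions move left/right, reward 1 on reaching the goal.
N_STATES, GOAL, GAMMA, ALPHA, EPSILON = 5, 4, 0.9, 0.5, 0.1


def potential(state, step):
    # Hypothetical dynamic potential: a distance-to-goal heuristic that is
    # annealed in over the first 100 steps, so Phi changes during learning.
    return (state / GOAL) * min(1.0, step / 100.0)


def q_learning(episodes=200, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]  # actions: 0 = left, 1 = right
    step = 0
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            # Epsilon-greedy action selection.
            if rng.random() < EPSILON:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda x: Q[s][x])
            s2 = max(0, min(GOAL, s + (1 if a == 1 else -1)))
            r = 1.0 if s2 == GOAL else 0.0
            # Add the dynamic shaping reward to the environment reward.
            f = shaping_reward(potential(s2, step + 1), potential(s, step), GAMMA)
            Q[s][a] += ALPHA * (r + f + GAMMA * max(Q[s2]) - Q[s][a])
            s, step = s2, step + 1
    return Q
```

Even though the potential changes while the agent learns, the greedy policy recovered from the shaped Q-values still moves right toward the goal in every state, consistent with the policy-invariance guarantee the paper extends to the dynamic case.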
