Abstract
Potential-based reward shaping can significantly improve the time needed
to learn an optimal policy and, in multi-agent systems, the performance
of the final joint-policy. It has been proven to not alter the optimal
policy of an agent learning alone or the Nash equilibria of multiple
agents learning together.
However, a limitation of existing proofs is the assumption that the
potential of a state does not change dynamically during learning. This
assumption is often broken, especially if the reward-shaping function is
generated automatically.
In this paper, we prove and demonstrate a method of extending
potential-based reward shaping to allow dynamic shaping while maintaining
the guarantees of policy invariance in the single-agent case and
consistent Nash equilibria in the multi-agent case.
Original language | English |
---|---|
Title of host publication | Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems |
Publisher | IFAAMAS |
Pages | 433-440 |
Number of pages | 8 |
ISBN (Print) | 978-0-9817381-3-0 |
Publication status | Published - Jun 2012 |
Event | 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2012) - Valencia, Spain. Duration: 4 Jun 2012 → 8 Jun 2012 |
Conference
Conference | 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2012) |
---|---|
Country/Territory | Spain |
City | Valencia |
Period | 4/06/12 → 8/06/12 |