An Empirical Study of Potential-Based Reward Shaping and Advice in Complex, Multi-Agent Systems

Research output: Contribution to journal, Article, peer-reviewed

Abstract

This paper investigates the impact of reward shaping in multi-agent reinforcement learning as a way to incorporate domain knowledge about good strategies. In theory, potential-based reward shaping does not alter the Nash Equilibria of a stochastic game, only the exploration of the shaped agent. We demonstrate empirically the performance of reward shaping in two problem domains within the context of RoboCup KeepAway by designing three reward shaping schemes, encouraging specific behaviour such as keeping a minimum distance from other players on the same team and taking on specific roles. The results illustrate that reward shaping with multiple, simultaneous learning agents can reduce the time needed to learn a suitable policy and can alter the final group performance.
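
As an illustration of the technique named in the abstract, the following is a minimal sketch of potential-based reward shaping in its standard form, where the shaping term added to the environment reward is F(s, s') = γΦ(s') − Φ(s). The potential function `spread_potential`, its parameters `min_gap` and `scale`, and all other names are hypothetical and chosen only to echo the "minimum distance from other players" idea; they are not the shaping schemes used in the paper.

```python
import math


def shaping_term(phi_s, phi_next, gamma):
    """Potential-based shaping term F(s, s') = gamma * Phi(s') - Phi(s)."""
    return gamma * phi_next - phi_s


def shaped_reward(reward, phi_s, phi_next, gamma):
    """Environment reward plus the potential-based shaping term."""
    return reward + shaping_term(phi_s, phi_next, gamma)


def spread_potential(position, teammates, min_gap=5.0, scale=1.0):
    """Hypothetical potential (an assumption, not taken from the paper):
    grows with the distance to the nearest teammate, capped at min_gap."""
    nearest = min(math.dist(position, t) for t in teammates)
    return scale * min(nearest, min_gap)


def discounted_shaping_sum(potentials, gamma):
    """Discounted sum of shaping terms along a trajectory of potentials."""
    return sum(
        gamma ** t * shaping_term(potentials[t], potentials[t + 1], gamma)
        for t in range(len(potentials) - 1)
    )
```

Along any trajectory the discounted shaping terms telescope to γ^T Φ(s_T) − Φ(s_0), which depends only on the start and end states. This is the intuition behind the claim that shaping of this form changes how an agent explores rather than which policies are ultimately preferred.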
Original language: English
Pages (from-to): 251-278
Number of pages: 28
Journal: Advances in Complex Systems
Volume: 14
Issue number: 2
Publication status: Published - Apr 2011