Abstract
This paper investigates the impact of reward shaping in multi-agent reinforcement learning as a way to incorporate domain knowledge about good strategies. In theory [2], potential-based reward shaping does not alter the Nash equilibria of a stochastic game, only the exploration of the shaped agent. We empirically demonstrate the performance of state-based and state-action-based reward shaping in RoboCup KeepAway. The results illustrate that reward shaping can alter both the learning time required to reach a stable joint policy and the final group performance, for better or worse.
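For context, a minimal sketch of the potential-based shaping the abstract refers to: a shaping term F(s, s') = γΦ(s') − Φ(s) is added to the environment reward inside a standard temporal-difference update. This is an illustration only, not code from the paper; the SARSA learner, the table sizes, and the placeholder potential `phi` are all assumptions made for the example.

```python
import numpy as np

# Illustrative dimensions and hyperparameters (not from the paper).
GAMMA, ALPHA = 0.99, 0.1
n_states, n_actions = 10, 3

phi = np.zeros(n_states)             # designer-chosen potential Phi: S -> R
Q = np.zeros((n_states, n_actions))  # tabular action-value estimates

def shaped_sarsa_update(s, a, r, s_next, a_next):
    """One SARSA update with state-based potential shaping.

    The term F(s, s') = gamma * Phi(s') - Phi(s) is added to the
    environment reward r; shaping of this form leaves the optimal
    policies (and, per [2], the Nash equilibria of the stochastic
    game) unchanged. The state-action-based variant instead uses
    F(s, a, s', a') = gamma * Phi(s', a') - Phi(s, a).
    """
    F = GAMMA * phi[s_next] - phi[s]
    td_error = (r + F) + GAMMA * Q[s_next, a_next] - Q[s, a]
    Q[s, a] += ALPHA * td_error
```

Only Φ differs between a shaped and an unshaped learner: a potential encoding a known good strategy can bias exploration toward it, while a misleading potential can bias exploration away from it, which is the "for better or worse" trade-off the experiments examine.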
| Original language | English |
|---|---|
| Title of host publication | 10th International Conference on Autonomous Agents and Multiagent Systems 2011, AAMAS 2011 |
| Publisher | International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS) |
| Pages | 1157-1158 |
| Number of pages | 2 |
| Volume | 2 |
| Publication status | Published - 2011 |
| Event | 10th International Conference on Autonomous Agents and Multiagent Systems 2011, AAMAS 2011, Taipei, Taiwan. Duration: 2 May 2011 → 6 May 2011 |
Conference
| Conference | 10th International Conference on Autonomous Agents and Multiagent Systems 2011, AAMAS 2011 |
|---|---|
| Country/Territory | Taiwan |
| City | Taipei |
| Period | 2/05/11 → 6/05/11 |
Keywords
- Multiagent Learning
- Reinforcement Learning
- Reward Shaping
- Reward Structures for Learning