Abstract
Reward shaping has been shown to significantly improve an agent's performance in reinforcement learning. As attention shifts from tabula-rasa approaches to methods where heuristic domain knowledge can be given to agents, an important problem arises: how can agents deal with erroneous knowledge, and what is the impact on their behavior, both in a single-agent setting and in a multi-agent setting where agents face conflicting goals? Previous research demonstrated the use of plan-based reward shaping with knowledge revision in a single-agent scenario, where agents were shown to quickly identify and revise erroneous knowledge and thus benefit from more accurate plans. In a multi-agent setting, the use of individual plans as a source of reward shaping has been less successful because of the agents' conflicting goals. In this paper we present the use of MDPs as a method for providing heuristic knowledge, coupled with a revision algorithm that manages the cases where the provided domain knowledge is wrong. We show how agents can deal with erroneous knowledge in the single-agent case and how this method can be used in a multi-agent environment for conflict resolution.
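For readers unfamiliar with the mechanics the abstract relies on, the sketch below illustrates potential-based reward shaping, the standard form (Ng et al., 1999) that plan-based approaches build on. It is a minimal illustration only: the plan, state names, and potential function are hypothetical stand-ins, not the paper's actual MDP-based scheme or its revision algorithm.

```python
# Minimal sketch of potential-based reward shaping: the shaping term
# F(s, s') = gamma * Phi(s') - Phi(s) is added to the environment reward.
# The plan and states below are illustrative assumptions, not from the paper.

GAMMA = 0.99  # discount factor

# Hypothetical heuristic knowledge: a plan assigns each state a step index,
# so states further along the plan receive higher potential.
plan_step = {"start": 0, "corridor": 1, "door": 2, "goal": 3}

def potential(state):
    """Phi(s): heuristic value of a state, here its position in the plan.
    States absent from the plan default to 0. A wrong plan only biases
    exploration; potential-based shaping cannot change the optimal policy."""
    return plan_step.get(state, 0)

def shaped_reward(env_reward, state, next_state):
    """Return r + F(s, s') with F(s, s') = gamma * Phi(s') - Phi(s)."""
    return env_reward + GAMMA * potential(next_state) - potential(state)

# Moving along the plan earns a shaping bonus; moving against it, a penalty.
print(shaped_reward(0.0, "corridor", "door"))  # ≈ +0.98
print(shaped_reward(0.0, "door", "corridor"))  # ≈ -1.01
```

The policy-invariance guarantee is what makes knowledge revision attractive: an erroneous potential function slows learning rather than corrupting the learned policy, so detecting and revising the underlying knowledge restores the speed-up without risking correctness.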
| Original language | English |
| --- | --- |
| Title of host publication | 13th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2014 |
| Publisher | International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS) |
| Pages | 1535-1536 |
| Number of pages | 2 |
| Volume | 2 |
| ISBN (Electronic) | 9781634391313 |
| Publication status | Published - 2014 |
| Event | 13th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2014, Paris, France. Duration: 5 May 2014 → 9 May 2014 |
Conference
| Conference | 13th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2014 |
| --- | --- |
| Country/Territory | France |
| City | Paris |
| Period | 5/05/14 → 9/05/14 |
Keywords
- Knowledge revision
- Reinforcement learning
- Reward shaping