Overcoming erroneous domain knowledge in plan-based reward shaping

Kyriakos Efthymiadis, Sam Devlin, Daniel Kudenko

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Reward shaping has been shown to significantly improve an agent's performance in reinforcement learning. Plan-based reward shaping is a successful approach in which a STRIPS plan is used to guide the agent towards the optimal behaviour. However, if the provided domain knowledge is wrong, it has been shown that the agent takes longer to learn the optimal policy; in some cases it was previously better to ignore all prior knowledge, even when that knowledge was only partially erroneous. This paper introduces a novel use of knowledge revision to overcome erroneous domain knowledge provided to an agent receiving plan-based reward shaping. Empirical results show that an agent using this method can outperform an agent receiving plan-based reward shaping without knowledge revision.
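As a concrete illustration (not the authors' code), the sketch below shows one common way plan-based reward shaping is realised: a potential function that grows with the number of STRIPS plan steps already achieved in a state, with the shaping term F(s, s') = γΦ(s') − Φ(s) added to the environment reward so that shaping is potential-based and the optimal policy is preserved. All names and constants here (plan_potential, achieved, OMEGA, GAMMA) are illustrative assumptions.

    # Minimal Python sketch of plan-based potential shaping (assumed names).
    GAMMA = 0.99   # discount factor (assumed)
    OMEGA = 10.0   # scaling factor for the potential (assumed)

    def plan_potential(state, plan, achieved):
        """Potential = OMEGA * (furthest plan step satisfied in `state`).

        `plan` is a list of STRIPS sub-goals; `achieved(state, step)` is a
        hypothetical predicate test supplied by the domain encoding.
        """
        progress = 0
        for i, step in enumerate(plan, start=1):
            if achieved(state, step):
                progress = i
            else:
                break
        return OMEGA * progress

    def shaped_reward(env_reward, state, next_state, plan, achieved):
        """Augment the environment reward with the potential difference."""
        f = GAMMA * plan_potential(next_state, plan, achieved) \
            - plan_potential(state, plan, achieved)
        return env_reward + f

    # Toy usage with set-of-facts states and a two-step plan:
    plan = [{"has_key"}, {"has_key", "door_open"}]
    achieved = lambda state, step: step <= state   # subset test on fact sets
    s, s2 = {"has_key"}, {"has_key", "door_open"}
    print(shaped_reward(0.0, s, s2, plan, achieved))  # 0.99 * 20 - 10 = 9.8

Under this sketch, erroneous domain knowledge corresponds to a wrong `plan`, which misdirects the potential; the paper's contribution is to revise that knowledge during learning rather than discard it.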

Original language: English
Title of host publication: 12th International Conference on Autonomous Agents and Multiagent Systems 2013, AAMAS 2013
Publisher: International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
Pages: 1245-1246
Number of pages: 2
Volume: 2
Publication status: Published - 2013
Event: 12th International Conference on Autonomous Agents and Multiagent Systems 2013, AAMAS 2013 - Saint Paul, MN, United States
Duration: 6 May 2013 – 10 May 2013

Conference

Conference: 12th International Conference on Autonomous Agents and Multiagent Systems 2013, AAMAS 2013
Country/Territory: United States
City: Saint Paul, MN
Period: 6/05/13 – 10/05/13

Keywords

  • Knowledge Revision
  • Reinforcement Learning
  • Reward Shaping
