TY - JOUR
T1 - Overcoming incorrect knowledge in plan-based reward shaping
AU - Efthymiadis, Kyriakos
AU - Devlin, Sam
AU - Kudenko, Daniel
PY - 2016/2/11
Y1 - 2016/2/11
N2 - Reward shaping has been shown to significantly improve an agent's performance in reinforcement learning. Plan-based reward shaping is a successful approach in which a STRIPS plan is used in order to guide the agent to the optimal behaviour. However, if the provided knowledge is wrong, it has been shown the agent will take longer to learn the optimal policy. Previously, in some cases, it was better to ignore all prior knowledge despite it only being partially incorrect. This paper introduces a novel use of knowledge revision to overcome incorrect domain knowledge when provided to an agent receiving plan-based reward shaping. Empirical results show that an agent using this method can outperform the previous agent receiving plan-based reward shaping without knowledge revision.
AB - Reward shaping has been shown to significantly improve an agent's performance in reinforcement learning. Plan-based reward shaping is a successful approach in which a STRIPS plan is used in order to guide the agent to the optimal behaviour. However, if the provided knowledge is wrong, it has been shown the agent will take longer to learn the optimal policy. Previously, in some cases, it was better to ignore all prior knowledge despite it only being partially incorrect. This paper introduces a novel use of knowledge revision to overcome incorrect domain knowledge when provided to an agent receiving plan-based reward shaping. Empirical results show that an agent using this method can outperform the previous agent receiving plan-based reward shaping without knowledge revision.
UR - http://www.scopus.com/inward/record.url?scp=84958165165&partnerID=8YFLogxK
U2 - 10.1017/S026988891500017X
DO - 10.1017/S026988891500017X
M3 - Article
AN - SCOPUS:84958165165
SN - 0269-8889
VL - 31
SP - 31
EP - 43
JO - The Knowledge Engineering Review
JF - The Knowledge Engineering Review
IS - 1
ER -