Reinforcement Learning in RoboCup KeepAway with Partial Observability

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Partially observable environments pose a major challenge to the application of reinforcement learning algorithms. In such environments, the system state representation frequently violates the Markov property, so situations can occur where an agent has insufficient information to decide on the optimal action. In such cases, it is necessary to determine when information-gathering actions should be executed, that is, when the agent needs to reduce uncertainty about the current state before deciding how to act. One solution proposed in past research is to manually code rules for executing information-gathering actions into the policy using heuristic (and likely faulty) knowledge. However, such a solution requires explicit expert knowledge of which actions gather information.

In this paper, a flexible solution is proposed that automatically learns when to execute information-gathering actions and, furthermore, automatically discovers which actions gather information. We present an evaluation in the RoboCup KeepAway domain that empirically shows the robustness of the proposed approach and its success in learning under varying degrees of partial observability. Hence, the approach eliminates the need for hand-coded rules, is flexible across different situations, and does not require knowledge about information-gathering actions.

Original language: English
Title of host publication: 2009 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCES ON WEB INTELLIGENCE (WI) AND INTELLIGENT AGENT TECHNOLOGIES (IAT), VOL 2
Editors: R Baeza-Yates, B Berendt, E Bertino, EP Lim, G Pasi
Place of Publication: LOS ALAMITOS
Publisher: IEEE Computer Society
Pages: 201-208
Number of pages: 8
Volume: 2
ISBN (Print): 978-1-4244-5331-3
Publication status: Published - 2009
Event: IEEE/WIC/ACM International Conferences on Web Intelligence (WI)/Intelligent Agent Technologies (IAT) - Milan
Duration: 15 Sept 2009 to 18 Sept 2009

Conference

Conference: IEEE/WIC/ACM International Conferences on Web Intelligence (WI)/Intelligent Agent Technologies (IAT)
City: Milan
Period: 15/09/09 to 18/09/09

Keywords

  • Belief state
  • KeepAway
  • partial observability
  • POMDP
  • reinforcement learning
