Learning Cooperative Behaviours in Adversarial Multi-agent Systems

Ni Wang*, Gautham Das, Alan Gregory Millard

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This work extends an existing virtual multi-agent platform called RoboSumo to create TripleSumo, a platform for investigating multi-agent cooperative behaviors in continuous action spaces, with physical contact in an adversarial environment. In this paper we investigate a scenario in which two agents, namely ‘Bug’ and ‘Ant’, must team up and push another agent ‘Spider’ out of the arena. To achieve this goal, the newly added agent ‘Bug’ is trained during an ongoing match between ‘Ant’ and ‘Spider’. ‘Bug’ must develop awareness of the other agents’ actions, infer the strategy of both sides, and eventually learn an action policy to cooperate. The reinforcement learning algorithm Deep Deterministic Policy Gradient (DDPG) is implemented with a hybrid reward structure combining dense and sparse rewards. The cooperative behavior is quantitatively evaluated by the mean probability of winning the match and the mean number of steps needed to win.
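
The hybrid reward and the two evaluation measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the reward terms or their weights, so the distance-based dense term, the `win_bonus` value, and the names `Episode`, `hybrid_reward` and `evaluate` are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional, Sequence, Tuple


@dataclass
class Episode:
    won: bool   # True if the team pushed the opponent out of the arena
    steps: int  # number of simulation steps the episode lasted


def hybrid_reward(dist_before: float, dist_after: float, won: bool, done: bool,
                  dense_scale: float = 1.0, win_bonus: float = 100.0) -> float:
    """Dense per-step term plus a sparse terminal bonus.

    Assumption: the dense term rewards an increase in the opponent's
    distance from the arena centre; the paper may use different terms.
    """
    dense = dense_scale * (dist_after - dist_before)
    sparse = win_bonus if (done and won) else 0.0
    return dense + sparse


def evaluate(episodes: Sequence[Episode]) -> Tuple[float, Optional[float]]:
    """Return (empirical win rate, mean steps over won episodes only)."""
    if not episodes:
        raise ValueError("at least one episode is required")
    winning_steps = [e.steps for e in episodes if e.won]
    win_rate = len(winning_steps) / len(episodes)
    # Mean steps to win is undefined when no episode was won.
    mean_steps = mean(winning_steps) if winning_steps else None
    return win_rate, mean_steps
```

In this sketch the mean number of steps is averaged over won episodes only, which is one plausible reading of "steps needed to win"; the win rate over a batch of evaluation episodes serves as an empirical estimate of the probability of winning.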
Original language: English
Title of host publication: Towards Autonomous Robotic Systems
Subtitle of host publication: 23rd Annual Conference, TAROS 2022, Culham, UK, September 7–9, 2022, Proceedings
Publisher: Springer
Pages: 179-189
Number of pages: 11
ISBN (Electronic): 9783031159084
ISBN (Print): 9783031159077
Publication status: Published - 1 Sept 2022
Event: 23rd Annual Conference Towards Autonomous Robotic Systems, TAROS 2022 - Culham, United Kingdom
Duration: 7 Sept 2022 to 9 Sept 2022

Publication series

Name: Lecture Notes in Computer Science (LNCS)
Publisher: Springer
Volume: 13546
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 23rd Annual Conference Towards Autonomous Robotic Systems, TAROS 2022
Country/Territory: United Kingdom
City: Culham
Period: 7/09/22 to 9/09/22
