Deep Reinforcement Learning Based Parameter Control in Differential Evolution

Mudita Sharma, Alexandros Komninos, Manuel López-Ibáñez, Dimitar Lubomirov Kazakov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Adaptive Operator Selection (AOS) is an approach that controls discrete parameters of an Evolutionary Algorithm (EA) during the run. In this paper, we propose an AOS method based on Double Deep Q-Learning (DDQN), a Deep Reinforcement Learning method, to control the mutation strategies of Differential Evolution (DE). The application of DDQN to DE requires two phases. First, a neural network is trained offline by collecting data about the DE state and the benefit (reward) of applying each mutation strategy during multiple runs of DE tackling benchmark functions. We define the DE state as the combination of 99 different features, and we analyze three alternative reward functions. Second, when DDQN is applied as a parameter controller within DE on a different test set of benchmark functions, it uses the trained neural network to predict which mutation strategy should be applied to each parent at each generation according to the DE state. Benchmark functions for training and testing are taken from the CEC2005 benchmark with dimensions 10 and 30. We compare the results of the proposed DE-DDQN algorithm to several baseline DE algorithms using no online selection, random selection, and other AOS methods, as well as to the two winners of the CEC2005 competition. The results show that DE-DDQN outperforms the non-adaptive methods on all functions in the test set, while its results are comparable to those of the two CEC2005 winners.
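The per-parent selection loop described in the abstract can be sketched as follows. This is a hedged illustration, not the authors' implementation: the real DE-DDQN uses a trained deep network over 99 state features, whereas here an untrained linear Q-function stub over two toy features (`state_features`, `q_values`, and all weights below are assumptions) stands in to show where the strategy decision enters a standard DE generation.

```python
import random

# Toy DE with per-parent mutation-strategy selection by a Q-function,
# in the spirit of DE-DDQN. The Q-function here is an untrained linear
# placeholder; the paper uses a deep network trained offline by DDQN.

STRATEGIES = ["rand/1", "rand/2", "best/1", "current-to-best/1"]

def sphere(x):
    # Simple test objective (not one of the CEC2005 functions).
    return sum(v * v for v in x)

def state_features(pop, fits, i):
    # Two toy stand-ins for the paper's 99 features: the parent's
    # normalised fitness rank and a crude population-diversity measure.
    rank = sorted(range(len(fits)), key=fits.__getitem__).index(i)
    mean = [sum(col) / len(pop) for col in zip(*pop)]
    div = sum(sum((a - b) ** 2 for a, b in zip(x, mean)) for x in pop) / len(pop)
    return [rank / len(fits), div]

def q_values(feats, weights):
    # Linear Q-function stub: one weight vector per strategy.
    return [sum(w * f for w, f in zip(ws, feats)) for ws in weights]

def mutate(pop, best, i, strategy, F=0.5):
    # Standard DE donor-vector formulas for the four strategies.
    a, b, c, d, e = random.sample([j for j in range(len(pop)) if j != i], 5)
    x, n = pop[i], len(pop[i])
    if strategy == "rand/1":
        return [pop[a][k] + F * (pop[b][k] - pop[c][k]) for k in range(n)]
    if strategy == "rand/2":
        return [pop[a][k] + F * (pop[b][k] - pop[c][k])
                + F * (pop[d][k] - pop[e][k]) for k in range(n)]
    if strategy == "best/1":
        return [best[k] + F * (pop[a][k] - pop[b][k]) for k in range(n)]
    return [x[k] + F * (best[k] - x[k])
            + F * (pop[a][k] - pop[b][k]) for k in range(n)]

def de_step(pop, fits, weights, CR=0.9):
    best = pop[min(range(len(fits)), key=fits.__getitem__)]
    for i in range(len(pop)):
        # Greedy (argmax) strategy choice from the Q-function.
        qs = q_values(state_features(pop, fits, i), weights)
        strat = STRATEGIES[max(range(len(STRATEGIES)), key=qs.__getitem__)]
        donor = mutate(pop, best, i, strat)
        jrand = random.randrange(len(pop[i]))
        trial = [donor[k] if (random.random() < CR or k == jrand) else pop[i][k]
                 for k in range(len(pop[i]))]
        f = sphere(trial)
        if f <= fits[i]:  # greedy one-to-one selection, as in standard DE
            pop[i], fits[i] = trial, f
    return pop, fits

random.seed(1)
pop = [[random.uniform(-5, 5) for _ in range(10)] for _ in range(20)]
fits = [sphere(x) for x in pop]
weights = [[random.uniform(-1, 1) for _ in range(2)] for _ in STRATEGIES]
start = min(fits)
for _ in range(50):
    pop, fits = de_step(pop, fits, weights)
```

Because DE's selection is greedy, the best fitness never worsens across generations; training the Q-function (offline, via DDQN) is what turns the argmax choice from arbitrary into informed.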
Original language: English
Title of host publication: GECCO '19
Subtitle of host publication: Proceedings of the Genetic and Evolutionary Computation Conference
ISBN (Electronic): 978-1-4503-6111-8
Publication status: Published - 13 Jul 2019

Publication series

Name: ACM Proceedings
ISSN (Electronic): 2168-4081

Bibliographical note

This is an author-produced version of the published paper. Uploaded in accordance with the publisher's self-archiving policy. Further copying may not be permitted; contact the publisher for details.
