Abstract
Adaptive Operator Selection (AOS) is an approach that controls
discrete parameters of an Evolutionary Algorithm (EA) during the
run. In this paper, we propose an AOS method based on Double
Deep Q-Learning (DDQN), a Deep Reinforcement Learning method,
to control the mutation strategies of Differential Evolution (DE).
The application of DDQN to DE requires two phases. First, a neural
network is trained offline by collecting data about the DE state
and the benefit (reward) of applying each mutation strategy during
multiple runs of DE tackling benchmark functions. We define the
DE state as the combination of 99 different features and we analyze
three alternative reward functions. Second, when DDQN is
applied as a parameter controller within DE to a different test set
of benchmark functions, DDQN uses the trained neural network to
predict which mutation strategy should be applied to each parent
at each generation according to the DE state. Benchmark functions
for training and testing are taken from the CEC2005 benchmark
with dimensions 10 and 30. We compare the results of the proposed
DE-DDQN algorithm to several baseline DE algorithms that use no
online selection, random selection, or other AOS methods, and
also to the two winners of the CEC2005 competition. The results
show that DE-DDQN outperforms the non-adaptive methods for
all functions in the test set, while its results are comparable with
those of the two CEC2005 winners.
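The abstract describes two phases: offline training of a DDQN network on state/reward data gathered from DE runs, and an online phase in which the trained network picks a mutation strategy for each parent at each generation from the current DE state. The sketch below illustrates only the online control loop under simplifying assumptions: a toy sphere objective instead of the CEC2005 functions, a 16-dimensional placeholder state instead of the paper's 99 features, four common DE mutation strategies that need not match the paper's set, and randomly initialised network weights standing in for the offline-trained model. It is an illustration of the idea, not the authors' implementation.

```python
# Sketch of DDQN-style operator selection inside DE (online phase only).
# All names, feature choices and strategies here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def sphere(x):
    """Toy objective standing in for the CEC2005 benchmark functions."""
    return float(np.sum(x ** 2))

def state_features(pop, fit, gen, max_gen, n_feat=16):
    """Placeholder state: a few population/fitness statistics padded to n_feat
    (the paper uses 99 landscape- and history-based features)."""
    f = np.array([gen / max_gen, fit.mean(), fit.std(), fit.min(),
                  fit.max() - fit.min(), pop.std()])
    return np.pad(f, (0, n_feat - f.size))

class QNetwork:
    """Tiny MLP standing in for the offline-trained DDQN network."""
    def __init__(self, n_feat, n_actions, hidden=32):
        self.w1 = rng.normal(0, 0.1, (n_feat, hidden))
        self.w2 = rng.normal(0, 0.1, (hidden, n_actions))

    def q_values(self, s):
        return np.maximum(s @ self.w1, 0.0) @ self.w2   # one Q-value per strategy

def mutate(pop, fit, i, action, F=0.5):
    """Four common DE mutation strategies, selected by the Q-network's action."""
    idx = rng.choice([j for j in range(len(pop)) if j != i], size=5, replace=False)
    a, b, c, d, e = pop[idx]
    best = pop[np.argmin(fit)]
    if action == 0:
        return a + F * (b - c)                                   # rand/1
    if action == 1:
        return a + F * (b - c) + F * (d - e)                     # rand/2
    if action == 2:
        return pop[i] + F * (best - pop[i]) + F * (a - b)        # current-to-best/1
    return pop[i] + F * (a - pop[i]) + F * (b - c)               # current-to-rand/1

def de_ddqn_like(dim=10, pop_size=20, max_gen=100, CR=0.9):
    pop = rng.uniform(-5, 5, (pop_size, dim))
    fit = np.array([sphere(x) for x in pop])
    qnet = QNetwork(n_feat=16, n_actions=4)   # would be loaded from offline training
    for gen in range(max_gen):
        s = state_features(pop, fit, gen, max_gen)
        for i in range(pop_size):
            action = int(np.argmax(qnet.q_values(s)))   # greedy strategy choice per parent
            donor = mutate(pop, fit, i, action)
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True              # keep at least one donor component
            trial = np.where(cross, donor, pop[i])
            f_trial = sphere(trial)
            if f_trial <= fit[i]:                        # greedy replacement
                pop[i], fit[i] = trial, f_trial
    return fit.min()

print(de_ddqn_like())
```

In the full method, the same loop would also compute a reward for the applied strategy; during offline training those (state, action, reward, next state) tuples are what DDQN learns from, while at test time only the greedy selection shown above remains.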
Original language | English |
---|---|
Title of host publication | GECCO '19 |
Subtitle of host publication | Proceedings of the Genetic and Evolutionary Computation Conference |
Publisher | ACM |
Pages | 709-717 |
ISBN (Electronic) | 978-1-4503-6111-8 |
DOIs | |
Publication status | Published - 13 Jul 2019 |
Publication series
Name | ACM Proceedings |
---|---|
Publisher | ACM |
ISSN (Electronic) | 2168-4081 |