Evolving Ensemble Model based on Hilbert Schmidt Independence Criterion for task-free continual learning

Research output: Contribution to journal › Article › peer-review

Abstract

Continual Learning (CL) aims to extend the ability of deep learning models to continuously acquire new knowledge without forgetting. However, most CL studies assume that task identities and boundaries are known, which is unrealistic in practice. In this work, we address a more challenging and realistic CL setting, Task-Free Continual Learning (TFCL), in which an ensemble of experts is trained on non-stationary data streams without any task labels. To deal with TFCL, we introduce the Evolving Ensemble Model (EEM), which dynamically adds new experts to a mixture of experts in order to adapt to changing data distributions while continuously learning new data sets. To keep the EEM network architecture compact during training, we propose a novel expansion mechanism that uses the Hilbert-Schmidt Independence Criterion (HSIC) to evaluate, in feature space, the statistical consistency between the knowledge learned by each expert and the given data. This expansion mechanism does not require storing all previous samples and is more efficient because it performs statistical evaluations in the low-dimensional feature space inferred by a deep network. We also propose a new dropout mechanism that selectively removes unimportant samples from the memory buffer, which stores the continuously incoming data before it is used for training. The proposed dropout mechanism ensures the diversity of the information learnt by the experts of our model. We perform extensive TFCL tests which show that the proposed approach achieves state-of-the-art performance. The source code is available at https://github.com/dtuzi123/HSCI-DEM.
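
For readers unfamiliar with HSIC, the following is a minimal sketch of the standard biased empirical estimator, HSIC = tr(KHLH) / (n - 1)^2, computed on two sets of paired feature vectors. It is not the authors' implementation (see the linked repository for that). The Gaussian kernel, the bandwidth values, and the function names `rbf_kernel` and `hsic` are assumptions made for illustration only, and how EEM turns such a score into an expansion decision is not specified in the abstract.

```python
import numpy as np


def rbf_kernel(x, sigma=1.0):
    """Gaussian (RBF) Gram matrix for the rows of x. Kernel choice is an assumption."""
    sq_norms = np.sum(x ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped to avoid tiny negatives.
    d2 = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T), 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def hsic(x, y, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC between paired samples x (n, dx) and y (n, dy)."""
    n = x.shape[0]
    K = rbf_kernel(x, sigma_x)
    L = rbf_kernel(y, sigma_y)
    # Centering matrix H = I - (1/n) * 11^T.
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(200, 2))                      # hypothetical low-dimensional features
    dependent = feats + 0.1 * rng.normal(size=(200, 2))    # strongly related to feats
    independent = rng.normal(size=(200, 2))                # unrelated to feats
    print("HSIC (dependent):  ", hsic(feats, dependent))
    print("HSIC (independent):", hsic(feats, independent))
```

A larger value indicates stronger statistical dependence between the two sets of features; values near zero are consistent with independence. Forming the n × n Gram matrices costs O(n²) memory, which is one reason to evaluate the criterion on small batches of low-dimensional features rather than on raw inputs.
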
Original language: English
Article number: 129370
Number of pages: 15
Journal: Neurocomputing
Volume: 624
Issue number: 10
Early online date: 27 Jan 2025
DOIs
Publication status: Published - 1 Apr 2025

Bibliographical note

This is an author-produced version of the published paper. Uploaded in accordance with the University’s Research Publications and Open Access policy.
