Dynamic Scalable Self-Attention Ensemble for Task-Free Continual Learning

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Continual learning poses a challenge for modern deep neural networks due to catastrophic forgetting, which occurs when network parameters are adapted to new tasks. In this paper, we address a more challenging learning paradigm called Task-Free Continual Learning (TFCL), in which task information is unavailable during training. To deal with this problem, we introduce the Dynamic Scalable Self-Attention Ensemble (DSSAE) model, which dynamically adds new Vision Transformer (ViT)-based experts to handle data distribution shifts during training. To avoid frequent expansions and ensure an appropriate number of experts, we propose a new dynamic expansion mechanism that evaluates the novelty of incoming samples as an expansion signal. Furthermore, the proposed expansion mechanism does not require task information or class labels, so it can be used in a realistic learning environment. Empirical results demonstrate that the proposed DSSAE achieves state-of-the-art performance in a series of TFCL experiments.
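
The sketch below illustrates, under simplifying assumptions, the kind of novelty-driven expansion loop the abstract describes: a pool of experts grows only when the incoming batch looks sufficiently unlike the data the current expert has seen. It is not the DSSAE implementation; the paper uses ViT-based experts and its own novelty criterion, whereas here the Expert class, the prototype-distance novelty score, and the threshold value are illustrative placeholders.

```python
# Minimal sketch of novelty-driven dynamic expansion for task-free continual
# learning. Assumptions: a small MLP stands in for the ViT-based experts of the
# paper, and a prototype-distance score stands in for its novelty measure.
import torch
import torch.nn as nn


class Expert(nn.Module):
    """Stand-in expert (the paper uses Vision Transformer backbones)."""

    def __init__(self, in_dim: int = 32, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, in_dim))
        # Running mean of seen inputs, used as a crude novelty reference.
        self.register_buffer("prototype", torch.zeros(in_dim))
        self.seen = 0

    def update_prototype(self, x: torch.Tensor) -> None:
        batch_mean = x.mean(dim=0)
        self.seen += 1
        self.prototype += (batch_mean - self.prototype) / self.seen

    def novelty(self, x: torch.Tensor) -> float:
        # Distance of the incoming batch to what this expert has seen so far.
        return torch.norm(x.mean(dim=0) - self.prototype).item()


def train_task_free(stream, threshold: float = 2.0, in_dim: int = 32):
    """Task-free loop: expand when the current expert flags the batch as novel."""
    experts = [Expert(in_dim)]
    opt = torch.optim.Adam(experts[-1].parameters(), lr=1e-3)

    for batch in stream:  # no task boundaries or class labels are assumed
        current = experts[-1]
        if current.seen > 0 and current.novelty(batch) > threshold:
            # Distribution shift detected: freeze the old expert, add a new one.
            for p in current.parameters():
                p.requires_grad_(False)
            experts.append(Expert(in_dim))
            opt = torch.optim.Adam(experts[-1].parameters(), lr=1e-3)
            current = experts[-1]

        # Train the newest expert on the incoming batch (a reconstruction loss
        # is used here only as a placeholder objective).
        opt.zero_grad()
        loss = nn.functional.mse_loss(current.encoder(batch), batch)
        loss.backward()
        opt.step()
        current.update_prototype(batch.detach())

    return experts


if __name__ == "__main__":
    # Two synthetic "distributions" arriving back to back, without task labels.
    stream = [torch.randn(16, 32) for _ in range(20)]
    stream += [torch.randn(16, 32) + 5.0 for _ in range(20)]
    experts = train_task_free(stream)
    print(f"number of experts after training: {len(experts)}")
```

In this toy run the second half of the stream is shifted, so the novelty score spikes and a second expert is created, mirroring the abstract's goal of expanding only when a distribution shift is detected rather than at fixed task boundaries.
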
Original language: English
Title of host publication: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Publisher: IEEE
Number of pages: 5
DOIs
Publication status: Published - 4 Jun 2023
Event: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) - Rhodes Island, Greece
Duration: 4 Jun 2023 - 10 Jun 2023

Conference

Conference: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Abbreviated title: ICASSP 2023
Country/Territory: Greece
City: Rhodes Island
Period: 4/06/23 - 10/06/23

Bibliographical note

This is an author-produced version of the published paper. Uploaded in accordance with the University’s Research Publications and Open Access policy.
