In a dynamic world, deep neural networks (DNNs) must continually adapt to new data and environments. Unlike humans, who can learn continually without forgetting past knowledge, DNNs often suffer from catastrophic forgetting when trained on new data, losing previously acquired information. This limitation hinders their ability to learn and perform multiple tasks over time. Continual learning [1, 2, 3] is therefore crucial for enabling models to retain old knowledge while learning new tasks, ensuring long-term stability and adaptability. At the same time, neural networks remain heavily over-parameterized compared to the human brain, which learns efficiently with sparse, selective connectivity. This thesis therefore aims to explore how sparsity can be incorporated into continual learning to address these challenges effectively.
We will investigate a range of continual learning methods alongside different pruning strategies. While magnitude-based pruning removes the weights with the smallest magnitudes, dynamic sparse training adjusts the sparsity pattern during training, allowing the model to adapt its structure.
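As a minimal illustration, a global magnitude-pruning step might look as follows in PyTorch (the function name, the global-threshold choice, and the default sparsity level are our own illustrative assumptions, not a prescription from the literature):

```python
import torch

def magnitude_prune(model, sparsity=0.9):
    """Zero out the `sparsity` fraction of weights with the smallest
    absolute values, using a single global threshold."""
    weights = [p for name, p in model.named_parameters() if "weight" in name]
    magnitudes = torch.cat([w.detach().abs().flatten() for w in weights])
    k = max(1, int(sparsity * magnitudes.numel()))
    threshold = torch.kthvalue(magnitudes, k).values  # k-th smallest magnitude

    masks = []
    with torch.no_grad():
        for w in weights:
            mask = (w.abs() > threshold).float()  # keep only the larger weights
            w.mul_(mask)                          # prune in place
            masks.append(mask)
    return masks  # reapply these masks after each optimizer step to stay sparse
```

In a continual learning pipeline the returned masks would typically be frozen or re-estimated per task; dynamic sparse training differs precisely in that it keeps updating such masks during training.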
The lottery ticket hypothesis [4] posits that a dense network contains a sparse subnetwork which, once identified, can be retrained for new tasks. Elastic Weight Consolidation (EWC) [5] and Synaptic Intelligence (SI) [6] protect important weights by adding a regularization term to the loss function that penalizes significant changes to them.
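For concreteness, the quadratic penalty used by EWC takes the form (notation ours)

\[
\mathcal{L}(\theta) \;=\; \mathcal{L}_{\text{new}}(\theta) \;+\; \frac{\lambda}{2} \sum_i F_i \left(\theta_i - \theta_i^{*}\right)^{2},
\]

where \(\mathcal{L}_{\text{new}}\) is the loss on the current task, \(\theta^{*}\) are the parameters learned on previous tasks, \(F_i\) is the diagonal Fisher information estimating how important parameter \(i\) was for those tasks, and \(\lambda\) trades off stability against plasticity. SI follows the same template but estimates the importance weights online along the training trajectory rather than from the Fisher information.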
Sparse evolutionary training iteratively evolves a sparse connectivity structure by removing weak connections and regrowing new ones during training.
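The sketch below illustrates one such prune-and-regrow step on a single weight tensor and its binary mask (the drop fraction, the random regrowth rule, and the function name are illustrative assumptions rather than the exact schedule of any specific method):

```python
import torch

def prune_and_regrow(weight, mask, drop_fraction=0.3):
    """One topology update: drop the weakest active connections and
    regrow the same number at randomly chosen inactive positions."""
    with torch.no_grad():
        active = mask.bool()
        n_drop = int(drop_fraction * active.sum().item())
        if n_drop == 0:
            return mask

        # Drop: deactivate the n_drop active weights with the smallest magnitude.
        magnitudes = weight.abs().masked_fill(~active, float("inf"))
        drop_idx = torch.topk(magnitudes.flatten(), n_drop, largest=False).indices
        mask.view(-1)[drop_idx] = 0.0

        # Regrow: activate n_drop randomly chosen inactive positions,
        # initializing the new connections at zero.
        inactive_idx = (mask.view(-1) == 0).nonzero(as_tuple=True)[0]
        grow_idx = inactive_idx[torch.randperm(inactive_idx.numel())[:n_drop]]
        mask.view(-1)[grow_idx] = 1.0
        weight.view(-1)[grow_idx] = 0.0

        weight.mul_(mask)  # keep the weight tensor consistent with the mask
    return mask
```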
Structured sparsity instead prunes entire neurons, filters, or layers, yielding efficient and more interpretable models. We additionally investigate sharpness-aware minimization (SAM) based optimization [7] and low-rank adaptation in the continual learning (CL) setting.
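One SAM update consists of an ascent step to a nearby worst-case point followed by a descent step applied at the original weights; a minimal PyTorch sketch is shown below (the helper name, the default `rho`, and the wrapping of a standard `base_optimizer` are our own simplifications):

```python
import torch

def sam_step(model, loss_fn, x, y, base_optimizer, rho=0.05):
    """One sharpness-aware minimization update:
    (1) perturb the weights towards the worst case within an L2 ball of
        radius rho, (2) compute gradients there, (3) update the original
        weights with those gradients via the wrapped base optimizer."""
    # Gradients at the current weights.
    base_optimizer.zero_grad()
    loss_fn(model(x), y).backward()

    params = [p for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([p.grad.norm(2) for p in params]), 2)

    # Ascent step: w <- w + rho * g / ||g||.
    eps = []
    with torch.no_grad():
        for p in params:
            e = p.grad * (rho / (grad_norm + 1e-12))
            p.add_(e)
            eps.append(e)

    # Gradients at the perturbed weights.
    base_optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()

    # Restore the original weights, then take the sharpness-aware step.
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)
    base_optimizer.step()
    return loss.item()
```

Here `base_optimizer` is any standard optimizer (e.g. SGD) constructed over `model.parameters()`; in the CL setting this update can be combined with the regularization and sparsity mechanisms described above.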
By integrating these techniques, we aim to develop sparsity-driven continual learning methods that are
both efficient and resilient to forgetting, drawing inspiration from the human brain’s ability to learn
incrementally and retain essential knowledge.
Primary contact - Shruthi Gowda (s.gowda@tue.nl)