Continual Learning (CL) is a learning paradigm in which computational systems progressively acquire multiple tasks as new data becomes available over time. An effective CL system must balance plasticity, to integrate new information, with stability, to protect previously learned knowledge. In deep neural networks, the weight updates needed to learn a new task can be large enough to disrupt earlier tasks, a phenomenon known as catastrophic forgetting [1]. It typically manifests as a rapid decline in performance on earlier tasks and, in severe cases, the complete overwriting of previously acquired knowledge by new information. Catastrophic forgetting is not unique to CL: it also affects multitask learning and supervised learning under domain shift.
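The following minimal sketch (not part of the project description; data, architecture, and hyperparameters are illustrative assumptions) shows the setting in which catastrophic forgetting arises: a single network fine-tuned sequentially on two tasks, with no replay or regularization, is evaluated on the first task again after learning the second.

```python
# Naive sequential fine-tuning on two synthetic tasks. Re-evaluating Task A after
# training on Task B typically shows the accuracy drop called catastrophic forgetting.
import torch
import torch.nn as nn

def make_task(seed: int, n: int = 512, d: int = 20):
    # Synthetic binary classification task; each seed yields a different labeling rule.
    g = torch.Generator().manual_seed(seed)
    x = torch.randn(n, d, generator=g)
    w = torch.randn(d, generator=g)
    y = (x @ w > 0).long()
    return x, y

def train(model, x, y, epochs: int = 50):
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

def accuracy(model, x, y):
    return (model(x).argmax(dim=1) == y).float().mean().item()

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))

xa, ya = make_task(seed=0)   # Task A
xb, yb = make_task(seed=1)   # Task B

train(model, xa, ya)
print("Task A accuracy after learning A:", accuracy(model, xa, ya))

train(model, xb, yb)         # continue training on Task B only (no replay, no regularization)
print("Task A accuracy after learning B:", accuracy(model, xa, ya))  # typically drops sharply
```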
The literature offers several families of approaches to the stability-plasticity dilemma, the underlying cause of catastrophic forgetting in neural networks [2]. Regularization-based methods, such as LwF [3], constrain network updates through an additional regularization term; LwF uses knowledge distillation to keep the network's responses on previous tasks consistent while new tasks are learned. Although successful in some CL scenarios, these methods perform poorly in class-incremental learning. Parameter-isolation methods, exemplified by PNNs [4], assign a distinct set of parameters to each task to minimize interference, but they scale poorly as the number of tasks grows. In contrast, rehearsal-based methods such as CLS-ER [5] and DualNet [6] explicitly store a subset of samples from previous tasks and replay them alongside the current batch, and they have proven the most effective at minimizing interference in challenging CL settings (a simplified replay update is sketched below).
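The sketch below is an illustrative assumption on our part, not code from any of the cited methods: it shows a generic rehearsal-style update in which a small fixed-capacity memory, filled by reservoir sampling, is mixed into each current batch so that gradients also reflect earlier tasks. Buffer capacity, sampling strategy, and model are placeholders.

```python
# Generic experience-replay training step with a reservoir-sampled memory buffer.
import random
import torch
import torch.nn as nn

class ReservoirBuffer:
    """Fixed-capacity memory of (input, label) pairs, filled by reservoir sampling."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data: list[tuple[torch.Tensor, torch.Tensor]] = []
        self.seen = 0

    def add(self, x: torch.Tensor, y: torch.Tensor):
        for xi, yi in zip(x, y):
            self.seen += 1
            if len(self.data) < self.capacity:
                self.data.append((xi, yi))
            else:
                j = random.randrange(self.seen)
                if j < self.capacity:
                    self.data[j] = (xi, yi)

    def sample(self, k: int):
        batch = random.sample(self.data, min(k, len(self.data)))
        xs, ys = zip(*batch)
        return torch.stack(xs), torch.stack(ys)

def rehearsal_step(model, opt, loss_fn, buffer, x_cur, y_cur, replay_size: int = 32):
    """One optimization step on the current batch plus a replayed batch from memory."""
    x, y = x_cur, y_cur
    if len(buffer.data) > 0:
        x_mem, y_mem = buffer.sample(replay_size)
        x = torch.cat([x_cur, x_mem])
        y = torch.cat([y_cur, y_mem])
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()
    buffer.add(x_cur, y_cur)  # update memory after the step

# Usage sketch with random data standing in for a stream of task batches:
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
buffer = ReservoirBuffer(capacity=200)
for _ in range(10):
    xb, yb = torch.randn(32, 20), torch.randint(0, 2, (32,))
    rehearsal_step(model, opt, loss_fn=nn.CrossEntropyLoss(), buffer=buffer, x_cur=xb, y_cur=yb)
```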
Among the approaches mentioned above, those relying on model surrogates have been successful at retaining knowledge from previous tasks, yet they do not align with the mechanisms employed by the human brain. This project aims to push the boundaries by pioneering true continual learning systems that do not rely on task boundaries, rehearsal of past experiences, or model surrogates, while remaining robust and scalable. Within fixed-capacity models, our research will explore novel strategies for preserving both previous and current task information without succumbing to catastrophic forgetting. To this end, we will conduct an in-depth exploration of CNN and vision transformer architectures and devise new methodologies, thereby contributing to the advancement of continual learning.
If this project interests you, please get in touch with Prashant Bhat (p.s.bhat@tue.nl).
References:
[1] Michael McCloskey and Neal J. Cohen. Catastrophic interference in connectionist networks: The sequential learning problem. In Psychology of Learning and Motivation, volume 24, pp. 109–165. Elsevier, 1989.
[2] German I. Parisi, Ronald Kemker, Jose L. Part, Christopher Kanan, and Stefan Wermter. Continual lifelong learning with neural networks: A review. Neural Networks, 113:54–71, 2019.
[3] Zhizhong Li and Derek Hoiem. Learning without forgetting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(12):2935–2947, 2017.
[4] Andrei A. Rusu, Neil C. Rabinowitz, Guillaume Desjardins, Hubert Soyer, James Kirkpatrick, Koray Kavukcuoglu, Razvan Pascanu, and Raia Hadsell. Progressive neural networks. arXiv preprint arXiv:1606.04671, 2016.
[5] Elahe Arani, Fahad Sarfraz, and Bahram Zonooz. Learning fast, learning slow: A general continual learning method based on a complementary learning system. In International Conference on Learning Representations, 2022.
[6] Quang Pham, Chenghao Liu, and Steven Hoi. DualNet: Continual learning, fast and slow. In Advances in Neural Information Processing Systems, 34:16131–16144, 2021.