Project: Architectural Analysis of Vision Transformers in Continual Learning


Deep neural networks (DNNs) deployed in the real world are frequently exposed to non-stationary data distributions and must learn multiple tasks sequentially. This requires DNNs to acquire new knowledge while retaining previously acquired knowledge. However, continual learning, in which a network is trained on a sequence of tasks, typically results in catastrophic forgetting of previously learned information. To combat this, a variety of approaches have been proposed for convolutional neural networks (CNNs) [ASZ22, BZA22, SAZ22].

On the other hand, the recent breakthrough of vision transformers (VTs) and their compelling performance on a range of vision tasks present them as an alternative architectural paradigm. Due to their global receptive field, self-attention-based transformer architectures have a unique advantage over CNNs in terms of robustness and generalizability [JKV22]. However, VTs struggle in the low-data regime [TCD21]. Thus, various architectural modifications have been proposed to incorporate convolutional inductive biases and increase data efficiency in VTs [WXZ22, WXC21, dTL21, YCW21].

Most research has focused on developing effective training methodologies to combat catastrophic forgetting in CNNs and VTs. However, only a few works have investigated the effect of architectural choices on lifelong learning. Moreover, these studies only consider the effect of CNN architecture choices, such as batch normalization layers and network depth, on catastrophic forgetting [MCY22, PLH22]. Beyond these, VTs involve a plethora of other design choices (e.g., different types of attention, positional embedding, and token embedding). Therefore, our aim is to conduct a comprehensive study of the impact of architectural choices in VTs on various aspects of continual learning, including catastrophic forgetting, plasticity to learn new tasks, and task recency bias.
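
To make two of these aspects concrete, the following is a minimal sketch of how forgetting and plasticity could be quantified from a task-accuracy matrix. The proposal does not specify its metrics, so everything here is an assumption: the function names are hypothetical, the forgetting measure follows one commonly used definition (best earlier accuracy on a task minus its accuracy after the final task), plasticity is approximated by the accuracy on each task immediately after learning it, and the numbers in the example are made up for illustration only.

```python
from typing import List


def average_forgetting(acc: List[List[float]]) -> float:
    """Average drop in accuracy on earlier tasks after the final task.

    acc[i][j] is the accuracy on task j measured after training on
    tasks 0..i (assumed layout; entries with j > i are ignored).
    """
    num_tasks = len(acc)
    if num_tasks < 2:
        return 0.0  # nothing can be forgotten with a single task
    drops = []
    for j in range(num_tasks - 1):
        # Best accuracy on task j at any point before the final task.
        best_earlier = max(acc[i][j] for i in range(j, num_tasks - 1))
        drops.append(best_earlier - acc[num_tasks - 1][j])
    return sum(drops) / len(drops)


def average_plasticity(acc: List[List[float]]) -> float:
    """Mean accuracy on each task right after it was learned."""
    return sum(acc[i][i] for i in range(len(acc))) / len(acc)


# Illustrative (made-up) accuracies for a three-task sequence.
example_acc = [
    [0.90, 0.00, 0.00],
    [0.70, 0.80, 0.00],
    [0.50, 0.60, 0.85],
]
forgetting = average_forgetting(example_acc)
plasticity = average_plasticity(example_acc)
```

Under these assumptions, two VT variants could be compared by computing both quantities from their respective accuracy matrices on the same task sequence; a variant with lower forgetting but also lower plasticity would indicate a stability-plasticity trade-off rather than a uniform improvement.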


Elahe Arani
Secondary supervisor
Bahram Zonooz
