
Project: Diversifying attention through randomization in sparse neural networks training

Description

Context of the work:

Deep Learning (DL) is a central area of machine learning today and has proven to be a successful tool across all machine learning paradigms, i.e., supervised learning, unsupervised learning, and reinforcement learning. Still, the scalability of DL models is limited by the many redundant connections in densely connected artificial neural networks. Stemming from our previous work [1], the emerging field of sparse training suggests that training sparse neural networks directly from scratch can lead to better performance than dense training, while reducing computational costs [2] and, implicitly, energy consumption [5] and environmental impact.
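
As background, the Python sketch below shows one common way to initialize a sparse layer topology: an Erdős–Rényi-style random mask in the spirit of [1]. The function name and the epsilon hyper-parameter are illustrative assumptions, not part of the references.

import numpy as np

def erdos_renyi_mask(n_in, n_out, epsilon=10, rng=None):
    """Random sparse connectivity mask in the spirit of SET [1].

    The expected density scales as epsilon * (n_in + n_out) / (n_in * n_out),
    so larger layers end up proportionally sparser. Illustrative sketch only.
    """
    rng = np.random.default_rng() if rng is None else rng
    density = min(1.0, epsilon * (n_in + n_out) / (n_in * n_out))
    return (rng.random((n_in, n_out)) < density).astype(np.float32)

# Example: a 784-to-300 layer keeps only a few percent of its weights.
mask = erdos_renyi_mask(784, 300)
print(f"density = {mask.mean():.4f}")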

Short description of the assignment:

As a new field, sparse training has many open questions. The goal of this assignment is to study whether random processes, such as those used in [3], can improve the drop and grow phases of sparse training from [4], so that the latter learns very well and very quickly in typical supervised learning settings.
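
For concreteness, a minimal Python sketch of a drop-and-grow topology update, roughly in the spirit of [1], is given below. The function name, argument names, and the uniform random regrowth rule are illustrative assumptions, not the exact procedures studied in [3] or [4]; the project would investigate replacing or biasing this regrowth randomness.

import numpy as np

def drop_and_grow(weights, mask, drop_fraction=0.3, rng=None):
    """One drop-and-grow step: remove the weakest active connections,
    then add the same number of new ones at random inactive positions.
    Illustrative sketch only."""
    rng = np.random.default_rng() if rng is None else rng
    active = np.flatnonzero(mask)
    n_drop = int(drop_fraction * active.size)
    if n_drop == 0:
        return weights, mask

    # Drop: deactivate the active connections with the smallest |weight|.
    order = np.argsort(np.abs(weights.flat[active]))
    dropped = active[order[:n_drop]]
    mask.flat[dropped] = 0
    weights.flat[dropped] = 0.0

    # Grow: re-activate the same number of inactive positions, chosen
    # uniformly at random; the randomization studied in this project
    # would replace or bias this choice (e.g., following [3]).
    inactive = np.flatnonzero(mask == 0)
    grown = rng.choice(inactive, size=n_drop, replace=False)
    mask.flat[grown] = 1
    weights.flat[grown] = rng.normal(0.0, 0.01, size=n_drop)
    return weights, mask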

Possible expected outcomes:

Algorithmic novelty, open-source software, and publishable results.

Requirements:

Basic Calculus and Optimization

Very good programming skills

Good understanding of artificial neural networks

Learning Objectives:

Upon successful completion of this project, the student will have learnt:

How to address a basic research question

Fundamental concepts behind sparse neural networks

Practical skills to implement artificial neural networks and to create an open-source software product 

Examples of previous MSc theses on the topic:

https://people.utwente.nl/d.c.mocanu?tab=education 

References: 

[1] D.C. Mocanu, E. Mocanu, P. Stone, P.H. Nguyen, M. Gibescu, A. Liotta: “Scalable Training of Artificial Neural Networks with Adaptive Sparse Connectivity inspired by Network Science”, Nature Communications, 2018, https://arxiv.org/abs/1707.04780  

[2] D.C. Mocanu, E. Mocanu, T. Pinto, S. Curci, P.H. Nguyen, M. Gibescu, D. Ernst, Z.A. Vale: “Sparse Training Theory for Scalable and Efficient Agents”, AAMAS 2021, https://arxiv.org/abs/2103.01636 

[3] Z. Atashgahi, J. Pieterse, S. Liu, D.C. Mocanu, R. Veldhuis, M. Pechenizkiy: “A Brain-inspired Algorithm for Training Highly Sparse Neural Networks”, Machine Learning (ECMLPKDD 2022 journal track), https://arxiv.org/abs/1903.07138

[4] G. Sokar, Z. Atashgahi, M. Pechenizkiy, D.C. Mocanu: “Where to Pay Attention in Sparse Training for Feature Selection?”, NeurIPS 2022, the camera-ready version can be provided by email upon request.

[5] Z. Atashgahi, G. Sokar, T. van der Lee, E. Mocanu, D.C. Mocanu, R. Veldhuis, M. Pechenizkiy: “Quick and Robust Feature Selection: the Strength of Energy-efficient Sparse Training for Autoencoders”, Machine Learning (ECMLPKDD 2022 journal track), https://arxiv.org/abs/2012.00560 


Details
Supervisor
Mykola Pechenizkiy
Secondary supervisor
Ghada Sokar