Project: Sustainable Large Language Models
Description
Project description:
Large Language Models (LLMs) are deep-learning models that achieve state-of-the-art performance on many NLP tasks. They typically consist of billions of weights, so storing the weights in float32 (4 bytes per weight) yields models that occupy several gigabytes. Such large models cannot be easily deployed on edge devices (e.g., smartphones) because they take up too much space and consume too much energy. The problem, therefore, is to develop and/or apply techniques (e.g., quantization applied during or after training, or new architectures) that lead to sustainable LLMs, namely, power-efficient and/or lightweight LLMs.
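To make the idea concrete, here is a minimal sketch (assuming PyTorch) of per-tensor symmetric post-training weight quantization; the function names and the 8-bit setting are illustrative assumptions, not the specific methods of the papers listed below.

```python
import torch

def quantize_weights(w: torch.Tensor, num_bits: int = 8):
    """Toy post-training quantization: map a float32 weight tensor to
    signed integers plus one float scale, so storage drops from
    32 bits to roughly `num_bits` bits per weight."""
    qmax = 2 ** (num_bits - 1) - 1       # e.g. 127 for 8 bits
    scale = w.abs().max() / qmax         # one float32 scale per tensor
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax).to(torch.int8)
    return q, scale

def dequantize_weights(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover an approximate float32 weight tensor for inference."""
    return q.to(torch.float32) * scale

# Example: quantize the weights of a single (hypothetical) linear layer.
layer = torch.nn.Linear(1024, 1024, bias=False)
q, scale = quantize_weights(layer.weight.data, num_bits=8)
w_hat = dequantize_weights(q, scale)
print("max abs reconstruction error:", (layer.weight.data - w_hat).abs().max().item())
```

The papers listed below refine exactly this rounding and scaling step (e.g., adaptive rounding, quantization-aware training) to preserve accuracy at much lower bit-widths.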
In this thesis you will: (a) study techniques for formulating more sustainable LLMs, (b) formulate and code your own sustainable LLMs, and (c) design and carry out evaluations of your sustainable LLMs.
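As a rough sense of one evaluation axis in (c), a simple measure is the storage footprint of the weights as a function of bit-width; the snippet below is a back-of-the-envelope calculation, and the 7-billion-parameter count is an assumed, purely illustrative figure.

```python
# Back-of-the-envelope weight-storage footprint at different precisions
# (weights only; activations, KV cache, and metadata are ignored).
num_params = 7e9  # assumed, illustrative parameter count
for bits in (32, 16, 8, 4, 2):
    gigabytes = num_params * bits / 8 / 1e9
    print(f"{bits:>2}-bit weights: ~{gigabytes:.1f} GB")
```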
Literature (examples):
- Chee et al., “QuIP: 2-Bit Quantization of Large Language Models With Guarantees”, https://arxiv.org/abs/2307.13304
- Liu et al., “LLM-QAT: Data-Free Quantization Aware Training for Large Language Models”, https://arxiv.org/abs/2305.17888
- Bai et al., “Towards Efficient Post-training Quantization of Pre-trained Language Models”, https://arxiv.org/abs/2109.15082
- Kuzmin et al., “Pruning vs Quantization: Which is Better?”, https://arxiv.org/abs/2307.02973
- Park et al., “Quadapter: Adapter for GPT-2 Quantization”, https://arxiv.org/abs/2211.16912
- Nagel et al., “Up or Down? Adaptive Rounding for Post-Training Quantization”, https://arxiv.org/abs/2004.10568
- Tomczak, “Deep Generative Modeling”, https://link.springer.com/book/10.1007/978-3-030-93158-2
Prerequisites:
- reading and understanding scientific literature
- very good coding skills in Python using PyTorch and other ML libraries
- good knowledge of Deep Learning and the basics of Generative AI
- curious attitude, independence, out-of-the-box thinking
Details
- Student: Dalton Harmsen
- Supervisor: Jakub Tomczak