back to list

Project: Efficient Unbiased Training of Large-scale Distributed RL


It is widely known that training deep neural networks on huge datasets improves learning. However, huge datasets and deep neural networks can no longer be trained on a single machine. One common solution is to train using distributed systems. In addition to traditional data-centers, in federated learning,  multiple clients, e.g., a few hospitals and thousands of cellphones learn a model without sharing local data to prevent the potential privacy issues.

Several methods have been proposed to accelerate training for classical empirical risk minimization (ERM) in supervised learning and beyond such as gradient (or model update) compression, gradient sparsification, weight quantization/sparsification, and reducing the frequency of communication though multiple local updates. Unbiased vector quantization is in particular an interesting compression method due to both enjoying strong theoretical guarantees along with providing communication efficiency  on the fly, i.e., it converges under the same hyperparameteres tuned for vanilla uncompressed SGD while providing substantial savings in terms of communication costs [1-4].

In this project, we investigate how to accelerate training deep neural networks in distributed reinforcement learning (DRL) [5-10]. In particular, our goal is to show: 1) How can we modify adaptive variants of unbiased quantization schemes tailored to general DRL problems; 2) Can we achieve optimal rate of convergence while establishing strong guarantees on the number of communication bits? 3) Do our new methods show strong empirical performance on deep neural networks and huge datasets, both in terms of performance measures and scalability?

This project is available for a master student with a strong background in reinforcement learning. Students should be familiar with reinforcement learning, PyTorch. Familiarity with distributed optimization, MPI, and CUDA is a plus.

[1] Dan Alistarh, Demjan Grubic, Jerry Z. Li, Ryota Tomioka, and Milan Vojnovic. QSGD: Communication-efficient SGD via gradient quantization and encoding. In Proc. NeurIPS, 2017.

[2] Fartash Faghri, Iman Tabrizian, Ilia Markov, Dan Alistarh, Daniel M. Roy, and Ali  Ramezani-Kebrya. Adaptive gradient quantization for data-parallel SGD. In Proc. NeurIPS, 2020.

[3] Ali Ramezani-Kebrya, Fartash Faghri, Ilya Markov, Vitalii Aksenov, Dan Alistarh, and Daniel M. Roy. NUQSGD: Provably communication-efficient data-parallel SGD via nonuniform quantization. JMLR, 22(114):1–43, 2021.

[4] Ali Ramezani-Kebrya, Kimon Antonakopoulos, Igor Krawczuk, Justin Deschenaux, and Volkan Cevher, Distributed Extra-gradient with Optimal Complexity and Communication Guarantees, to appear at ICLR 2023.

[5] Drew Bagnell and Andrew Ng. On local rewards and scaling distributed reinforcement learning. In Proc. NeurIPS, 2005.

[6] Eric Liang, Richard Liaw, Robert Nishihara, Philipp Moritz, Roy Fox, Ken Goldberg, Joseph Gonzalez, Michael Jordan, Ion Stoica. RLlib: Abstractions for distributed reinforcement learning. In Proc. ICML, 2018.

[7] Xiaofeng Fan, Yining Ma, Zhongxiang Dai, Wei Jing, Cheston Tan, and Bryan Kian Hsiang Low. Fault-Tolerant Federated Reinforcement Learning with Theoretical Guarantee. In Proc. NeurIPS, 2021.

[8] Srivatsan Krishnan, Maximilian Lam, Sharad Chitlangia, Zishen Wan, Gabriel Barth-Maron, Aleksandra Faust, and Vijay Janapa Reddi. QuaRL: Quantization for sustainable reinforcement learning. arXiv:1910.01055, 2021.

[9] Srivatsan Krishnan, Maximilian Lam, Sharad Chitlangia, Zishen Wan, Gabriel Barth-Maron, Aleksandra Faust, and Vijay Janapa Reddi. Settling the communication complexity for distributed offline reinforcement learning. arXiv:1910.01055, 2022.

[10] Sajad Khodadadian, Pranay Sharma, Gauri Joshi, and Siva Theja Maguluri. Federated reinforcement learning: Linear speedup under Markovian sampling. In ICML, 2022.

Maryam Tavakol
Secondary supervisor
Ali Ramezani-Kebrya
External location
University of Oslo
Get in contact