Increasing a neural network's size can significantly improve performance, but it also makes inference slow and resource-intensive. Early-exit neural networks (EENNs) address this by attaching extra classifiers to intermediate layers, so that easy inputs can exit early and save computation. The key question is: when is it "safe" for an EENN to go "fast" without hurting overall performance?
Most EENN methods use a simple softmax confidence threshold at each exit layer: if the top class probability exceeds a set value, they stop and output that prediction. In practice, however, softmax scores are typically poorly calibrated: a 70% confidence does not actually mean a 70% chance of being correct [1]. Moreover, softmax cannot tell whether the model is uncertain because the data are noisy (aleatoric uncertainty) or because the model itself is unsure (epistemic uncertainty). As a result, EENNs risk exiting too early on ambiguous or out-of-distribution inputs, hurting accuracy, or exiting too late on easy inputs, losing the benefit of faster inference [2].
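As a concrete reference point, the standard softmax-threshold rule can be sketched in a few lines. This is a plain NumPy illustration, not any particular EENN implementation; the function names and the threshold value 0.9 are arbitrary choices for the example:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def early_exit_predict(exit_logits, threshold=0.9):
    """Softmax-confidence early exiting: stop at the first exit whose
    top-class probability exceeds `threshold`; otherwise fall back to
    the final classifier. Returns (predicted class, exit index)."""
    for i, logits in enumerate(exit_logits):
        probs = softmax(np.asarray(logits, dtype=float))
        if probs.max() > threshold:
            return int(probs.argmax()), i
    # No intermediate exit was confident enough: use the last exit.
    probs = softmax(np.asarray(exit_logits[-1], dtype=float))
    return int(probs.argmax()), len(exit_logits) - 1
```

Note that the decision hinges entirely on the maximum softmax probability, which is exactly the quantity that is typically miscalibrated [1].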
In recent years, lightweight uncertainty quantification (UQ) methods, such as Evidential Deep Learning [3], Deep Deterministic Uncertainty [4], Spectral-normalized Gaussian Processes (SNGP) [5], and Laplace approximations [6], have made confidence estimates substantially more reliable. These techniques provide better-calibrated uncertainty while adding little computational overhead over a conventional neural network.
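To make one of these concrete: in Evidential Deep Learning [3], the network outputs non-negative evidence values rather than logits, and these parameterize a Dirichlet distribution whose spread gives an explicit uncertainty mass. A minimal sketch of the standard mapping (alpha = evidence + 1, uncertainty u = K / S), with function names chosen here for illustration:

```python
import numpy as np

def edl_uncertainty(evidence):
    """Dirichlet-based uncertainty in the style of Evidential Deep
    Learning [3]. Non-negative evidence e_k yields Dirichlet
    parameters alpha_k = e_k + 1; with S = sum(alpha), the expected
    class probabilities are alpha / S and the uncertainty mass is
    u = K / S, which equals 1 when the model has seen no evidence."""
    evidence = np.asarray(evidence, dtype=float)
    K = evidence.shape[-1]          # number of classes
    alpha = evidence + 1.0
    S = alpha.sum(axis=-1)
    expected_probs = alpha / S      # mean of the Dirichlet
    u = K / S                       # total uncertainty mass
    return expected_probs, u
```

Unlike a softmax score, u shrinks only as total evidence grows, so an input that produces little evidence at an intermediate exit is flagged as uncertain even if one class happens to dominate the others.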
Although this project is open-ended, one possible direction is to explore how to integrate UQ methods into an EENN's intermediate exits. Instead of relying on softmax, each exit would use a UQ-based score to decide when to stop, making exit decisions more robust and striking a better balance between speed and accuracy. A second, more theoretical direction is to analyze the models' estimated uncertainty across the various layers and study it from the perspective of the bias-variance tradeoff.
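For the first direction, the change to the exit logic itself is small: the max-probability test is replaced by a threshold on whichever UQ score is chosen. The sketch below uses predictive entropy purely as a placeholder for a score from [3]-[6]; the function names and the entropy threshold are hypothetical:

```python
import numpy as np

def predictive_entropy(probs):
    # Shannon entropy of a predictive distribution (higher = less certain).
    probs = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0)
    return float(-(probs * np.log(probs)).sum())

def uq_early_exit(exit_probs, max_entropy=0.3):
    """UQ-based early exiting: stop at the first exit whose uncertainty
    score falls below `max_entropy`; otherwise use the final exit.
    Any UQ score (e.g. EDL uncertainty mass) could replace the entropy."""
    for i, probs in enumerate(exit_probs):
        if predictive_entropy(probs) < max_entropy:
            return int(np.argmax(probs)), i
    return int(np.argmax(exit_probs[-1])), len(exit_probs) - 1
```

The open question for the project is which score and threshold yield the best speed/accuracy tradeoff, and whether separating aleatoric from epistemic uncertainty lets ambiguous and out-of-distribution inputs be handled differently.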
References
[1] [arxiv.org/pdf/1706.04599](https://arxiv.org/pdf/1706.04599)
[2] [arxiv.org/pdf/1709.01686](https://arxiv.org/pdf/1709.01686)
[3] [arxiv.org/pdf/1806.01768](https://arxiv.org/pdf/1806.01768)
[4] [arxiv.org/pdf/2003.02037](https://arxiv.org/pdf/2003.02037)
[5] [arxiv.org/pdf/2006.10108](https://arxiv.org/pdf/2006.10108)
[6] [arxiv.org/pdf/2010.02720](https://arxiv.org/pdf/2010.02720)
Sibylle Hess
Fabian Denoodt