Data and AI cluster

Project: Input Adaptive Inference for Semantic Segmentation

Description

Neural networks typically consist of a sequence of well-defined computational blocks that are executed one after the other to obtain a prediction for an input image. After the neural network has been trained, a static inference graph comprising these computational blocks is executed for all test images. In contrast, Dynamic Neural Networks are designed to adapt their inference graph based on the input image. Essentially, dynamic networks leverage the fact that some images are easier than others and can be processed with less computation. Therefore, on a set of test images, dynamic networks reduce the average computation and can provide other advantages, such as improved representation power [1].
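The average-computation argument can be illustrated with a minimal sketch of image-level early exiting. Everything here is an assumption made for illustration: the function names, the confidence values, the cumulative exit costs, and the 0.9 threshold are hypothetical and are not taken from any of the cited works.

```python
def early_exit_inference(exit_confidences, exit_costs, threshold=0.9):
    """Return (exit_index, cost) for a single input.

    exit_confidences[i] is a hypothetical confidence score at exit i.
    exit_costs[i] is the cumulative cost of running the network up to exit i.
    """
    for i, confidence in enumerate(exit_confidences):
        if confidence >= threshold:
            # Confident enough: stop here and skip the remaining blocks.
            return i, exit_costs[i]
    # No exit was confident: the full network has been executed.
    last = len(exit_costs) - 1
    return last, exit_costs[last]


def average_cost(batch_confidences, exit_costs, threshold=0.9):
    """Mean inference cost over a set of inputs."""
    costs = [
        early_exit_inference(confidences, exit_costs, threshold)[1]
        for confidences in batch_confidences
    ]
    return sum(costs) / len(costs)


# Toy data (assumed): three exits with cumulative costs 1, 2 and 4 units.
exit_costs = [1.0, 2.0, 4.0]
batch = [
    [0.95, 0.99, 0.99],  # easy image: leaves at the first exit
    [0.50, 0.92, 0.97],  # medium image: leaves at the second exit
    [0.30, 0.60, 0.80],  # hard image: runs the full network
]

dynamic_avg = average_cost(batch, exit_costs)
static_cost = exit_costs[-1]  # a static graph always pays the full cost
```

In this toy setting, only the hard image pays the full cost, so the average cost of the dynamic network falls below that of the static graph.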

Most existing works on dynamic neural networks address the task of image classification, where difficulty is considered to vary at the image level and the width or depth of the network changes dynamically based on the input image. For semantic segmentation, difficulty can also be considered to vary between image regions. Existing works address dynamic inference for semantic segmentation in different ways; a subset of these methods is listed below, along with their drawbacks:

Multi-exit architecture: [4] proposes a multi-exit network called Layer Cascade, in which the network stops further processing of pixels that are predicted with high confidence at an exit. Such methods could force early layers to learn more abstract information [3], which could disrupt the learning of hierarchical representations.
Dynamic routing: [5] extends dynamic routing to semantic segmentation, where the network is trained to select a particular path in a routing space of L layers between a few fixed encoder layers and the decoder. However, routing does not exploit the varying difficulty across image regions.
Patch-based: [2] uses the entropy of segmentation predictions from a fast network to assess difficulty. The image is split into fixed regions, and the regions with higher entropy are sent through a large, accurate network. SegBlocks [6] also divides the image into fixed patches; a policy network predicts whether or not to process each patch in high resolution. These methods are sensitive to the procedure by which the image is split into patches.
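As a rough illustration of the patch-based idea, the sketch below scores fixed square patches by the mean per-pixel entropy of a fast network's softmax output and flags high-entropy patches for a larger network. It is not the procedure of [2] or [6]: the function names, the mean aggregation, the patch size, and the threshold are all assumptions chosen for the example.

```python
import math


def pixel_entropy(probs):
    """Shannon entropy (in nats) of one pixel's class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def patch_mean_entropy(prob_map, top, left, size):
    """Mean pixel entropy over a size x size patch.

    prob_map[y][x] is a list of class probabilities for pixel (y, x).
    """
    total = 0.0
    for y in range(top, top + size):
        for x in range(left, left + size):
            total += pixel_entropy(prob_map[y][x])
    return total / (size * size)


def select_hard_patches(prob_map, patch_size, threshold):
    """Return (top, left) corners of patches whose mean entropy exceeds threshold."""
    height, width = len(prob_map), len(prob_map[0])
    hard = []
    # Fixed, non-overlapping grid; partial patches at the border are ignored.
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            if patch_mean_entropy(prob_map, top, left, patch_size) > threshold:
                hard.append((top, left))
    return hard


# Toy 4x4 prediction map with 2 classes (assumed data):
# the left half is predicted confidently, the right half is ambiguous.
confident = [0.99, 0.01]
uncertain = [0.5, 0.5]
prob_map = [[confident, confident, uncertain, uncertain] for _ in range(4)]

# Patches flagged here would be re-processed by the large network.
hard_patches = select_hard_patches(prob_map, patch_size=2, threshold=0.3)
```

The fixed grid in `select_hard_patches` also makes the stated drawback visible: shifting the grid or changing `patch_size` changes which pixels are averaged together, and therefore which regions are flagged as difficult.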

The thesis could first focus on extensively reviewing these methods and others [7] for semantic segmentation. Optionally, the review could be extended to other dense prediction tasks. The thesis could then propose and evaluate a new method that provides a new perspective or builds on one or more of the existing methods.

References
[1] Yizeng Han, Gao Huang, Shiji Song, Le Yang, Honghui Wang, and Yulin Wang. Dynamic neural networks: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.

[2] Yu-Hui Huang, Marc Proesmans, Stamatios Georgoulis, and Luc Van Gool. Uncertainty based model selection for fast semantic segmentation. In 2019 16th International Conference on Machine Vision Applications (MVA), 2019.

[3] Alexandros Kouris, Stylianos I. Venieris, Stefanos Laskaridis, and Nicholas D. Lane. Multi-exit semantic segmentation networks, 2021.

[4] Xiaoxiao Li, Ziwei Liu, Ping Luo, Chen Change Loy, and Xiaoou Tang. Not all pixels are equal: Difficulty-aware semantic segmentation via deep layer cascade. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

[5] Yanwei Li, Lin Song, Yukang Chen, Zeming Li, Xiangyu Zhang, Xingang Wang, and Jian Sun. Learning dynamic routing for semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.

[6] Thomas Verelst and Tinne Tuytelaars. SegBlocks: Towards block-based adaptive resolution networks for fast segmentation. In Adrien Bartoli and Andrea Fusiello, editors, Computer Vision – ECCV 2020 Workshops.

[7] Bowen Zhang, Zhi Tian, Chunhua Shen, et al. Dynamic neural representational decoders for high-resolution semantic segmentation. Advances in Neural Information Processing Systems, 2021.

Details

Supervisor: Bahram Zonooz
Secondary supervisor: Elahe Arani
Interested?: Get in contact