
Project: Overcoming data scarcity in visual object detection and recognition tasks with frugal learning



The Observe, Orient, Decide and Act (OODA) loop [1] shapes most modern military doctrines. Typically, after sensor and intelligence data are gathered in the Observe step, a common tactical operating picture of the monitored aerial, maritime and/or ground scenario is built and shared among units of cooperative command forces in the Orient step. As prescribed by the Joint Directors of Laboratories (JDL) data fusion model [2], this common picture typically encompasses object-refinement information (e.g. kinematic state estimates and identities of surveilled entities overlaid on cartographic maps), situation-refinement information (e.g. identification of threats) and threat-assessment information (e.g. the impact of threat activities). This comprehensive perception of the current status of the battlefield across space, projected towards its future status given potential courses of action, is known as situational awareness. It is a key element of most military applications, since it allows commanders to make informed decisions in the Decide step on which actions to take in the Act step, e.g. to preserve the peace or defeat an enemy, given the current status of the battlefield and the most likely course of action of opposing forces.

In this context, ML-based methods are increasingly enhancing the robustness and effectiveness of both low-level, object-refinement data fusion methods, e.g. to detect and recognize vessels in synthetic SAR images [3], and high-level, situation-refinement data fusion methods, e.g. to identify air combat threats [4] or airspace safety hazards [5]. Note, though, that successful ML models employed in military applications often rely on massive gathering of intelligence data for proper model training or fine-tuning. On the other hand, given the nature of war games, real-life data is frequently so scarce that data-intensive deep-learning techniques cannot be used or, as illustrated in the figure above, may fail to properly classify targets of classes with few training samples.

The goal of this project is therefore to investigate the potential of frugal learning techniques to cope with scarce learning resources, mainly in terms of the availability of training data, and to apply these techniques to meet the needs of a relevant use case (military or not) related to object detection and recognition using imagery data sources. As this is an explorative assignment, the student is free to investigate and combine frugal learning techniques such as i) knowledge graph representations [6], ii) conditional generative models using GANs [7, 8, 9], VAEs [10] or invertible flows [11], iii) generative models using style transfer [12], iv) multimodal embedding [13, 14, 15, 16, 17, 18], v) unsupervised domain adaptation using simulated data [19, 20, 21, 22] and/or vi) active learning [23, 24, 25]. Finally, the student is also free to help discuss and define the specific (military or not) use case with potential stakeholders.


You must be willing to understand a relevant (military or not) use case and translate it into a concrete frugal learning problem constrained by clearly defined application requirements. You must also be willing to think conceptually and theoretically about the frugal learning field.


[1] John Boyd. OODA loop. Center for Defense Information, Tech. Rep., 1995.

[2] Llinas, James, et al. Revisiting the JDL data fusion model II. Space And Naval Warfare Systems Command San Diego CA, 2004.

[3] C. Connors, T. Scarnati and G. Harris, "Joint Image Formation and Target Classification of SAR Images," 2021 IEEE Radar Conference (RadarConf21), 2021, pp. 1-6, doi: 10.1109/RadarConf2147009.2021.9455246.

[4] X. Ximeng, Y. Rennong and Y. Yang, "Threat Assessment in Air Combat Based on ELM Neural Network," 2019 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA), 2019, pp. 114-120, doi: 10.1109/ICAICA.2019.8873461.

[5] B. Taha and A. Shoufan, "Machine Learning-Based Drone Detection and Classification: State-of-the-Art in Research," in IEEE Access, vol. 7, pp. 138669-138682, 2019, doi: 10.1109/ACCESS.2019.2942944.

[6] Marino, K., et al., 2017. The More You Know: Using Knowledge Graphs for Image Classification. arXiv:1612.04844.

[7] Bucher, M., Herbin, S., Jurie, F., 2017. Generating Visual Representations for Zero-Shot Classification, in: TASK-CV ICCV Workshops.

[8] Xian, Y., et al., 2018. Feature generating networks for zero-shot learning, CVPR.

[9] Mishra, A., et al., 2018. A generative model for zero shot learning using conditional variational autoencoders, CVPRW.

[10] Mishra, A., et al., 2018. A generative model for zero shot learning using conditional variational autoencoders, CVPRW.

[11] Shen, Y., et al., 2020. Invertible zero-shot recognition flows, in: European Conference on Computer Vision. pp. 614–631.

[12] Karras, Tero, Samuli Laine, and Timo Aila. "A style-based generator architecture for generative adversarial networks." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019.

[13] Romera-Paredes, et al., 2015. An embarrassingly simple approach to zero-shot learning, ICML.

[14] Bucher, M., Herbin, S., Jurie, F., 2016a. Hard Negative Mining for Metric Learning Based Zero-Shot Classification, ECCV-W 2016.

[15] Bucher, M., et al., 2016b. Improving Semantic Embedding Consistency by Metric Learning for Zero-Shot Classification, ECCV.

[16] Akata, Z., Perronnin, F., Harchaoui, Z., Schmid, C., 2015. Label-embedding for image classification. PAMI 38, 1425–1438.

[17] Jiang, H., Wang, R., Shan, S., Chen, X., 2019. Transferable contrastive network for generalized zero-shot learning, ICCV.

[18] Xian, Y., Akata, Z., Sharma, G., Nguyen, Q., Hein, M., Schiele, B., 2016. Latent embeddings for zero-shot classification, CVPR.

[19] Csurka, G., 2017. Domain adaptation for visual applications: A comprehensive survey. arXiv preprint arXiv:1702.05374.

[20] Wang, M., Deng, W., 2018. Deep visual domain adaptation: A survey. Neurocomputing 312, 135–153.

[21] Wilson, G., Cook, D.J., 2020. A Survey of Unsupervised Deep Domain Adaptation. ACM Trans. Intell. Syst. Technol. 11.

[22] Zhou, K., Liu, Z., Qiao, Y., Xiang, T., Loy, C.C., 2021. Domain Generalization: A Survey. arXiv:2103.02503.

[23] Settles, B., 2012. Active learning. Synthesis Lectures on Artificial Intelligence and Machine Learning 6, 1–114.

[24] Fürnkranz, J., Hüllermeier, E., 2011. Preference Learning and Ranking by Pairwise Comparison.

[25] González, J., Dai, Z., Damianou, A., Lawrence, N.D., 2017. Preferential Bayesian optimization, ICML.

Mykola Pechenizkiy
Secondary supervisor
Stiven Schwanz Dias