
Project: Multimodal Time Series Analysis for Out-of-Distribution Generalization

Description

Introduction

Multimodal time series analysis is an emerging field that leverages multiple data sources to enhance predictive modeling. In retail, time series data often includes numerical sales records and image-based product information, providing a rich dataset for forecasting demand and understanding customer behavior. However, models trained on historical data often struggle with out-of-distribution (OOD) generalization, leading to poor performance when they encounter novel products (the cold-start problem), new trends, or external disruptions. Most prior approaches to time series analysis ignore multimodal data. This project aims to develop and evaluate machine learning methods that improve OOD generalization in multimodal time series forecasting.

Research Objectives

* Develop multimodal machine learning models that integrate numerical time series (e.g., sales data) with image data (e.g., product images).

* Address the OOD generalization problem by proposing novel domain adaptation or domain generalization techniques.

* Investigate how different representations of visual and numerical data contribute to model performance.

* Evaluate models using real-world retail datasets provided by an industry partner.
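To make the cold-start setting behind these objectives concrete, the following is a minimal sketch of a product-level split in which held-out products never appear in the training data. The record fields (`product_id`, `week`, `sales`) and the helper name `cold_start_split` are hypothetical and chosen only for illustration; the actual partner dataset may be structured differently.

```python
import random


def cold_start_split(records, holdout_fraction=0.2, seed=0):
    """Split records so that held-out products are absent from training.

    `records` is a list of dicts with a hypothetical "product_id" key.
    """
    product_ids = sorted({r["product_id"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(product_ids)

    # Hold out at least one product so the OOD test set is never empty.
    n_holdout = max(1, round(len(product_ids) * holdout_fraction))
    holdout = set(product_ids[:n_holdout])

    train = [r for r in records if r["product_id"] not in holdout]
    test = [r for r in records if r["product_id"] in holdout]
    return train, test


# Toy data: 5 products with 4 weekly sales values each (illustrative only).
records = [
    {"product_id": p, "week": w, "sales": 10 * p + w}
    for p in range(1, 6)
    for w in range(4)
]

train, test = cold_start_split(records, holdout_fraction=0.2, seed=0)
train_ids = {r["product_id"] for r in train}
test_ids = {r["product_id"] for r in test}
print("train products:", sorted(train_ids))
print("held-out products:", sorted(test_ids))
```

Splitting by product rather than by time is only one way to simulate a distribution shift; a temporal split (training on earlier weeks, testing on later ones) would instead probe robustness to new trends.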

Methodology

* Data Collection: Data will be provided by an industry partner. (You will have the opportunity to work with the company.)

* Model Development: Design and train multimodal deep learning models that fuse numerical and image-based representations. Possible architectures include RNNs, CNNs, and attention-based models such as transformers.

* OOD Generalization Strategies: Investigate domain adaptation and domain generalization techniques such as data augmentation, self-supervised learning, and meta-learning to improve robustness to unseen distributions.

* Evaluation: Standard forecasting metrics (MAE, RMSE) and additional domain generalization metrics will be used.
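The evaluation step can be sketched as follows: MAE and RMSE are computed separately on an in-distribution test set and on an OOD test set, and the difference between the two is reported. The gap measure here is an assumption, shown as one simple candidate for a "domain generalization metric"; the project may well settle on other metrics. All numbers are toy values, not results.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def generalization_gap(id_error, ood_error):
    """OOD error minus in-distribution error (assumed metric, larger is worse)."""
    return ood_error - id_error


# Toy forecasts for products seen during training (in-distribution).
id_true, id_pred = [10.0, 12.0, 14.0, 16.0], [11.0, 12.0, 13.0, 16.0]
# Toy forecasts for held-out products (OOD).
ood_true, ood_pred = [20.0, 22.0, 24.0, 26.0], [17.0, 22.0, 28.0, 26.0]

id_mae, ood_mae = mae(id_true, id_pred), mae(ood_true, ood_pred)
id_rmse, ood_rmse = rmse(id_true, id_pred), rmse(ood_true, ood_pred)

print(f"MAE  ID={id_mae:.2f}  OOD={ood_mae:.2f}  gap={generalization_gap(id_mae, ood_mae):.2f}")
print(f"RMSE ID={id_rmse:.2f}  OOD={ood_rmse:.2f}  gap={generalization_gap(id_rmse, ood_rmse):.2f}")
```

Reporting both metrics side by side is useful because RMSE penalizes large individual errors more strongly than MAE, which matters when a few novel products are forecast very poorly.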

The specific research objectives can be discussed and chosen according to the student's interests. The start date is flexible, but earlier is better. If the thesis is successful, there will be the possibility to work with the supervisor on a publication.


Literature:

[1] Daswani, Mayank, et al. "Plots Unlock Time-Series Understanding in Multimodal Models." arXiv preprint arXiv:2410.02637 (2024).

[2] Lee, Geon, et al. "MoAT: Multi-Modal Augmented Time Series Forecasting." (2023).

[3] Liu, Chengzhi, et al. "MTSA-SNN: A Multi-modal Time Series Analysis Model Based on Spiking Neural Network." International Conference on Pattern Recognition. Cham: Springer Nature Switzerland, 2024.

[4] Zhang, Xingxuan, et al. "On the out-of-distribution generalization of multimodal large language models." arXiv preprint arXiv:2402.06599 (2024).

[5] Wang, Jindong, et al. "Generalizing to unseen domains: A survey on domain generalization." IEEE Transactions on Knowledge and Data Engineering 35.8 (2022): 8052-8072.


If you are interested, please contact:

* Supervisor: Dr. Deng

* Email: s.deng@tue.nl 

* Office: MF 7.145

Details

* Student: Ranee

* Supervisor: Amy Deng