Data generation is an important task, but the missing-data mechanism is typically not fully modeled or exploited in the process. This project will study this problem and create tools for generating data with missing values. Besides generating data from random distributions, we will consider how to learn a model from data and then generate new data from it. Such generated data can be used to hide the original data and to improve models through post-hoc fine-tuning. We will focus on probabilistic generative machine learning models.
Cassio de Campos