There are an infinite number of ways to design a machine learning system, and many careful decisions need to be made based on prior experience. The field of automated machine learning (AutoML) aims to make these decisions in a data-driven, objective, and automated way.
There exists a range of AutoML tools (e.g. AutoGluon-Tabular, Auto-sklearn, H2O AutoML, GAMA, TPOT, ...). Many of these systems, however, expect rather clean data and can easily break when the data has certain imperfections. For instance, they may apply a one-hot encoder to a categorical feature with thousands of categories, exploding the feature space and causing the system to crash or hang. Many also do not handle string features well and therefore obtain very suboptimal results.
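As a concrete illustration of the high-cardinality problem, here is a minimal sketch (not taken from any of the tools above) of the kind of guard an AutoML bot could add: one-hot encode only when the number of categories is small, and fall back to a cheap frequency encoding otherwise. The function name and threshold are illustrative assumptions.

```python
def encode_categorical(values, max_onehot_categories=20):
    """Return (kind, encoding) for a list of categorical values.

    kind is 'onehot' when the cardinality is low enough, otherwise
    'frequency'. Threshold and fallback are illustrative choices.
    """
    categories = sorted(set(values))
    if len(categories) <= max_onehot_categories:
        # Safe to one-hot encode: one indicator column per category.
        index = {c: i for i, c in enumerate(categories)}
        rows = []
        for v in values:
            row = [0] * len(categories)
            row[index[v]] = 1
            rows.append(row)
        return "onehot", rows
    # Too many categories: map each value to its relative frequency
    # instead of exploding the feature space into thousands of columns.
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    n = len(values)
    return "frequency", [counts[v] / n for v in values]
```

A real bot would of course delegate to library encoders where possible; the point is only that the encoding choice should depend on the data, not be fixed in advance.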
Consider the following challenge: you are given an arbitrary (tabular) dataset, e.g. from OpenML or Kaggle, together with an associated task (e.g. classification), and your AutoML bot has to find reasonably good models without crashing. You can do anything you find reasonable to achieve this. Some suggestions:
You can evaluate your AutoML bot by pitting it against existing AutoML systems on a set of tricky datasets. You don't have to develop everything from scratch; you can build on GAMA, an extensible AutoML tool developed in our group.
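The evaluation described above can be sketched as a small benchmark harness: run each system on each dataset and record both the score and whether the run survived at all, since robustness is exactly what is being compared. Everything here (function names, the result format) is a hypothetical sketch, not the API of any real benchmark.

```python
def run_benchmark(systems, datasets):
    """Pit AutoML systems against each other on a set of datasets.

    systems:  dict mapping system name -> callable(dataset) that fits a
              model and returns a score, or raises on failure.
    datasets: dict mapping dataset name -> dataset object.
    Returns a dict keyed by (system, dataset) with status and score/error.
    """
    results = {}
    for sys_name, fit_and_score in systems.items():
        for ds_name, dataset in datasets.items():
            try:
                score = fit_and_score(dataset)
                results[(sys_name, ds_name)] = {"status": "ok", "score": score}
            except Exception as exc:
                # A crash on a tricky dataset is a data point in the
                # comparison, not a reason to abort the whole benchmark.
                results[(sys_name, ds_name)] = {"status": "error",
                                                "error": str(exc)}
    return results
```

In practice each callable would wrap a full AutoML run with a time budget (e.g. a GAMA fit on a train/test split), but the try/except pattern is the essential part: fragile systems fail per-dataset, and those failures show up in the final comparison.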