Data and AI cluster

Project: Race Car Tuning as Configurable Reinforcement Learning

Description

Deep reinforcement learning has been used successfully to drive a car in the Gran Turismo video game, outperforming expert human drivers (Wurman et al., 2022). However, an open question remains: how should the car used in the races be tuned?

This problem can be modeled as a configurable MDP (Metelli et al., 2018), where we can change some of the problem's parameters before an episode (race) starts. In this project, we study how to find the parameters that allow the RL agents to reach the best performance.
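
To make the idea concrete, here is a minimal sketch of a configurable problem, assuming a toy straight-line race rather than Gran Turismo. The names `CarConfig`, `gear_ratio`, `lap_time`, and `full_throttle` are hypothetical, and the physics are invented for illustration. The policy is held fixed here, whereas the project is concerned with configurations that suit a learning agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CarConfig:
    """Hypothetical setup parameter, fixed before the episode (race) starts."""
    gear_ratio: float  # higher = stronger acceleration, lower top speed


def lap_time(config: CarConfig, policy, track_length: float = 500.0,
             dt: float = 0.1, max_steps: int = 10_000) -> float:
    """Roll out one episode of a toy straight-line race and return its duration."""
    position, speed, t = 0.0, 0.0, 0.0
    top_speed = 80.0 / config.gear_ratio      # toy model, not real car physics
    acceleration = 5.0 * config.gear_ratio
    for _ in range(max_steps):
        if position >= track_length:
            break
        throttle = policy(position, speed)    # action in [0, 1]
        speed = min(top_speed, speed + throttle * acceleration * dt)
        position += speed * dt
        t += dt
    return t


def full_throttle(position: float, speed: float) -> float:
    """Stand-in for a trained RL agent: always accelerate."""
    return 1.0


def best_config(candidates, policy):
    """Pick the configuration with the lowest lap time under a fixed policy."""
    return min(candidates, key=lambda c: lap_time(c, policy))


candidates = [CarConfig(g) for g in (0.5, 1.0, 1.5, 2.0, 3.0)]
best = best_config(candidates, full_throttle)
print(best, round(lap_time(best, full_throttle), 1))
```

The exhaustive search over `candidates` only works because the toy configuration space is tiny and each rollout is cheap; with a learning agent, every evaluation of a configuration would require training or at least many episodes.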

Recently, curriculum learning has been used to generate sequences of tasks that speed up the training of RL agents (Narvekar et al., 2020). Unsupervised environment design (Li et al., 2025) takes a similar approach: a teacher proposes configurations so that the underlying agent learns to operate in arbitrary configurations.
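
The teacher and agent interaction can be sketched as a simple loop. The example below is an illustrative assumption, not the method of Li et al. (2025) or of Narvekar et al. (2020): the teacher is an epsilon-greedy rule that favours configurations on which the agent recently improved, and the agent is a toy stand-in whose skill grows with practice. All class names and configuration names are hypothetical.

```python
import random


class ToyStudent:
    """Stand-in for an RL agent: one skill level in [0, 1] per configuration."""

    def __init__(self, configs):
        self.skill = {c: 0.0 for c in configs}

    def train_on(self, config, rate=0.3):
        """Practise one configuration; return the improvement (learning progress)."""
        before = self.skill[config]
        self.skill[config] = before + rate * (1.0 - before)  # diminishing returns
        return self.skill[config] - before


class ProgressTeacher:
    """Epsilon-greedy teacher that favours configurations with recent progress."""

    def __init__(self, configs, epsilon=0.2, seed=0):
        self.configs = list(configs)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Optimistic initial estimates so untried configurations look attractive.
        self.progress = {c: 1.0 for c in self.configs}

    def propose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.configs)  # explore
        return max(self.configs, key=lambda c: self.progress[c])  # exploit

    def update(self, config, progress):
        self.progress[config] = progress


configs = ["soft_suspension", "stiff_suspension", "high_downforce"]
student = ToyStudent(configs)
teacher = ProgressTeacher(configs)
curriculum = []
for _ in range(30):
    config = teacher.propose()
    teacher.update(config, student.train_on(config))
    curriculum.append(config)
print(curriculum[:6], student.skill)
```

The list `curriculum` is the sequence of configurations the teacher generated. Note that this teacher only tracks the agent's progress; it has no notion of which configuration is ultimately best for racing.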

The challenge of this project is to find a way to improve this sequence of configurations when no target parameter is defined in advance.

References

Li, D., Li, W., and Varakantham, P. (2025). Marginal benefit driven RL teacher for unsupervised environment design. AAAI, 18253–18261.
Metelli, A. M., Mutti, M., and Restelli, M. (2018). Configurable Markov decision processes. ICML, 3488–3497.
Narvekar, S., Peng, B., Leonetti, M., Sinapov, J., Taylor, M. E., and Stone, P. (2020). Curriculum learning for reinforcement learning domains: A framework and survey. JMLR, 21, 181:1–181:50.
Wurman, P. R., Barrett, S., Kawamoto, K., MacGlashan, J., Subramanian, K., Walsh, T. J., Capobianco, R., Devlic, A., Eckert, F., Fuchs, F., Gilpin, L., Khandelwal, P., Kompella, V. R., Lin, H., MacAlpine, P., Oller, D., Seno, T., Sherstan, C., Thomure, M. D., … Kitano, H. (2022). Outracing champion Gran Turismo drivers with deep reinforcement learning. Nature, 602(7896), 223–228.

Photo by Rakesh Sitnoor on Unsplash.

Details

Supervisor: Thiago Simão
Interested?: Get in contact