Data and AI cluster

Project: A feasibility study on automated database exercise generation with large language models

Description

Your lecturers here at the university spend a lot of time creating new exercises for our students, both for weekly assignments and for exams. Extrapolated to universities and professional training worldwide, this adds up to a tremendous investment of effort and time. A tool that generates such assignments would benefit lecturers and trainers alike.

In a recent study, Sarsa et al. generated programming exercises and code explanations using large language models. They found that the exercises were typically correct, and that incorrect ones were easy to detect with some basic test cases.

In this project you will analyze whether large language models can be used to create database schemas and SQL query formulation problems. Tasks in this project include exercise analysis, prompt design and writing of tests.
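To give an idea of what "writing of tests" could look like, here is a minimal sketch in Python. It assumes that a generated exercise consists of a schema, some seed data, a reference query and the expected result, and it checks these against an in-memory SQLite database. The function name `check_sql_exercise`, the exercise layout and the example tables are hypothetical and chosen only for illustration; they are not part of the project or of the study by Sarsa et al.

```python
import sqlite3


def check_sql_exercise(schema_sql, seed_sql, reference_query, expected_rows):
    """Return a list of problems found; an empty list means all checks passed.

    Hypothetical helper: the four-part exercise layout is an assumption.
    """
    conn = sqlite3.connect(":memory:")  # throwaway database for each check
    try:
        # Check 1: the generated schema and seed data must load without errors.
        try:
            conn.executescript(schema_sql)
            conn.executescript(seed_sql)
        except sqlite3.Error as exc:
            return [f"schema or seed data failed to load: {exc}"]

        # Check 2: the reference query must run against that schema.
        try:
            rows = conn.execute(reference_query).fetchall()
        except sqlite3.Error as exc:
            return [f"reference query failed to run: {exc}"]

        # Check 3: the result must match the expected rows (order ignored).
        if sorted(rows) != sorted(expected_rows):
            return [f"unexpected result: {rows}"]
        return []
    finally:
        conn.close()


# Illustrative exercise: "List the names of all students enrolled in Databases."
SCHEMA = """
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE enrollment (
    student_id INTEGER REFERENCES student(id),
    course TEXT NOT NULL
);
"""
SEED = """
INSERT INTO student VALUES (1, 'Ada'), (2, 'Linus'), (3, 'Grace');
INSERT INTO enrollment VALUES (1, 'Databases'), (2, 'Databases'), (3, 'Algorithms');
"""
QUERY = """
SELECT s.name
FROM student AS s
JOIN enrollment AS e ON e.student_id = s.id
WHERE e.course = 'Databases'
"""

problems = check_sql_exercise(SCHEMA, SEED, QUERY, [("Ada",), ("Linus",)])
print(problems or "exercise passed all checks")
```

Checks like these only catch exercises that are syntactically broken or internally inconsistent. Whether a generated exercise is well posed and pedagogically useful would still need the exercise analysis mentioned above.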

Because of the projected savings in effort and time, this work could lead to widespread adoption in education, at both the academic and the industrial level.

Further reading:

Sarsa, S., Denny, P., Hellas, A., & Leinonen, J. (2022, August). Automatic Generation of Programming Exercises and Code Explanations Using Large Language Models. In Proceedings of the 2022 ACM Conference on International Computing Education Research - Volume 1 (pp. 27-43).
CodeX
PolyCoder

Details

Student: Willem Aerts
Supervisor: George Fletcher
Secondary supervisor: Daphne Miedema