
Project: Identifying robust instances in classification


In a classification task, some instances are classified more robustly than others: even after a large modification of the training set, these instances (in the test set) are still assigned to the same class. Other instances are non-robust in the sense that a small change in the training set changes the assigned class. (The training set can be modified in various ways: randomly or adversarially, by additions, deletions, ε-perturbation of numerical features, etc.)
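One way to make this notion concrete is an empirical check: retrain on many randomly thinned copies of the training set and see whether the predicted class of a test instance ever changes. The sketch below is only an illustration under several assumptions: it uses a nearest-centroid classifier as a stand-in model, considers only random deletions, and the names `nearest_centroid_predict` and `is_robust` are hypothetical. Passing the check shows that no flip was found among the sampled deletions, not that none exists.

```python
import random


def nearest_centroid_predict(train, x):
    """Classify x by the closest class mean.

    train is a list of (features, label) pairs; features are tuples of floats.
    """
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1

    def sq_dist(label):
        # Squared Euclidean distance from x to the centroid of `label`.
        return sum((s / counts[label] - xi) ** 2
                   for s, xi in zip(sums[label], x))

    # Sorting the labels makes tie-breaking deterministic.
    return min(sorted(sums), key=sq_dist)


def is_robust(train, x, n_deletions, n_trials=200, seed=0):
    """Return False as soon as a random deletion of n_deletions training
    instances changes the class predicted for x; True if none of the
    n_trials sampled deletions does."""
    rng = random.Random(seed)
    baseline = nearest_centroid_predict(train, x)
    for _ in range(n_trials):
        kept = rng.sample(train, len(train) - n_deletions)
        if nearest_centroid_predict(kept, x) != baseline:
            return False
    return True


# Toy one-dimensional data set (made up for illustration).
train = [((0.0,), "a"), ((1.0,), "a"), ((2.0,), "a"),
         ((8.0,), "b"), ((9.0,), "b"), ((10.0,), "b")]
print(is_robust(train, (0.5,), n_deletions=2))  # instance deep inside class "a"
print(is_robust(train, (5.1,), n_deletions=2))  # instance near the boundary
```

An adversarial variant would replace the random sampling by a search for the deletion most likely to flip the prediction, which is generally much more expensive.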

Credal classifiers, such as the naive credal classifier, assign a set of classes instead of just a single class (unless there is sufficient information to support a single class). There are many interesting questions at the interface between credal classifiers and robust instance detection. Two examples:

  • Can credal classifiers be used to detect robust instances? That is, are the instances to which the credal classifier assigns a single class robust?
  • When learning a credal classifier, there are often one or more parameters that control its determinacy, i.e., its tendency to return single classes. Can the parameters of a credal classifier be chosen in such a way that it correctly detects robust instances for a given magnitude and nature of possible modifications in the training set (number of additions/deletions, value of ε, random vs. adversarial)?
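To illustrate how such a parameter can control determinacy, here is a minimal sketch that is much simpler than the naive credal classifier: it ignores features and only looks at class counts. It assumes the imprecise Dirichlet model, under which a class with count n out of N gets the probability interval [n/(N+s), (n+s)/(N+s)], and it keeps every class that is not interval-dominated. The function name `credal_classes` and the counts are hypothetical.

```python
def credal_classes(class_counts, s=2.0):
    """Return the set of classes that are not interval-dominated.

    class_counts maps each class to its observed count; s > 0 is the
    imprecise Dirichlet model hyperparameter (larger s, wider intervals).
    """
    total = sum(class_counts.values())
    lower = {c: n / (total + s) for c, n in class_counts.items()}
    upper = {c: (n + s) / (total + s) for c, n in class_counts.items()}
    # A class is dominated if some other class has a lower probability
    # strictly above its upper probability.
    best_lower = max(lower.values())
    return {c for c in class_counts if upper[c] >= best_lower}


counts = {"a": 10, "b": 3}  # made-up counts for illustration
print(credal_classes(counts, s=2.0))  # small s: a single class is returned
print(credal_classes(counts, s=8.0))  # large s: both classes are returned
```

In this sketch, increasing s plays a role loosely comparable to allowing more added instances in the training set: the more hypothetical observations are allowed for, the fewer instances receive a single class.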

This project gives you the opportunity to become familiar with two somewhat separate research lines on robustness in classification, one based on robust uncertainty models and one not, and to bring these lines together.

Requirements & Activities

This project will require you to learn about a field you are not yet familiar with (imprecise probability theory, for credal classifiers), but also to go beyond that and find other approaches to dealing with robustness in classification. It also requires you to be able to design and implement robustness tests, possibly including the implementation of some models and algorithms; this means you must have some programming experience with this kind of activity. For these tests, you will also need to collect an appropriately diverse set of test cases (data sets).

Erik Quaeghebeur