Project: Creating Fairness-Aware Datasets for Sequential Decision-Making
Description
As AI systems become more integrated into decision-making across domains such as finance, healthcare, and criminal justice, ensuring fairness has become a key concern. Fairness-aware machine learning (ML) aims to mitigate biases that could lead to discriminatory outcomes, but traditional research often focuses on static settings, where each decision is independent of others. However, many real-world applications, such as patient treatment planning, require sequential decision-making, which can be better modeled using Markov Decision Processes (MDPs) and Reinforcement Learning (RL).
The exploration of fairness-aware learning in sequential settings remains relatively new and challenging. A significant step toward advancing this area is to adapt existing fairness-related datasets for MDPs, making it easier to study fairness in RL-based applications. This project aims to bridge the gap between fairness-aware learning and sequential decision-making by converting such datasets for use in MDP environments like those proposed in the references below.
Task description:
- Conduct a comprehensive literature review of fairness-aware ML, focusing on fairness metrics, strategies for bias mitigation, and how these have been applied in static and sequential settings.
- Learn about the key concepts of MDPs and RL.
- Identify and collect publicly available datasets that are commonly used in fairness research (e.g., COMPAS, Adult Income), and evaluate these datasets in terms of their suitability for sequential decision-making contexts and the types of fairness metrics they support (e.g., demographic parity, equalized odds).
- Properly define all the MDP components to ensure meaningful and interpretable simulations.
- Develop a software program that transforms the collected datasets into a sequential format compatible with MDPs, and implement the transformation to allow each dataset to be used in MDP-based simulations.
- Integrate various fairness metrics into the program, allowing users to specify their chosen metrics when converting datasets, so that the resulting MDPs support various fairness evaluation goals.
- Evaluate the usability and accuracy of the transformed datasets within sample MDP/RL simulations to validate that they are correctly adapted for sequential fairness analysis.
- Provide comprehensive documentation on using the dataset transformation program.
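The group fairness metrics mentioned in the tasks above (demographic parity, equalized odds) can be computed directly from predictions. The sketch below assumes a binary protected attribute and binary decisions; the function names are hypothetical, chosen for illustration only.

```python
import numpy as np

def demographic_parity_diff(y_pred, group):
    """Absolute gap in positive-decision rates between the two groups."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rate0 = y_pred[group == 0].mean()
    rate1 = y_pred[group == 1].mean()
    return abs(rate0 - rate1)

def equalized_odds_diff(y_true, y_pred, group):
    """Largest gap (over TPR and FPR) in decision rates between groups."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    gaps = []
    for label in (0, 1):  # label 0 -> FPR gap, label 1 -> TPR gap
        mask = y_true == label
        r0 = y_pred[mask & (group == 0)].mean()
        r1 = y_pred[mask & (group == 1)].mean()
        gaps.append(abs(r0 - r1))
    return max(gaps)
```

In a sequential setting, the same quantities can be recomputed over the decisions an RL policy makes during a simulation, which is what letting users select a metric at conversion time would enable.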
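As a starting point for the transformation step, a tabular fairness dataset can be wrapped as a simple episodic MDP in the Gym style: each episode presents one individual, the state is the feature vector, and the action is the binary decision. The class below is a hypothetical sketch; in particular, the reward design (agree with the recorded outcome) is only one of many possible choices a real conversion tool would need to make explicit.

```python
import numpy as np

class TabularFairnessEnv:
    """Minimal one-step episodic MDP over a tabular dataset.

    State:  the individual's feature vector.
    Action: a binary decision (e.g. grant/deny).
    Reward: +1 if the decision matches the recorded label, -1 otherwise
            (an illustrative choice, not a prescribed one).
    """

    def __init__(self, features, labels, seed=0):
        self.features = np.asarray(features, dtype=float)
        self.labels = np.asarray(labels, dtype=int)
        self.rng = np.random.default_rng(seed)
        self._i = None  # index of the individual in the current episode

    def reset(self):
        # Sample one individual uniformly at random.
        self._i = self.rng.integers(len(self.labels))
        return self.features[self._i]

    def step(self, action):
        reward = 1.0 if action == self.labels[self._i] else -1.0
        # One decision per episode, so every step is terminal.
        return None, reward, True, {}
```

Richer conversions could chain episodes so that earlier decisions shift the feature distribution of later individuals, which is where the long-term fairness effects studied in the references become visible.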
References:
- Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan. A survey on bias and fairness in machine learning. arXiv preprint arXiv:1908.09635, 2019.
- Solon Barocas and Andrew D Selbst. Big data’s disparate impact. California Law Review, 104(3):671, 2016.
- Pratik Gajane and Mykola Pechenizkiy. On formalizing fairness in prediction with machine learning. arXiv preprint arXiv:1710.03184, 2017.
- Moritz Hardt, Eric Price, Nati Srebro, et al. Equality of opportunity in supervised learning. In Advances in neural information processing systems, pages 3315–3323, 2016.
- Matt J Kusner, Joshua Loftus, Chris Russell, and Ricardo Silva. Counterfactual fairness. In Advances in Neural Information Processing Systems, pages 4066–4076, 2017.
- Sahil Verma and Julia Rubin. Fairness definitions explained. In 2018 IEEE/ACM International Workshop on Software Fairness (FairWare), pages 1–7. IEEE, 2018.
- Toshihiro Kamishima, Shotaro Akaho, Hideki Asoh, and Jun Sakuma. Fairness-aware classifier with prejudice remover regularizer. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 35–50. Springer, 2012.
- Xingyu Chen, Brandon Fain, Liang Lyu, and Kamesh Munagala. Proportionally fair clustering. In International Conference on Machine Learning, pages 1032–1041, 2019.
- Shikha Bordia and Samuel Bowman. Identifying and reducing gender bias in word-level language models. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 7–15, 2019.
- Hanchen Wang, Nina Grgic-Hlaca, Preethi Lahoti, Krishna P Gummadi, and Adrian Weller. An empirical study on learning fairness metrics for compas data with human supervision. arXiv preprint arXiv:1910.10255, 2019.
- Mikhail Yurochkin, Amanda Bower, and Yuekai Sun. Training individually fair ml models with sensitive subspace robustness. In International Conference on Learning Representations, 2019.
- Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, and Krishna P Gummadi. Fairness constraints: Mechanisms for fair classification. In Artificial Intelligence and Statistics, pages 962–970, 2017.
- Naman Goel, Mohammad Yaghini, and Boi Faltings. Non-discriminatory machine learning through convex fairness criteria. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
- Alekh Agarwal, Alina Beygelzimer, Miroslav Dudik, John Langford, and Hanna Wallach. A reductions approach to fair classification. In International Conference on Machine Learning, pages 60–69, 2018.
- Maryam Tavakol. Fair classification with counterfactual learning. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 2073–2076, 2020.
- Shahin Jabbari, Matthew Joseph, Michael Kearns, Jamie Morgenstern, and Aaron Roth. Fairness in reinforcement learning. In International Conference on Machine Learning, pages 1617–1626. PMLR, 2017.
- Sampath Kannan, Aaron Roth, and Juba Ziani. Downstream effects of affirmative action. In Proceedings of the Conference on Fairness, Accountability, and Transparency, pages 240–248, 2019.
- Lydia T Liu, Sarah Dean, Esther Rolf, Max Simchowitz, and Moritz Hardt. Delayed impact of fair machine learning. In International Conference on Machine Learning, pages 3150–3158. PMLR, 2018.
- Alexander D’Amour, Hansa Srinivasan, James Atwood, Pallavi Baljekar, D Sculley, and Yoni Halpern. Fairness is not static: deeper understanding of long term fairness via simulation studies. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pages 525–534, 2020.
- Min Wen, Osbert Bastani, and Ufuk Topcu. Algorithms for fairness in sequential decision making. In International Conference on Artificial Intelligence and Statistics, pages 1144–1152. PMLR, 2021.
- Xueru Zhang, Ruibo Tu, Yang Liu, Mingyan Liu, Hedvig Kjellström, Kun Zhang, and Cheng Zhang. How do fair decisions fare in long-term qualification? In Thirty-fourth Conference on Neural Information Processing Systems, 2020.
- Google. ML-fairness-gym. google/ml-fairness-gym on GitHub, 2020.
- Tai Le Quy, Arjun Roy, Vasileios Iosifidis, Wenbin Zhang, and Eirini Ntoutsi. A survey on datasets for fairness-aware machine learning. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 12(3):e1452, 2022.
Details
- Supervisor: Maryam Tavakol