Counterfactual Fairness through Transforming Data Orthogonal to Bias
Abstract: Machine learning models have shown exceptional prowess in solving complex issues across various domains. However, these models can sometimes exhibit biased decision-making, resulting in unequal treatment of different groups. Despite substantial research on counterfactual fairness, methods to reduce the impact of multivariate and continuous sensitive variables on decision-making outcomes are still underdeveloped. We propose a novel data pre-processing algorithm, Orthogonal to Bias (OB), which is designed to eliminate the influence of a group of continuous sensitive variables, thus promoting counterfactual fairness in machine learning applications. Our approach, based on the assumption of a jointly normal distribution within a structural causal model (SCM), demonstrates that counterfactual fairness can be achieved by ensuring the data is orthogonal to the observed sensitive variables. The OB algorithm is model-agnostic, making it applicable to a wide range of machine learning models and tasks. Additionally, it includes a sparse variant to improve numerical stability through regularization. Empirical evaluations on both simulated and real-world datasets, encompassing settings with both discrete and continuous sensitive variables, show that our methodology effectively promotes fairer outcomes without compromising accuracy.
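The core idea described above — transforming the data so it is orthogonal to the observed sensitive variables — can be sketched with a simple linear residualization. This is an illustrative approximation only, not the authors' implementation: the function name, interface, and use of ordinary least-squares projection are assumptions, and the fairness claim rests on the paper's joint-normality SCM assumption.

```python
import numpy as np

def orthogonal_to_bias(X, S):
    """Project feature matrix X onto the orthogonal complement of the
    column space of the sensitive variables S (plus an intercept).

    Illustrative sketch of the OB idea: the transformed features have
    zero sample correlation with the observed sensitive variables.
    """
    n = X.shape[0]
    # Augment S with an intercept so feature means are removed as well.
    S_aug = np.column_stack([np.ones(n), S])
    # Regress each feature column on the sensitive variables ...
    beta, *_ = np.linalg.lstsq(S_aug, X, rcond=None)
    # ... and keep the residuals, which are orthogonal to S.
    return X - S_aug @ beta

rng = np.random.default_rng(0)
S = rng.normal(size=(500, 2))                                 # two continuous sensitive variables
X = S @ rng.normal(size=(2, 4)) + rng.normal(size=(500, 4))   # features confounded by S
X_ob = orthogonal_to_bias(X, S)
# Sample correlations between transformed features and S are numerically zero.
corr = np.corrcoef(np.column_stack([X_ob, S]), rowvar=False)[:4, 4:]
print(np.abs(corr).max())
```

Because the projection is model-agnostic, any downstream estimator can be trained on `X_ob` in place of `X`; the sparse variant mentioned in the abstract would correspond to regularizing the regression step, e.g. with a lasso penalty.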