Preventing Fairness Gerrymandering: Auditing and Learning for Subgroup Fairness

Published 14 Nov 2017 in cs.LG, cs.DS, and cs.GT | (1711.05144v5)

Abstract: The most prevalent notions of fairness in machine learning are statistical definitions: they fix a small collection of pre-defined groups, and then ask for parity of some statistic of the classifier across these groups. Constraints of this form are susceptible to intentional or inadvertent "fairness gerrymandering", in which a classifier appears to be fair on each individual group, but badly violates the fairness constraint on one or more structured subgroups defined over the protected attributes. We propose instead to demand statistical notions of fairness across exponentially (or infinitely) many subgroups, defined by a structured class of functions over the protected attributes. This interpolates between statistical definitions of fairness and recently proposed individual notions of fairness, but raises several computational challenges. It is no longer clear how to audit a fixed classifier to see if it satisfies such a strong definition of fairness. We prove that the computational problem of auditing subgroup fairness for both equality of false positive rates and statistical parity is equivalent to the problem of weak agnostic learning, which means it is computationally hard in the worst case, even for simple structured subclasses. We then derive two algorithms that provably converge to the best fair classifier, given access to oracles which can solve the agnostic learning problem. The algorithms are based on a formulation of subgroup fairness as a two-player zero-sum game between a Learner and an Auditor. Our first algorithm provably converges in a polynomial number of steps. Our second algorithm enjoys only provably asymptotic convergence, but has the merit of simplicity and faster per-step computation. We implement the simpler algorithm using linear regression as a heuristic oracle, and show that we can effectively both audit and learn fair classifiers on real datasets.

Summary

The paper presents a two-player zero-sum game framework between a Learner and an Auditor to enforce subgroup fairness in classifiers.
It develops two oracle-based algorithms: one in which the Learner plays Follow the Perturbed Leader (FTPL), which provably converges in a polynomial number of steps, and a simpler Fictitious Play algorithm, which has only asymptotic convergence guarantees but faster per-step computation.
Empirical evaluations on the Communities and Crime dataset show that the resulting randomized classifiers trace out a trade-off between error and subgroup unfairness.

The paper "Preventing Fairness Gerrymandering: Auditing and Learning for Subgroup Fairness" (1711.05144) addresses a shortcoming of common statistical fairness definitions in machine learning, such as statistical parity and equal opportunity. It defines fairness over a rich class of structured subgroups and studies two problems under that definition: auditing a fixed classifier and learning a fair one. Both target fairness gerrymandering, in which a classifier appears fair on each pre-defined group while violating the fairness constraint on structured subgroups.

Statistical and Individual Fairness

Statistical fairness notions enforce approximate parity of a chosen statistic (false positive rate, false negative rate, or positive classification rate) across a small number of pre-defined demographic groups (e.g., groups defined by race or gender). These definitions are attractive because they make no assumptions about the population distribution, but they are susceptible to fairness gerrymandering: a classifier can appear fair on each protected group while violating fairness on particular combinations of protected attribute values. For example, a classifier can satisfy parity with respect to each protected attribute individually while discriminating on a combination of attributes. The paper's subgroup notion is intended to interpolate between such statistical definitions and individual notions of fairness.
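
The gerrymandering effect described above can be reproduced on a toy population. The following is a minimal sketch, assuming two binary protected attributes that are uniform and independent, and a hypothetical classifier that labels a person positive only for two of the four attribute combinations; the attribute values and the function names are illustrative choices, not taken from the paper's experiments.

```python
from itertools import product

# Toy population (assumed): one person per combination of two binary
# protected attributes, i.e. uniform and independent attributes.
population = list(product(["black", "white"], ["male", "female"]))

def classifier(race, gender):
    """Hypothetical rule: positive iff (black, male) or (white, female)."""
    return int((race == "black") == (gender == "male"))

def positive_rate(group):
    """Fraction of the group that the classifier labels positive."""
    return sum(classifier(r, g) for r, g in group) / len(group)

# Positive rate on each group defined by a single protected attribute.
marginal_rates = {}
for value in ["black", "white"]:
    marginal_rates[value] = positive_rate([p for p in population if p[0] == value])
for value in ["male", "female"]:
    marginal_rates[value] = positive_rate([p for p in population if p[1] == value])

# Positive rate on each subgroup defined by both attributes together.
subgroup_rates = {p: positive_rate([p]) for p in population}

print(marginal_rates)   # every single-attribute group has rate 0.5
print(subgroup_rates)   # each two-attribute subgroup has rate 0.0 or 1.0
```

Every single-attribute group sees the same positive rate, so a statistical parity check over race alone or gender alone passes, while the conjunction subgroups are treated in opposite ways.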

Figure 1: Evolution of the error and unfairness of Learner's classifier across iterations, for varying choices of $\gamma$. (a) Error of Learner's model vs. iteration $t$. (b) Unfairness $\gamma_t$ of the subgroup found by Auditor vs. iteration $t$, as measured by the paper's false-positive subgroup fairness definition.

Subgroup Fairness Challenge

The paper proposes imposing fairness constraints on exponentially (or infinitely) many structured subgroups, defined by a class of functions over the protected attributes, rather than only on a few pre-defined demographic groups (e.g., based on race or gender). It formulates the search for the best subgroup-fair classifier, over large classes of decision rules, as a two-player zero-sum game between a Learner, who chooses a classifier, and an Auditor, who searches for a subgroup on which that classifier is unfair.

The computational problem of auditing a classifier for subgroup fairness is proven equivalent to weak agnostic learning of the family of group indicator functions. Auditing is therefore computationally hard in the worst case, even for simple natural classes such as Boolean conjunctions and linear threshold functions. The equivalence also cuts the other way: heuristics that work well for standard learning problems can serve as auditing heuristics, which makes approximately fair classifiers attainable on real datasets.
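
To make the link between auditing and learning concrete, here is a minimal sketch of a regression-based audit for statistical parity over linear-threshold subgroups. It is an illustration under stated assumptions, not the paper's implementation: the function name `audit_statistical_parity`, the synthetic data, and the violation score (subgroup mass times the gap between the subgroup's positive rate and the overall positive rate) are all choices made here for the example.

```python
import numpy as np

def audit_statistical_parity(protected, decisions):
    """Heuristically search linear-threshold subgroups for a parity violation.

    protected: (n, d) array of protected attributes.
    decisions: (n,) array of 0/1 classifier outputs.
    Returns (mask, violation), where violation = alpha * beta,
    alpha is the subgroup's share of the data and beta is the gap between
    the subgroup's positive rate and the overall positive rate (assumed score).
    """
    n = len(decisions)
    base_rate = decisions.mean()
    # Least-squares fit of the centered decisions on the protected attributes
    # (linear regression used as a heuristic learner).
    X = np.column_stack([protected, np.ones(n)])
    w, *_ = np.linalg.lstsq(X, decisions - base_rate, rcond=None)
    scores = X @ w
    best_mask, best_violation = np.zeros(n, dtype=bool), 0.0
    # Try both signs: subgroups the classifier favors and disfavors.
    for mask in (scores > 0, scores < 0):
        if not mask.any():
            continue
        alpha = mask.mean()
        beta = abs(decisions[mask].mean() - base_rate)
        if alpha * beta > best_violation:
            best_mask, best_violation = mask, alpha * beta
    return best_mask, best_violation

# Synthetic data, made up for illustration only.
rng = np.random.default_rng(0)
protected = rng.normal(size=(2000, 2))
biased = (protected[:, 0] + protected[:, 1] > 0).astype(float)  # depends on attributes
fair = rng.integers(0, 2, size=2000).astype(float)              # independent coin flips

_, violation_biased = audit_statistical_parity(protected, biased)
_, violation_fair = audit_statistical_parity(protected, fair)
print(violation_biased, violation_fair)
```

The audit reduces to a single regression on the protected attributes, so any off-the-shelf learner could be swapped in for the least-squares step; the worst-case hardness result says only that no such heuristic can succeed on every instance.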

Algorithm Development

The paper develops two algorithms for achieving subgroup fairness:

Follow the Perturbed Leader (FTPL):
- The Learner plays the no-regret FTPL algorithm.
- The Auditor best-responds to the Learner's mixed strategy.
- The algorithm provably converges within a polynomial number of iterations, assuming access to cost-sensitive classification (CSC) oracles.

Fictitious Play Algorithm:
- Solves the Fair ERM problem by having the Learner and the Auditor each best-respond to the empirical distribution of the other's play over previous rounds.
- Has only asymptotic convergence guarantees, but is more practical because of its simplicity and faster per-step computation.
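
The best-response dynamic behind Fictitious Play can be sketched on a small zero-sum matrix game. This is a toy illustration only: the payoff matrix, the function name `fictitious_play`, and the round count are assumptions for the example, and the paper's Learner and Auditor compute their best responses with learning oracles rather than by scanning a matrix.

```python
import numpy as np

def fictitious_play(payoff, rounds=5000):
    """Fictitious play on a zero-sum matrix game.

    payoff[i, j] is what the row player (a minimizer, standing in for the
    Learner) pays the column player (a maximizer, standing in for the Auditor).
    Returns the empirical mixed strategies of both players.
    """
    n_rows, n_cols = payoff.shape
    row_counts = np.zeros(n_rows)
    col_counts = np.zeros(n_cols)
    row_counts[0] = col_counts[0] = 1  # arbitrary initial plays
    for _ in range(rounds):
        # Each player best-responds to the opponent's empirical play so far.
        row_move = np.argmin(payoff @ col_counts)
        col_move = np.argmax(row_counts @ payoff)
        row_counts[row_move] += 1
        col_counts[col_move] += 1
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()

# Matching pennies: the unique equilibrium is uniform play, with value 0.
payoff = np.array([[1.0, -1.0], [-1.0, 1.0]])
row_strategy, col_strategy = fictitious_play(payoff)
value = row_strategy @ payoff @ col_strategy
print(row_strategy, col_strategy, value)
```

Each round costs one best response per player, which is why the per-step computation is cheap; the price is that the empirical strategies approach equilibrium only in the limit.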

Empirical Evaluation

The simpler Fictitious Play algorithm is implemented, with linear regression as a heuristic oracle, and evaluated on the Communities and Crime dataset (Figure 2).

In these experiments, Fictitious Play was computationally effective on real data. Its iterates yield a collection of randomized classifiers that trace a Pareto frontier of error-unfairness trade-offs (Figure 2). The paper also suggests that classifiers which pass one-dimensional statistical fairness tests but fail richer multi-dimensional subgroup tests may be common in real datasets.
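
A Pareto frontier of this kind can be extracted from a set of (error, unfairness) pairs with a simple sweep. The sketch below is illustrative: the function name `pareto_frontier` and the numeric values are made up for the example and are not results from the paper.

```python
def pareto_frontier(points):
    """Return the (error, unfairness) pairs not dominated by any other pair.

    A pair is dominated if another pair is no worse in both coordinates
    and strictly better in at least one. Lower is better for both.
    """
    frontier = []
    best_unfairness = float("inf")
    # Sort by error (ties broken by unfairness), then sweep left to right,
    # keeping a pair only if it improves on the best unfairness seen so far.
    for error, unfairness in sorted(set(points)):
        if unfairness < best_unfairness:
            frontier.append((error, unfairness))
            best_unfairness = unfairness
    return frontier

# Hypothetical (error, unfairness) values, made up for illustration only.
candidates = [(0.10, 0.030), (0.12, 0.020), (0.11, 0.035), (0.15, 0.005), (0.16, 0.010)]
print(pareto_frontier(candidates))
```

Here the pairs (0.11, 0.035) and (0.16, 0.010) are dropped because another candidate has both lower error and lower unfairness.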

Figure 2: (a) Pareto-optimal error-unfairness values, color-coded by varying values of the input parameter $\gamma$. (b) Aggregate Pareto frontier across all values of $\gamma$. Here the $\gamma$ values cover the same range but are sampled more densely to get a smoother frontier. See text for details.

Conclusion

The study offers a rigorous approach to fairness gerrymandering in machine learning. By casting subgroup fairness as a two-player zero-sum game between a Learner and an Auditor, it both characterizes the theoretical and computational challenges and provides practical methods for achieving subgroup fairness. The effectiveness of these methods on real data suggests they can be applied across a range of domains.
