Abstract
The most prevalent notions of fairness in machine learning are statistical
definitions: they fix a small collection of pre-defined groups, and then ask
for parity of some statistic of the classifier across these groups. Constraints
of this form are susceptible to intentional or inadvertent "fairness
gerrymandering", in which a classifier appears to be fair on each individual
group, but badly violates the fairness constraint on one or more structured
subgroups defined over the protected attributes. We propose instead to demand
statistical notions of fairness across exponentially (or infinitely) many
subgroups, defined by a structured class of functions over the protected
attributes. This interpolates between statistical definitions of fairness and
recently proposed individual notions of fairness, but raises several
computational challenges. It is no longer clear how to audit a fixed classifier
to see if it satisfies such a strong definition of fairness. We prove that the
computational problem of auditing subgroup fairness for both equality of false
positive rates and statistical parity is equivalent to the problem of weak
agnostic learning, which means it is computationally hard in the worst case,
even for simple structured subclasses.
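For concreteness, the false-positive-rate version of the constraint can be written roughly as follows, where the notation is illustrative: $D$ is the fixed classifier, $\mathcal{G}$ is the class of subgroup indicator functions over the protected attributes, and $\gamma$ is an allowed slack (the formal definition in the body of the paper also accounts for the probability mass of each subgroup):

\[
\bigl|\,\Pr[D(x)=1 \mid y=0] \;-\; \Pr[D(x)=1 \mid y=0,\ g(x)=1]\,\bigr| \;\le\; \gamma
\qquad \text{for all } g \in \mathcal{G}.
\]

Auditing a fixed $D$ then amounts to searching $\mathcal{G}$ for a subgroup $g$ that violates this inequality, and it is this search problem that is related above to weak agnostic learning over $\mathcal{G}$.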
We then derive two algorithms that provably converge to the best fair
classifier, given access to oracles which can solve the agnostic learning
problem. The algorithms are based on a formulation of subgroup fairness as a
two-player zero-sum game between a Learner and an Auditor. Our first algorithm
provably converges in a polynomial number of steps. Our second algorithm is
guaranteed to converge only asymptotically, but has the merit of simplicity and
faster per-step computation. We implement the simpler algorithm using linear
regression as a heuristic oracle, and show that we can effectively both audit
and learn fair classifiers on real datasets.
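As a rough illustration of the auditing heuristic (this sketch is ours, not the experimental code; the function name audit_fpr_subgroup and the exact relabeling scheme are assumptions), one can restrict attention to the negative examples, regress the audited classifier's predictions on the protected attributes, and threshold the fitted scores to propose a candidate subgroup whose false positive rate deviates from the overall one:

```python
# Illustrative sketch: heuristic auditing of a fixed classifier for
# false-positive-rate subgroup disparities, using linear regression over the
# protected attributes as a stand-in for the agnostic-learning oracle.
import numpy as np
from sklearn.linear_model import LinearRegression


def audit_fpr_subgroup(protected, y_true, y_pred):
    """Search for a linearly-defined subgroup of the protected attributes
    whose false positive rate deviates from the overall false positive rate.

    protected : (n, d) array of protected attributes
    y_true    : (n,) true binary labels
    y_pred    : (n,) binary predictions of the classifier being audited
    """
    protected = np.asarray(protected, dtype=float)
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)

    # False positive rates only involve the true negatives.
    neg = (y_true == 0)
    z, d_hat = protected[neg], y_pred[neg]
    overall_fpr = d_hat.mean()

    # Regress the deviation of each prediction from the overall false positive
    # rate onto the protected attributes; positive fitted scores suggest a
    # subgroup that is flagged more often than the base rate.
    reg = LinearRegression().fit(z, d_hat - overall_fpr)
    in_group = reg.predict(z) > 0
    if in_group.sum() == 0:
        return None, 0.0

    group_fpr = d_hat[in_group].mean()
    # Weight the disparity by the subgroup's share of the negatives, so that
    # only subgroups that are both sizable and discrepant are flagged.
    disparity = in_group.mean() * abs(group_fpr - overall_fpr)
    return reg, disparity
```

The returned regression model defines the candidate subgroup, and the weighted disparity can be compared against a chosen threshold to decide whether the classifier passes the audit.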