Authors: Ioannis Caragiannis, Evi Micha, Nisarg Shah
Abstract: We revisit the recently developed framework of proportionally fair
clustering, where the goal is to provide group fairness guarantees that become
stronger for groups of data points (agents) that are large and cohesive. Prior
work applies this framework to centroid clustering, where the loss of an agent
is its distance to the centroid assigned to its cluster. We expand the
framework to non-centroid clustering, where the loss of an agent is a function
of the other agents in its cluster, by adapting two proportional fairness
criteria — the core and its relaxation, fully justified representation (FJR)
— to this setting.
We show that the core can be approximated only under structured loss
functions, and even then, the best approximation we are able to establish,
using an adaptation of the GreedyCapture algorithm developed for centroid
clustering [Chen et al., 2019; Micha and Shah, 2020], is unappealing for a
natural loss function. In contrast, we design a new (inefficient) algorithm,
GreedyCohesiveClustering, which achieves the relaxation FJR exactly under
arbitrary loss functions, and show that the efficient GreedyCapture algorithm
achieves a constant approximation of FJR. We also design an efficient auditing
algorithm, which estimates the FJR approximation of any given clustering
solution up to a constant factor. Our experiments on real data suggest that
traditional clustering algorithms are highly unfair, whereas GreedyCapture is
considerably fairer and incurs only a modest loss in common clustering
objectives.
Source: http://arxiv.org/abs/2410.23273v1