Redefining Machine Unlearning: A Conformal Prediction-Motivated Approach

Authors: Yingdan Shi, Ren Wang

Abstract: Machine unlearning seeks to systematically remove specified data from a
trained model, effectively achieving a state as though the data had never been
encountered during training. While metrics such as Unlearning Accuracy (UA) and
Membership Inference Attack (MIA) provide a baseline for assessing unlearning
performance, they fall short of evaluating the completeness and reliability of
forgetting. This is because the ground-truth label can remain a plausible candidate under uncertainty quantification, leaving gaps in the evaluation
of true forgetting. In this paper, we identify critical limitations in existing
unlearning metrics and propose enhanced evaluation metrics inspired by
conformal prediction. Our metrics can effectively capture the extent to which
ground truth labels are excluded from the prediction set. Furthermore, we
observe that many existing machine unlearning methods do not achieve
satisfactory forgetting performance when evaluated with our new metrics. To
address this, we propose an unlearning framework that integrates conformal
prediction insights into the Carlini & Wagner adversarial attack loss. Extensive
experiments on image classification tasks demonstrate that our enhanced
metrics offer deeper insights into unlearning effectiveness, and that our
unlearning framework significantly improves the forgetting quality of
unlearning methods.

Source: http://arxiv.org/abs/2501.19403v1
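
The abstract describes metrics that check whether the ground-truth label is excluded from a conformal prediction set, but it does not give the exact formulation. The sketch below is an illustration of that idea using standard split conformal prediction with the common "1 minus true-class probability" nonconformity score; the function names (conformal_prediction_set, ground_truth_coverage) and the choice of score are assumptions for illustration, not the paper's definitions.

```python
import numpy as np

def conformal_prediction_set(probs, cal_probs, cal_labels, alpha=0.1):
    """Split-conformal prediction sets from softmax outputs (illustrative).

    probs:      (n, K) softmax outputs on the evaluation (e.g., forget) set
    cal_probs:  (m, K) softmax outputs on a held-out calibration set
    cal_labels: (m,)   integer ground-truth labels of the calibration set
    alpha:      target miscoverage level
    """
    m = len(cal_labels)
    # Nonconformity score: 1 - probability assigned to the true class.
    cal_scores = 1.0 - cal_probs[np.arange(m), cal_labels]
    # Conformal quantile with the usual finite-sample correction.
    q_level = min(np.ceil((m + 1) * (1 - alpha)) / m, 1.0)
    q_hat = np.quantile(cal_scores, q_level, method="higher")
    # A class enters the set when its score is within the threshold.
    return probs >= 1.0 - q_hat  # boolean mask of shape (n, K)

def ground_truth_coverage(probs, labels, cal_probs, cal_labels, alpha=0.1):
    """Fraction of samples whose ground-truth label still appears in the
    conformal prediction set. On the forget set, lower suggests more
    complete forgetting (illustrative metric, not the paper's exact one)."""
    sets = conformal_prediction_set(probs, cal_probs, cal_labels, alpha)
    return sets[np.arange(len(labels)), labels].mean()
```

Similarly, the proposed framework is said to build on the Carlini & Wagner adversarial attack loss. A minimal sketch of a C&W-style margin term on forget-set samples is shown below; the exact objective used in the paper may differ, and cw_forget_loss and the margin kappa are assumed names and parameters.

```python
import torch

def cw_forget_loss(logits, labels, kappa=0.0):
    """C&W-style margin loss on forget-set samples: penalizes cases where the
    ground-truth logit exceeds the best competing logit by more than -kappa,
    pushing the true class out of the model's top predictions."""
    true_logit = logits.gather(1, labels.unsqueeze(1)).squeeze(1)
    # Mask out the true class, then take the strongest competing logit.
    masked = logits.clone()
    masked.scatter_(1, labels.unsqueeze(1), float("-inf"))
    best_other = masked.max(dim=1).values
    return torch.clamp(true_logit - best_other + kappa, min=0.0).mean()
```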
