On the limitation of evaluating machine unlearning using only a single training seed
Abstract
Machine unlearning (MU) aims to remove the influence of certain data points from a trained model without costly retraining. Most practical MU algorithms are only approximate, so their performance can only be assessed empirically; care must therefore be taken to make empirical comparisons as representative as possible. A common practice is to run the MU algorithm multiple times independently, starting from the same trained model. In this work, we demonstrate that this practice can give highly non-representative results, because some MU methods are highly sensitive to the random seed used for model training, even for the same architecture and the same dataset. We therefore recommend that empirical comparisons of MU algorithms also reflect the variability across different model training seeds.
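
To make the recommended protocol concrete, the following is a minimal sketch of evaluating an MU method over both training seeds and unlearning seeds, rather than only the latter. All function names (train_model, unlearn, evaluate) are hypothetical stubs standing in for a real training and unlearning pipeline, not the paper's implementation.

```python
import random
import statistics

# Hypothetical stubs standing in for a real pipeline (assumptions, not an
# actual MU implementation): training, unlearning, and a scalar metric.
def train_model(data, seed):
    random.seed(seed)
    return {"weights": random.random()}

def unlearn(model, forget_set, seed):
    random.seed(seed)
    return {"weights": model["weights"] + random.gauss(0.0, 0.1)}

def evaluate(model, forget_set):
    return model["weights"]

def evaluate_mu(train_seeds, unlearn_seeds, data, forget_set):
    """Score an MU method across BOTH training seeds and unlearning seeds."""
    scores = []
    for t_seed in train_seeds:          # vary the model-training seed ...
        model = train_model(data, seed=t_seed)
        for u_seed in unlearn_seeds:    # ... not only the unlearning seed
            unlearned = unlearn(model, forget_set, seed=u_seed)
            scores.append(evaluate(unlearned, forget_set))
    return statistics.mean(scores), statistics.stdev(scores)

# Common practice: a single fixed training seed (risks non-representative results).
print(evaluate_mu([0], range(10), data=None, forget_set=None))

# Recommended: report variability across several training seeds as well.
print(evaluate_mu(range(5), range(10), data=None, forget_set=None))
```

The reported standard deviation in the single-seed call reflects only the unlearning algorithm's own randomness; the multi-seed call additionally captures the training-seed sensitivity that this work identifies.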