- The paper introduces MAGIC, a novel data attribution method that uses metagradients to predict changes in model behavior with near-optimal accuracy.
- The methodology uses the Replay algorithm to compute exact metagradients in non-convex settings, overcoming limitations of traditional influence functions.
- Evaluations in vision and language tasks demonstrate MAGIC's robust performance, providing strong correlations with true model behavior.
"MAGIC: Near-Optimal Data Attribution for Deep Learning" (2504.16430)
Introduction
The paper "MAGIC: Near-Optimal Data Attribution for Deep Learning" addresses predictive data attribution in large-scale non-convex settings, a longstanding problem in machine learning. Predictive data attribution asks how the addition or removal of training datapoints affects model predictions. While this problem is relatively tractable in convex settings through techniques like the infinitesimal jackknife, it becomes challenging in large-scale, non-convex environments such as deep learning, where existing methods tend to produce estimates that only weakly correlate with the ground truth. This paper introduces a novel data attribution method, MAGIC, which combines classical influence functions with recent advances in metadifferentiation to provide near-optimal estimates of the effect of training data on model predictions.
Contributions
The authors present two key contributions. First, they propose a new perspective on data attribution by introducing the "single-model" data attribution setting. Unlike traditional formulations that predict how a learning algorithm would behave on average if trained on a different dataset, single-model attribution predicts how one specific model would behave if trained on different data. Because the sources of randomness in large-scale training (e.g., initialization and batch ordering) are held fixed, the trained model becomes a deterministic function of its training data, so its behavior can in principle be predicted exactly.
Second, the paper introduces a new data attribution method named MAGIC (Metagradient-based Attribution via Ground-truth Influence Computation). MAGIC leverages recent advances in metagradient computation to compute the influence function exactly for large-scale learning algorithms. This method accurately estimates how model predictions respond to changes in the training dataset, achieving significantly better performance than existing methods. For instance, MAGIC almost perfectly predicts a model's loss when random subsets of training data are removed, attaining a high Spearman correlation with the ground truth.
Methodology
The MAGIC method is built upon the concept of influence functions, which approximate the effect of training data changes using a first-order Taylor expansion in per-example data weights. The key challenge in non-convex settings is computing the gradient of the final model through the entire training process; MAGIC addresses this by employing the Replay algorithm, which calculates the exact metagradient for iterative, smooth learning algorithms. By explicitly computing these metagradients, MAGIC accurately predicts how a model's output changes in response to training data modifications.
This approach is applied to large-scale non-convex models, such as deep neural networks, where the influence function is difficult to compute because no closed-form solution exists. The Replay algorithm exploits the iterative nature of most learning algorithms, backpropagating through the training trajectory step by step and accumulating the sensitivity of the final model to each example's data weight.
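A minimal, hypothetical sketch of this Replay-style iteration for plain gradient descent on weighted least squares is shown below: the forward pass records each iterate, and the backward pass replays them in reverse, accumulating the exact derivative of the final test loss with respect to each example's data weight. Function names and the toy learner are assumptions for illustration; the paper's algorithm applies to general smooth iterative training.

```python
import numpy as np

def train_with_trace(w, X, y, lr=0.1, steps=300):
    """Gradient descent on weighted least squares, recording every iterate."""
    theta, trace = np.zeros(X.shape[1]), []
    for _ in range(steps):
        trace.append(theta)
        resid = X @ theta - y
        theta = theta - lr * X.T @ (w * resid) / len(y)
    return theta, trace

def replay_metagradient(w, X, y, x_test, y_test, lr=0.1, steps=300):
    """Exact d(test loss)/d(data weights) via reverse-mode replay of the iterates."""
    n = len(y)
    theta, trace = train_with_trace(w, X, y, lr, steps)
    dtheta = (x_test @ theta - y_test) * x_test      # grad of 0.5*(x_test@theta - y_test)^2
    dw = np.zeros(n)
    for theta_t in reversed(trace):                  # replay training in reverse
        resid = X @ theta_t - y
        Xd = X @ dtheta
        dw += -(lr / n) * resid * Xd                 # contribution of step t's update to dw
        dtheta = dtheta - (lr / n) * X.T @ (w * Xd)  # backprop through the GD step
    return dw
```

Here the iterates are stored in memory; at deep-learning scale the same reverse pass is made tractable by recomputing (replaying) training segments from checkpoints instead.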
Evaluation
The efficacy of MAGIC is evaluated across several domains, including computer vision and language modeling. The evaluation demonstrates that MAGIC consistently provides near-perfect predictions of model output changes, as measured by the Linear Datamodeling Score (LDS). The method significantly outperforms existing techniques, which often exhibit only weak correlations with true model losses. Notably, MAGIC maintains high accuracy even when only small subsets of the training data are removed, illustrating its robustness.
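The LDS metric itself is simple to compute, as the hedged sketch below shows: per-example attribution scores are summed over each held-out subset to form an additive prediction, and the LDS is the Spearman rank correlation between those predictions and the changes measured by actually retraining. The subset sizes, synthetic scores, and helper names are illustrative assumptions, not the paper's benchmark setup.

```python
import numpy as np

def spearman(a, b):
    """Spearman rank correlation via Pearson correlation of ranks (assumes no ties)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean(); rb -= rb.mean()
    return float(ra @ rb / np.sqrt((ra @ ra) * (rb @ rb)))

def lds(scores, subsets, actual_changes):
    """Linear Datamodeling Score: rank-correlate additive predictions with ground truth.

    scores[i] is the attributed effect of removing example i; the predicted
    change for a subset is the sum of its members' scores (linearity assumption).
    """
    predicted = np.array([scores[np.asarray(S)].sum() for S in subsets])
    return spearman(predicted, np.asarray(actual_changes))

# Toy demo: synthetic scores and noisy "retrained" ground-truth changes
rng = np.random.default_rng(0)
scores = rng.normal(size=50)
subsets = [rng.choice(50, size=10, replace=False) for _ in range(40)]
truth = [scores[S].sum() + 0.1 * rng.normal() for S in subsets]
demo_lds = lds(scores, subsets, truth)
```

An LDS near 1 means the attribution scores rank subsets by their true effect almost perfectly, which is the regime the paper reports for MAGIC.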
Discussion and Implications
The paper explores the implications of the single-model data attribution setting in comparison to standard predictive data attribution. It highlights that while single-model attribution offers a more precise prediction of how a specific model would behave under different training data, it is computationally more expensive. The researchers discuss the trade-off between computational cost and prediction accuracy, noting that MAGIC's cost grows with the number of test samples, since each scalar test output requires its own metagradient computation.
Conclusion
In conclusion, the paper proposes MAGIC as a novel and effective data attribution method that advances the state of the art in predicting model behavior in large-scale non-convex settings. By computing the exact influence of training data, MAGIC provides a robust tool for model analysis and can potentially enhance tasks such as machine unlearning and model debugging. The method stands out for its precision and its potential applications in improving model transparency and accountability.