
Differentially Private Learning Needs Better Features (or Much More Data)

Published 23 Nov 2020 in cs.LG, cs.CR, and stat.ML | (2011.11660v3)

Abstract: We demonstrate that differentially private machine learning has not yet reached its "AlexNet moment" on many canonical vision tasks: linear models trained on handcrafted features significantly outperform end-to-end deep neural networks for moderate privacy budgets. To exceed the performance of handcrafted features, we show that private learning requires either much more private data, or access to features learned on public data from a similar domain. Our work introduces simple yet strong baselines for differentially private learning that can inform the evaluation of future progress in this area.


Summary

  • The paper shows that handcrafted features outperform deep learning models under differential privacy constraints, achieving higher benchmark accuracies.
  • The study reveals that with moderate privacy budgets, advanced feature extraction mitigates the performance loss from noise in DP-SGD.
  • The authors highlight that significantly larger or augmented datasets are essential to bridge the gap with non-private learning methods.

Differentially Private Learning Needs Better Features (or Much More Data)

This paper, authored by Florian Tramèr and Dan Boneh, presents an insightful exploration of the current limitations of differentially private machine learning, particularly on canonical vision tasks. The central thesis posits that differentially private learning has not yet achieved performance parity with non-private methods—an outcome the paper metaphorically refers to as the "AlexNet moment," after the deep learning breakthrough on ImageNet classification. The findings reveal that, under moderate privacy budgets, linear models trained on handcrafted features currently outperform deep learning models trained end-to-end with differential privacy.

In their analysis, the authors underscore the formidable challenge of training deep neural networks with differential privacy (DP) without compromising utility. To bridge this performance gap for private learning systems, the paper suggests the need for either extensive private datasets or access to public datasets to enhance feature extraction. Through rigorous experimentation, the research establishes that leveraging advanced handcrafted feature extractors like Scattering Networks can significantly enhance the accuracy of private models on datasets like MNIST, Fashion-MNIST, and CIFAR-10.
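To make the pipeline concrete, here is a minimal sketch of the "fixed handcrafted features, then a linear model" structure the paper advocates. The feature map below (average pooling plus gradient magnitudes) is a hypothetical stand-in chosen for self-containedness, not the Scattering Network features the authors actually use:

```python
import numpy as np

def handcrafted_features(images):
    """Toy stand-in for a Scattering Network feature extractor:
    2x2 average pooling plus mean horizontal/vertical gradient
    magnitudes. (Illustrative choice, not the paper's features.)"""
    n, h, w = images.shape
    pooled = images.reshape(n, h // 2, 2, w // 2, 2).mean(axis=(2, 4))
    gx = np.abs(np.diff(images, axis=2)).mean(axis=(1, 2))
    gy = np.abs(np.diff(images, axis=1)).mean(axis=(1, 2))
    flat = pooled.reshape(n, -1)
    return np.concatenate([flat, gx[:, None], gy[:, None]], axis=1)

# The features are computed once, without touching private labels;
# only the linear classifier on top is trained with DP-SGD.
rng = np.random.default_rng(0)
images = rng.random((16, 28, 28))  # MNIST-sized toy batch
feats = handcrafted_features(images)
print(feats.shape)  # (16, 198): 14*14 pooled values + 2 gradient stats
```

The key design point is that the feature extractor is data-independent, so it consumes no privacy budget; all the noise of private training is absorbed by a low-dimensional linear layer.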

Key contributions of the study include establishing robust baselines using handcrafted features that improve differential privacy's privacy-utility tradeoff. Notably, the research achieves a marked improvement over existing benchmarks, offering models that exceed previous results for given privacy budgets. For instance, their method surpasses the CIFAR-10 accuracy of existing end-to-end privately trained CNNs by significant margins, reaching that level of accuracy at a privacy guarantee of ε = 2.6 where prior work required ε = 7.53.

Moreover, the work explores why models built on handcrafted features outperform end-to-end deep learning under privacy constraints. The authors show that standard DP-SGD copes poorly with the noise injected for privacy preservation in high-dimensional models such as deep networks, which limits the settings in which deep models remain competitive under small privacy budgets. Their empirical observations suggest this constraint can be partly surmounted by higher learning rates that speed convergence, thereby reducing the noise accumulated over DP-SGD's sequential updates.

A particularly engaging aspect of the research is the exploration of "what it takes" for differentially private deep learning to overcome the baseline set by handcrafted features, specifically through improvements in data quantity and quality. Through experiments with augmented datasets and transfer learning, the authors demonstrate that surpassing this baseline requires substantially more private data, or considerable investment in pre-training on non-sensitive, publicly available data from a similar domain.

The theoretical and practical implications of these findings are significant. From a theoretical standpoint, this research motivates continued exploration of privacy-preserving mechanisms that do not sacrifice model utility. It suggests potential avenues using transfer learning and data augmentation as strategies for mitigating privacy constraints in real-world deployments. From a practical perspective, differentially private learning's current reliance on abundant data presents a barrier to applications constrained by strict data availability or collection ethics, such as healthcare.

Future developments in the field might focus on engineering more resource-efficient private learning algorithms, potentially integrating novel findings from optimization theory to improve convergence under privacy constraints. The possibility of crafting localized, domain-specific features that could balance the trade-off between learning generalizability and privacy preservation is another promising direction.

Overall, this paper serves as a comprehensive guide, presenting definitive benchmarks, dissecting intrinsic methodological challenges, and introducing potential strategies for advancing the performance envelope of differentially private learning within restrictive settings.
