Enhancing Trade-offs in Privacy, Utility, and Computational Efficiency through MUltistage Sampling Technique (MUST)
Abstract: Applying a randomized algorithm to a subset rather than the entire dataset amplifies privacy guarantees. We propose a class of subsampling methods, the ``MUltistage Sampling Technique (MUST)'', for privacy amplification (PA) in the context of differential privacy (DP). We conduct comprehensive analyses of the PA effects and utility of several 2-stage MUST procedures through newly introduced concepts, including strong vs. weak PA effects and the aligned privacy profile. We provide a privacy loss composition analysis over repeated applications of MUST via the Fourier accountant algorithm. Our theoretical and empirical results suggest that MUST offers stronger PA in $\epsilon$ than common one-stage sampling procedures, including Poisson sampling, sampling without replacement, and sampling with replacement, while the results on $\delta$ vary case by case. Our experiments show that MUST is non-inferior to one-stage subsampling methods in the utility and stability of privacy-preserving (PP) outputs at a similar privacy loss, while enhancing the computational efficiency of algorithms that require complex function calculations on distinct data points. MUST can be seamlessly integrated into stochastic optimization algorithms or procedures that involve parallel or simultaneous subsampling when DP guarantees are necessary.
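To make the idea of multistage subsampling concrete, the sketch below composes two subsampling stages before applying a Gaussian mechanism to a clipped-mean query. The specific stage choices (sampling without replacement followed by Poisson sampling) and all function names are illustrative assumptions for this sketch; the paper analyzes several 2-stage MUST variants, and this is not a reproduction of any particular one.

```python
import numpy as np

def two_stage_subsample(data, m, q, rng):
    """Illustrative two-stage subsample.

    Stage 1: draw m records without replacement from the full dataset.
    Stage 2: Poisson-subsample the stage-1 subset, keeping each record
    independently with probability q.

    The stage combination here is an assumption for illustration; the
    paper studies several 2-stage MUST variants.
    """
    stage1 = rng.choice(data, size=m, replace=False)
    keep = rng.random(m) < q  # independent Bernoulli(q) per record
    return stage1[keep]

def gaussian_mean(subset, clip, sigma, n, rng):
    """Clipped-mean query released via the Gaussian mechanism.

    Each record's contribution is clipped to [-clip, clip], so the
    L2 sensitivity of the sum is `clip`; Gaussian noise with standard
    deviation sigma * clip is added before normalizing by n.
    """
    total = np.clip(subset, -clip, clip).sum()
    return float((total + rng.normal(0.0, sigma * clip)) / n)

rng = np.random.default_rng(0)
data = rng.normal(size=1000)

# Two-stage subsample, then a noisy clipped mean on the surviving records.
sub = two_stage_subsample(data, m=200, q=0.5, rng=rng)
est = gaussian_mean(sub, clip=3.0, sigma=1.0, n=len(data), rng=rng)
```

Because the randomized mechanism only ever sees the doubly subsampled records, its privacy guarantee is amplified relative to running the same mechanism on the full dataset; quantifying that amplification for such compositions is the subject of the paper.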