Probing the Transition to Dataset-Level Privacy in ML Models Using an Output-Specific and Data-Resolved Privacy Profile
Abstract: Differential privacy (DP) is the prevailing technique for protecting user data in machine learning models. However, this framework has two deficits: a lack of clarity in selecting the privacy budget $\epsilon$, and a lack of quantification of the privacy leakage incurred by a particular data row under a particular trained model. We make progress on these limitations, and offer a new perspective for visualizing DP results, by studying a privacy metric that quantifies the extent to which a model trained on a dataset using a DP mechanism is ``covered'' by each of the distributions resulting from training on neighboring datasets. We connect this coverage metric to established results in the literature and use it to rank the privacy of individual samples from the training set in what we call a privacy profile. We additionally show that the privacy profile can be used to probe an observed transition to indistinguishability that occurs in the neighboring distributions as $\epsilon$ decreases, which we suggest as a tool enabling ML practitioners who wish to use DP to select $\epsilon$.
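The paper's mechanism and estimator are not reproduced here, but the coverage/privacy-profile idea can be sketched in a few lines. The sketch below is a minimal illustration under stated assumptions, not the paper's implementation: a Laplace-mechanism release of a bounded 1-D mean stands in for the DP-trained model, "coverage" is taken to be the density of the released output under each leave-one-out neighbor distribution, and all names (`release`, `coverage`), the sensitivity choice, and the leave-one-out notion of neighbor are illustrative.

```python
# Minimal sketch (illustrative, not the paper's method): coverage of a DP
# release under leave-one-out neighbor distributions, ranked into a profile.
import numpy as np
from scipy.stats import laplace

rng = np.random.default_rng(0)
data = rng.uniform(0.0, 1.0, size=50)   # dataset of bounded rows in [0, 1]
sensitivity = 1.0 / len(data)           # illustrative L1 sensitivity of the mean query
epsilon = 0.5                           # privacy budget to probe

def release(rows, eps):
    """DP release of the mean via the Laplace mechanism (stand-in for training)."""
    return np.mean(rows) + rng.laplace(scale=sensitivity / eps)

theta = release(data, epsilon)          # one output "trained" on the full dataset

def coverage(theta, rows, eps):
    """Density of the released output under the neighboring output
    distribution obtained by removing one row (leave-one-out)."""
    return laplace.pdf(theta, loc=np.mean(rows), scale=sensitivity / eps)

# Privacy profile: rank rows by how well the neighbor distribution that
# omits them still "covers" the released output. Low coverage means the
# row's removal shifts the output distribution noticeably: higher leakage.
neighbors = [np.delete(data, i) for i in range(len(data))]
profile = sorted(range(len(data)),
                 key=lambda i: coverage(theta, neighbors[i], epsilon))
print("least-covered (most exposed) rows:", profile[:5])
```

Decreasing `epsilon` in this sketch widens the Laplace noise until every neighbor density covers the released output almost equally, flattening the profile; that flattening is the transition to indistinguishability the abstract describes, and locating the $\epsilon$ at which it occurs is the kind of signal a practitioner could use when selecting a budget.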