
On the Validation of Gibbs Algorithms: Training Datasets, Test Datasets and their Aggregation

Published 21 Jun 2023 in cs.LG, cs.IT, math.IT, math.PR, math.ST, and stat.TH (arXiv:2306.12380v1)

Abstract: The dependence of the Gibbs algorithm (GA) on its training data is analytically characterized. Adopting the expected empirical risk as the performance metric, the sensitivity of the GA is obtained in closed form, where sensitivity is the performance difference with respect to an arbitrary alternative algorithm. This characterization yields explicit expressions relating the training errors and test errors of GAs trained with different datasets. Using these tools, dataset aggregation is studied and several figures of merit for evaluating the generalization capabilities of GAs are introduced. For particular dataset sizes and GA parameters, a connection between Jeffrey's divergence and the training and test errors is established.
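
For orientation, the following is a minimal sketch of the standard objects the abstract refers to, written in notation assumed here for illustration rather than taken from the paper: a training dataset z = (z_1, ..., z_n), a loss function \ell, a reference (prior) measure Q on the set of models, and a regularization parameter \lambda > 0.

% Empirical risk of a model \theta on the dataset z (assumed notation).
\mathsf{L}_{z}(\theta) \triangleq \frac{1}{n} \sum_{i=1}^{n} \ell(\theta, z_{i})

% The GA samples a model from the Gibbs (posterior) measure, which tilts
% the reference measure Q by the exponentiated empirical risk.
P_{\Theta \mid Z = z}(\mathrm{d}\theta)
  = \frac{\exp\bigl(-\tfrac{1}{\lambda} \mathsf{L}_{z}(\theta)\bigr)}
         {\int \exp\bigl(-\tfrac{1}{\lambda} \mathsf{L}_{z}(\nu)\bigr) \, Q(\mathrm{d}\nu)}
    \, Q(\mathrm{d}\theta)

% Sensitivity in the sense of the abstract: the expected-empirical-risk gap
% between an arbitrary alternative algorithm P' and the GA. The symbol
% \mathsf{S} is hypothetical; the paper's exact definition may differ.
\mathsf{S}_{z}(P') \triangleq \int \mathsf{L}_{z}(\theta) \, P'(\mathrm{d}\theta)
  - \int \mathsf{L}_{z}(\theta) \, P_{\Theta \mid Z = z}(\mathrm{d}\theta)

% Jeffrey's divergence is the symmetrized relative entropy.
D_{\mathrm{J}}(P \,\|\, Q) \triangleq D_{\mathrm{KL}}(P \,\|\, Q) + D_{\mathrm{KL}}(Q \,\|\, P)

Under these assumed definitions, the results announced in the abstract relate such integrals of \mathsf{L}_{z} across Gibbs measures induced by different training and test datasets.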
