
Kernelized Normalizing Constant Estimation: Bridging Bayesian Quadrature and Bayesian Optimization

Published 11 Jan 2024 in cs.LG and stat.ML | arXiv:2401.05716v1

Abstract: In this paper, we study the problem of estimating the normalizing constant $\int e^{-\lambda f(x)}\,dx$ through queries to the black-box function $f$, where $f$ belongs to a reproducing kernel Hilbert space (RKHS), and $\lambda$ is a problem parameter. We show that to estimate the normalizing constant within a small relative error, the level of difficulty depends on the value of $\lambda$: When $\lambda$ approaches zero, the problem is similar to Bayesian quadrature (BQ), while when $\lambda$ approaches infinity, the problem is similar to Bayesian optimization (BO). More generally, the problem varies between BQ and BO. We find that this pattern holds true even when the function evaluations are noisy, bringing new aspects to this topic. Our findings are supported by both algorithm-independent lower bounds and algorithmic upper bounds, as well as simulation studies conducted on a variety of benchmark functions.
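To make the estimated quantity concrete, the sketch below illustrates a simple plug-in approach (not the paper's algorithm): query a black-box $f$ at a small budget of points, fit a Gaussian-process surrogate with an RBF kernel (the RKHS assumption), and integrate $e^{-\lambda \mu(x)}$ using the posterior mean $\mu$. The toy objective, lengthscale, and grid are illustrative assumptions, not choices taken from the paper.

```python
# Minimal sketch: plug-in estimation of Z(lambda) = \int exp(-lambda * f(x)) dx
# from a few black-box queries, via a GP surrogate. Illustrative only.
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    # Squared-exponential kernel k(a, b) = exp(-|a - b|^2 / (2 l^2)) on 1-D inputs.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior_mean(x_train, y_train, x_query, noise=1e-6):
    # Standard GP regression posterior mean; small jitter for numerical stability.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_star = rbf_kernel(x_query, x_train)
    return k_star @ np.linalg.solve(K, y_train)

def estimate_normalizing_constant(f, lam, x_train, grid):
    # Plug-in estimate: replace f by the posterior mean mu, then integrate
    # exp(-lam * mu(x)) over the grid with the trapezoid rule.
    y_train = f(x_train)                      # noiseless queries to the black box
    mu = gp_posterior_mean(x_train, y_train, grid)
    return np.trapz(np.exp(-lam * mu), grid)

if __name__ == "__main__":
    f = lambda x: np.sin(3 * x) + x ** 2      # hypothetical toy objective on [0, 1]
    grid = np.linspace(0.0, 1.0, 2000)
    x_train = np.linspace(0.0, 1.0, 15)       # a small query budget
    for lam in (0.1, 1.0, 10.0):              # small lambda: BQ-like; large lambda: BO-like
        z_hat = estimate_normalizing_constant(f, lam, x_train, grid)
        z_true = np.trapz(np.exp(-lam * f(grid)), grid)
        print(f"lambda={lam:5.1f}  Z_hat={z_hat:.6f}  Z_true={z_true:.6f}")
```

As $\lambda$ grows, $e^{-\lambda f(x)}$ concentrates near the minimizer of $f$, so accurate estimation increasingly requires locating that minimizer (the BO-like regime); for small $\lambda$ the integrand is nearly flat and the task resembles standard quadrature (the BQ-like regime).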

