
Coarse-to-Fine Bayesian Optimization

Updated 23 January 2026
  • Coarse-to-fine Bayesian optimization is a framework that separates broad, low-resolution exploration from focused, high-resolution refinement to efficiently minimize expensive black-box functions.
  • The methodology leverages multi-level surrogate models and adaptive ROI selection, enabling rapid elimination of unpromising regions while transferring informative priors to fine stages.
  • This approach has demonstrated accelerated convergence and reduced cumulative regret in applications like molecular design and hyperparameter tuning, even under tight evaluation budgets.

A coarse-to-fine Bayesian optimization (BO) framework is a family of methodologies designed to improve the efficiency and efficacy of BO in complex, high-dimensional, or resource-constrained scenarios by explicitly leveraging hierarchical, multi-resolution, or search-space-refining principles. These frameworks operate by exploiting "coarse" models, low-resolution representations, or search-space partitions to rapidly eliminate unpromising regions or extract informative priors, subsequently transitioning to "fine" models, higher-resolution latent spaces, or locally focused BO for precise, data-efficient optimization. This paradigm spans applications from molecular simulation and chemical design to black-box optimization of general functions.

1. Framework Definition and Problem Formulation

Coarse-to-fine Bayesian optimization is characterized by an explicit separation between an initial exploration phase—using approximations, relaxed constraints, or coarse-grained representations—and a refinement phase, where BO is concentrated within promising subdomains or at higher model fidelities. Prominent instantiations include:

  • Region-of-interest filtering via probabilistic superlevel sets (Zhang et al., 2023);
  • Multi-level hierarchical models where surrogate GPs are constructed at several coarse-to-fine resolutions, potentially with cross-level transfer (Walter et al., 7 May 2025);
  • Coarse-grained parameter refinement in simulation-based science, where a coarse model is tuned to emulate fine model observables with minimal loss (Ray et al., 6 Jan 2025);
  • Search-space division heuristics to reduce the effective dimensionality or volume before applying standard BO (Nomura et al., 2019).

The generic objective is to minimize a black-box function $f(x)$ (or a multi-objective vector $\mathbf{f}(x)$) over $x \in \mathcal{X}$, subject to a hard evaluation budget. For scientific applications, the fine model may correspond to all-atom molecular dynamics (MD), while the coarse model represents a computationally cheaper approximation.
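As a concrete framing of this setup, the budgeted black-box problem can be sketched as a thin wrapper that tracks evaluations against a hard cap. This is a minimal illustration; the `BudgetedObjective` class, the toy quadratic, and the budget value are hypothetical choices, not part of any cited framework:

```python
import numpy as np

class BudgetedObjective:
    """Wrap an expensive black-box f with a hard evaluation budget."""

    def __init__(self, f, budget):
        self.f = f
        self.budget = budget
        self.history = []  # list of (x, f(x)) pairs, in evaluation order

    def __call__(self, x):
        if len(self.history) >= self.budget:
            raise RuntimeError("evaluation budget exhausted")
        y = self.f(x)
        self.history.append((np.asarray(x), y))
        return y

    @property
    def best(self):
        # Incumbent: the evaluated point with the smallest objective value.
        return min(self.history, key=lambda pair: pair[1])

# Usage: any optimizer calls obj(x) and must stop once the budget is spent.
obj = BudgetedObjective(lambda x: float((x - 0.3) ** 2), budget=5)
```

Any coarse or fine stage then draws from the same shared budget, which is what makes aggressive coarse-stage pruning worthwhile.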

2. Hierarchical Models and Search-Space Decomposition

Coarse-to-fine approaches achieve computational gains by imposing a hierarchy on the modeling or search process:

  • Multi-resolution representations: In molecular design, discrete chemical spaces are mapped into multiple coarse-grained levels, each compressing atomistic detail (e.g., bead types in Martini models grouped into 15, 45, or 96 types) (Walter et al., 7 May 2025). Each level supports a surrogate model and induces a many-to-one mapping $M_l: \mathcal{X}_l \to \mathcal{X}_{l-1}$ between successively coarser spaces.
  • Latent-space embedding: Discrete objects at each resolution level are embedded via regularized autoencoders into smooth latent spaces, typically $\mathbb{R}^d$, where BO can be more effectively conducted (Walter et al., 7 May 2025).
  • Adaptive ROI selection: In continuous optimization, the dynamic restriction of the feasible space is achieved through adaptive superlevel-set filtering, defining ROIs where the optimum $x^*$ likely resides (Zhang et al., 2023).

This structure allows funneling of computational resources from broad, coarse exploration toward targeted, fine exploitation.
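The ROI-filtering idea can be illustrated on a one-dimensional grid. The following is a toy sketch of confidence-bound level-set filtering for minimization, not the exact procedure of Zhang et al. (2023); the posterior mean, standard deviation, and confidence multiplier below are hand-picked stand-ins rather than the output of a fitted GP:

```python
import numpy as np

grid = np.linspace(0.0, 1.0, 201)
mu = (grid - 0.7) ** 2                          # stand-in posterior mean
sigma = 0.05 + 0.1 * np.abs(np.sin(6 * grid))   # stand-in posterior std
beta = 2.0                                      # confidence multiplier

lcb = mu - beta * sigma
ucb = mu + beta * sigma

# For minimization: a point can plausibly contain the optimum only if its
# lower confidence bound does not exceed the best (smallest) upper
# confidence bound found anywhere on the grid.
threshold = ucb.min()
roi_mask = lcb <= threshold
roi = grid[roi_mask]
print(f"ROI keeps {roi_mask.mean():.0%} of the grid, "
      f"spanning [{roi.min():.2f}, {roi.max():.2f}]")
```

Points whose optimistic estimate is still worse than the best pessimistic estimate are discarded, so subsequent surrogate fitting and acquisition maximization run only on the retained region.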

3. Surrogate Modeling, Cross-Level Transfer, and Acquisition

All coarse-to-fine BO frameworks rely on Gaussian process (GP) surrogates to model the (potentially vector-valued) objective:

  • Multi-output GPs: Multi-objective cases are handled by multi-output GPs using linear models of coregionalization or similar constructions, with ARD squared-exponential kernels commonly used (Ray et al., 6 Jan 2025).
  • Hierarchical/delta-GPs: At each coarse-to-fine level $l$, the surrogate is constructed as $f_l(x) = f_{l-1}(M_l(x)) + \delta_l(x)$, with $\delta_l(x) \sim \mathrm{GP}(0, k_l)$, effectively learning corrections to lower-level predictions (Walter et al., 7 May 2025).
  • Parameter learning: Hyperparameters (lengthscales, noise variances) are optimized by marginal likelihood maximization per standard GP regression.

Acquisition is typically managed via expected improvement (EI) for single-objective cases, or multi-objective expected hypervolume improvement (EHVI) when Pareto optimality is sought. Acquisition maximization is confined to the region of interest or neighborhoods projected from lower-resolution optima (Zhang et al., 2023, Walter et al., 7 May 2025).
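The delta-GP construction can be sketched with a minimal numpy GP: the fine-level surrogate adds a residual GP on top of the coarse prediction. Everything here is an illustrative assumption rather than the cited implementation: the kernel hyperparameters are fixed (not learned by marginal likelihood), the toy objectives are invented, and the level map is the identity for simplicity:

```python
import numpy as np

def rbf(a, b, lengthscale=0.2):
    """Squared-exponential kernel (unit variance) on 1-D inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

class GP:
    """Minimal zero-mean GP regressor with fixed hyperparameters."""

    def __init__(self, noise=1e-4):
        self.noise = noise

    def fit(self, X, y):
        self.X = X
        K = rbf(X, X) + self.noise * np.eye(len(X))
        self.L = np.linalg.cholesky(K)
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y))
        return self

    def predict(self, Xs):
        Ks = rbf(self.X, Xs)
        mean = Ks.T @ self.alpha
        v = np.linalg.solve(self.L, Ks)
        var = np.clip(1.0 - np.sum(v * v, axis=0), 1e-12, None)
        return mean, np.sqrt(var)

# Hypothetical coarse and fine objectives: the fine level adds a small
# high-frequency correction to the coarse one.
f_coarse = lambda x: (x - 0.6) ** 2
f_fine = lambda x: f_coarse(x) + 0.05 * np.sin(12 * x)
M = lambda x: x  # many-to-one level map; identity here for illustration

# Coarse surrogate from many cheap evaluations.
Xc = np.linspace(0, 1, 15)
gp_c = GP().fit(Xc, f_coarse(Xc))

# Delta-GP: fit only the residual delta_l(x) = f_l(x) - f_{l-1}(M_l(x))
# on a handful of expensive fine evaluations.
Xf = np.linspace(0, 1, 5)
resid = f_fine(Xf) - gp_c.predict(M(Xf))[0]
gp_d = GP().fit(Xf, resid)

def predict_fine(Xs):
    """Fine-level prediction: coarse mean plus learned correction."""
    mc, _ = gp_c.predict(M(Xs))
    md, sd = gp_d.predict(Xs)
    return mc + md, sd
```

The point of the construction is data efficiency: the residual is smoother and smaller than the fine objective itself, so few expensive evaluations suffice to correct the cheap coarse surrogate.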

4. Optimization Loop and Workflow

A typical coarse-to-fine BO workflow follows a structured sequence:

  1. Initialization: Construct coarse representations, train embeddings or surrogates with initial designs (e.g., Latin hypercube or k-medoids selection) (Walter et al., 7 May 2025, Ray et al., 6 Jan 2025).
  2. Coarse Exploration:
    • Evaluate the objective or a prior at the coarsest level;
    • Identify promising candidate neighborhoods via the GP (ROI definition: $R_n$) (Zhang et al., 2023);
    • In heuristic partitioning, iteratively split the search space along each dimension, allocating a fixed fraction $\gamma$ of the total budget (Nomura et al., 2019).
  3. Transition Criteria: Ascend levels when uncertainty in predictions at the current level falls below a threshold for several consecutive steps; descend if the candidate is outside reliable regions (Walter et al., 7 May 2025).
  4. Fine Optimization: Restrict BO to the identified ROI or neighborhood, leveraging the refined surrogate and data transferred from the coarse phase.
  5. Batch Acquisition: For simulation-based science, evaluations proceed in small batches (e.g., $q=2$ per iteration), with surrogate refitting at every round (Ray et al., 6 Jan 2025).
  6. Termination: Optimization concludes when the budget is exhausted or the acquisition function value diminishes to near zero; non-dominated or optimal candidates are returned (Ray et al., 6 Jan 2025, Walter et al., 7 May 2025).
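The steps above can be compressed into a toy end-to-end loop. Random sampling stands in for the sequential acquisition step, and the objective, grid size, ROI width, and budget are all illustrative choices rather than values from the cited works:

```python
import numpy as np

rng = np.random.default_rng(1)
fine = lambda x: (x - 0.37) ** 2 + 0.02 * np.sin(30 * x)   # expensive target
coarse = lambda x: (x - 0.37) ** 2                          # cheap proxy

# Steps 1-2. Initialization + coarse exploration: a dense, cheap sweep.
grid = np.linspace(0, 1, 101)
center = grid[np.argmin(coarse(grid))]

# Steps 3-4. Transition + fine optimization, restricted to an ROI around
# the coarse minimizer.
lo, hi = max(0.0, center - 0.1), min(1.0, center + 0.1)
budget = 20
X = rng.uniform(lo, hi, size=budget)   # stand-in for sequential acquisition
y = fine(X)

# Step 6. Termination: return the incumbent once the budget is spent.
x_best, y_best = X[np.argmin(y)], y.min()
print(f"best x = {x_best:.3f}, f = {y_best:.4f}")
```

All twenty expensive evaluations land inside the ROI, which is the funneling behavior the workflow is designed to produce.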

5. Theoretical Guarantees and Empirical Performance

Theoretical analysis indicates that coarse-to-fine BO can yield strictly improved regret bounds relative to standard GP-UCB. Key elements include:

  • Regret bounds: By filtering the search space and restricting surrogate fitting to smaller regions, the cumulative regret scales as $\mathcal{O}\!\left(\sqrt{T \beta_T (\gamma_{g,T} + \gamma_{\ell,T})}\right)$, where $\gamma_{g,T}$ and $\gamma_{\ell,T}$ are the GP information gains on the global space and the ROI, respectively (Zhang et al., 2023). Pruning sharply reduces both quantities.
  • Sample efficiency: In molecular optimization, multi-level BO requires orders of magnitude fewer high-fidelity evaluations than flat single-level BO to achieve comparable or superior optima. For example, in chemical space enumeration, a multi-level funnel approach accessed $3\times10^{-4}\%$ of a space of size $1.37\times10^8$—yet attained better solutions than single-level BO (Walter et al., 7 May 2025).
  • Accelerated convergence: Across benchmarks (function optimization, hyperparameter tuning, molecular design), coarse-to-fine methods consistently outperformed conventional BO, especially with limited evaluation budgets (Ray et al., 6 Jan 2025, Nomura et al., 2019).
  • Pareto front improvement: In multi-objective force-field tuning, the error with respect to the AAMD reference dropped from $\sim 20\text{–}30\%$ (naive parameters) to $<10\%$ after $\sim 100$ simulations, with a steady shift of the Pareto front toward high-fidelity values (Ray et al., 6 Jan 2025).

6. Variants, Practical Guidance, and Applications

Representatives of the coarse-to-fine paradigm include:

| Framework (Reference) | Coarse-Pruning Strategy | Fine-Stage BO |
|---|---|---|
| BALLET (Zhang et al., 2023) | Adaptive superlevel ROI by GP | Local GP + restricted acquisition |
| Multi-level CG-BO (Walter et al., 7 May 2025) | Hierarchical CG levels, latent neighborhood projection | Delta-GP across levels, local EI maximization |
| Parametric tuning (Ray et al., 6 Jan 2025) | Low-dim. parameter search over CG force fields | Multi-objective BO via MOGP, EHVI |
| Budgeted heuristic (Nomura et al., 2019) | Coordinate-wise subcube division | Standard BO on subcube |

Parameter settings are minimal: typically GP lengthscales and thresholds for transitions or ROIs. For practical large-scale or high-dimensional problems, deep or sparse GPs and moderate initial designs (5–20 points) are recommended. In scientific domains, identification of a low-dimensional bottleneck parameterization is emphasized to ensure tractability (Ray et al., 6 Jan 2025).
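For the moderate initial designs mentioned above, a Latin hypercube can be generated in a few lines. This is a standard construction, not taken from the cited works; the point count and dimensionality here are arbitrary:

```python
import numpy as np

def latin_hypercube(n_points, n_dims, rng):
    """Latin hypercube sample in [0, 1)^d: one point per stratum per dim."""
    # Place one point in each of the n equal-width strata of each dimension,
    # then shuffle the strata independently per dimension.
    u = (rng.random((n_points, n_dims)) + np.arange(n_points)[:, None]) / n_points
    for d in range(n_dims):
        u[:, d] = u[rng.permutation(n_points), d]
    return u

rng = np.random.default_rng(42)
X0 = latin_hypercube(10, 3, rng)  # 10-point initial design in 3 dimensions
```

Compared with uniform random sampling, the stratification guarantees one-dimensional coverage of every coordinate, which helps the initial GP fit at small design sizes.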

Key application domains include polymer and materials simulation (Ray et al., 6 Jan 2025), chemical and molecular design (Walter et al., 7 May 2025), and general hyperparameter optimization in machine learning (Nomura et al., 2019). The approach has demonstrated robust generalization across diverse simulators and experimental settings.

7. Strengths, Limitations, and Extensions

Coarse-to-fine BO methodologies inherit several advantages:

  • Drastic reduction in search-space volume or resolution, yielding superior performance in low-budget or high-dimensional settings.
  • Surrogates focused on the region of maximal relevance, yielding tighter uncertainty estimates and faster regret reduction.
  • Generalizability: applicable to various backend BO algorithms and easily interfaced with domain-specific simulations.

Nevertheless, these frameworks may be susceptible to several limitations:

  • Greedy partitioning (as in Nomura et al., 2019) ignores potential variable correlations, risking exclusion of optima in diagonal or narrow regions.
  • ROI pruning may fail if initial surrogates are poorly calibrated or noisy evaluations mislead the coarse filter (Zhang et al., 2023).
  • Transition rules in hierarchical models depend on uncertainty thresholds that may be domain-sensitive (Walter et al., 7 May 2025).

Proposed extensions include adaptive splitting criteria, multi-tree or recursive refinement, integration with multi-fidelity modeling, and advanced acquisition strategies tuned to nonstationary or high-variance objectives. For complex simulators, an explicit identification of bottleneck parameters and space-filling initial designs is pivotal (Ray et al., 6 Jan 2025).


In summary, coarse-to-fine Bayesian optimization provides a rigorously justified, empirically validated suite of methods for scalable, data-efficient optimization in high-dimensional, heterogeneous domains, particularly where evaluation costs are prohibitive and search spaces are vast or combinatorial (Ray et al., 6 Jan 2025, Zhang et al., 2023, Walter et al., 7 May 2025, Nomura et al., 2019).
