Bayes-Optimal Per-Query Gating
- The paper establishes a Bayesian framework for per-query gating that selects between language model predictions and retrieval evidence using risk minimization.
- It leverages entropy pursuit and trust-based penalties to balance accuracy and reliability in both active preference learning and retrieval-augmented generation.
- Empirical results demonstrate computational efficiency and improved factuality, validating hybrid geometric-semantic approaches under varied query conditions.
A Bayes-optimal per-query gate is a statistical mechanism for selecting, on a query-by-query basis, between competing sources of prediction or evidence under a formal Bayesian risk-minimization criterion. Such gates have been established in both active preference learning ("Bayes-Optimal Entropy Pursuit" (Pallone et al., 2017)) and retrieval-augmented generation (RAG) settings ("A Note on k-NN Gating in RAG" (Biau et al., 20 Jan 2026)). In these frameworks, the gate determines which information source—internal model or retrieved memory—should be trusted for each query, taking into account uncertainty, data quality, and downstream objectives such as entropy reduction, misclassification error, or factuality.
1. Mathematical Frameworks for Per-Query Gating
In active choice-based preference learning (Pallone et al., 2017), the system models user preferences via a linear classifier parameterized by a vector $\theta$ with a Bayesian prior. At each time $t$, an $m$-way choice query is constructed, and the user selects the most preferred item. Observation likelihoods are specified by a noise-channel matrix that maps the true most-preferred alternative to the (possibly corrupted) observed response. The posterior on $\theta$ is updated using Bayes' rule given the user's possibly noisy response.
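As a minimal sketch of this update, the following discretizes $\theta$ to a finite grid (the paper treats a continuous parameter; the grid, the noise level, and all numbers below are illustrative assumptions, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discretization: a finite grid of candidate preference vectors.
thetas = rng.normal(size=(50, 3))   # 50 candidate preference vectors in R^3
prior = np.full(50, 1.0 / 50)       # uniform Bayesian prior over the grid

items = rng.normal(size=(4, 3))     # one m-way choice query (m = 4 items)
eps = 0.1                           # assumed response-noise level

# Noise-channel matrix: the true argmax item is reported with prob. 1 - eps,
# otherwise a uniformly random other item is reported.
m = len(items)
channel = np.full((m, m), eps / (m - 1))
np.fill_diagonal(channel, 1.0 - eps)

def posterior_update(prior, thetas, items, channel, response):
    """One step of Bayes' rule given a (possibly noisy) m-way choice response."""
    true_best = np.argmax(thetas @ items.T, axis=1)  # preferred item under each theta
    likelihood = channel[true_best, response]        # P(response | theta)
    post = prior * likelihood
    return post / post.sum()

post = posterior_update(prior, thetas, items, channel, response=2)
```

The posterior concentrates on grid points whose induced ranking is consistent with the observed choice, with the noise channel tempering how strongly any single response is weighted.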
In retrieval-augmented language modeling (Biau et al., 20 Jan 2026), a query $x$ has an unknown label $y$ and is processed by both:
- A frozen base LM, yielding a predictive distribution $p_{\mathrm{LM}}(\cdot \mid x)$,
- A $k$-NN retriever on a memory bank, producing $\hat{p}_{k\mathrm{NN}}(\cdot \mid x)$.
A gating function $\lambda(x) \in [0,1]$ yields the prediction mixture $p_{\lambda}(\cdot \mid x) = (1 - \lambda(x))\, p_{\mathrm{LM}}(\cdot \mid x) + \lambda(x)\, \hat{p}_{k\mathrm{NN}}(\cdot \mid x)$. The gate is optimized to minimize expected cross-entropy to the ground-truth conditional distribution, penalized by a retrieval-trust term $\tau(x) \in [0,1]$.
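The mixture itself is a one-line convex combination; the sketch below uses illustrative three-class distributions and a gate value that are not taken from the paper:

```python
import numpy as np

def gated_mixture(p_lm, p_knn, lam):
    """Per-query prediction mixture: (1 - lam) * p_LM + lam * p_kNN."""
    return (1.0 - lam) * p_lm + lam * p_knn

# Toy example over a 3-class label space (values are illustrative).
p_lm  = np.array([0.7, 0.2, 0.1])   # frozen base LM predictive distribution
p_knn = np.array([0.2, 0.7, 0.1])   # k-NN retriever distribution, same query
p_mix = gated_mixture(p_lm, p_knn, lam=0.4)
```

Because both inputs are probability vectors and the gate weight lies in $[0,1]$, the mixture is automatically a valid distribution; no renormalization step is needed.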
2. Bayes-Optimal Policy Derivation
The Bayes-optimal per-query gate is derived by minimizing a risk or loss function that is pointwise decomposable in the query $x$ (Biau et al., 20 Jan 2026). For each $x$,

$$\lambda^*(x) = \arg\min_{\lambda \in [0,1]} \; \mathrm{CE}\big(p^*(\cdot \mid x),\, p_{\lambda}(\cdot \mid x)\big) + \beta\,(1 - \tau(x))\,\lambda,$$

where $p_{\lambda}(\cdot \mid x) = (1-\lambda)\,p_{\mathrm{LM}}(\cdot \mid x) + \lambda\,\hat{p}_{k\mathrm{NN}}(\cdot \mid x)$ is the gated mixture, $\beta > 0$ is a regularization parameter, and $\tau(x) \in [0,1]$ measures retrieval reliability. For hard gating ($\lambda \in \{0,1\}$), the Bayes-optimal rule is

$$\lambda^*(x) = \mathbf{1}\{\, H_{k\mathrm{NN}}(x) + \beta\,(1 - \tau(x)) < H_{\mathrm{LM}}(x) \,\},$$

where $H_{\mathrm{LM}}(x)$ and $H_{k\mathrm{NN}}(x)$ are the population cross-entropies of the LM and the retriever against the true conditional $p^*(\cdot \mid x)$, respectively.
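The hard-gating rule can be sketched directly: compare the retriever's trust-penalized cross-entropy against the LM's. All distributions and hyperparameter values below are illustrative assumptions:

```python
import numpy as np

def cross_entropy(p_true, q):
    """Cross-entropy H(p_true, q) = -sum_y p_true(y) * log q(y)."""
    return -np.sum(p_true * np.log(q))

def hard_gate(p_true, p_lm, p_knn, tau, beta):
    """Route to the retriever iff its penalized cross-entropy beats the LM's."""
    h_lm  = cross_entropy(p_true, p_lm)
    h_knn = cross_entropy(p_true, p_knn) + beta * (1.0 - tau)  # trust penalty
    return 1 if h_knn < h_lm else 0  # 1 = use retriever, 0 = use LM

# Toy 3-class example: the retriever is closer to the truth than the LM.
p_true = np.array([0.1, 0.8, 0.1])
p_lm   = np.array([0.7, 0.2, 0.1])
p_knn  = np.array([0.15, 0.7, 0.15])

use_knn_trusted   = hard_gate(p_true, p_lm, p_knn, tau=0.9, beta=2.0)  # high trust
use_knn_untrusted = hard_gate(p_true, p_lm, p_knn, tau=0.1, beta=2.0)  # low trust
```

With high trust the better-calibrated retriever wins the comparison; with low trust the penalty $\beta(1-\tau)$ overwhelms its advantage and the gate falls back to the LM.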
In the entropy pursuit setting (Pallone et al., 2017), the Bayes-optimal policy for query selection is provably greedy with respect to mutual information, i.e., at each step it selects the query that maximizes expected posterior entropy reduction (equivalently, mutual information between preference vector and observation).
3. Role of Trust, Penalization, and Memory Alignment
The retrieval-trust weight $\tau(x) \in [0,1]$ encodes the geometric reliability of retrieved evidence. It approaches 1 in dense, in-distribution regions and falls toward 0 for out-of-support or noisy queries. The penalty $\beta\,(1-\tau(x))\,\lambda(x)$ in the gating loss (with regularization weight $\beta$ and gate weight $\lambda(x)$) discourages the gate from relying on retrieval in low-trust regions, thus providing a statistical guard against spurious or misleading evidence (Biau et al., 20 Jan 2026).
A hybrid geometric-semantic model accounts for both covariate shift (the query distribution versus the reference distribution of the memory bank) and label corruption in memory, modulating both the retrieval distribution and the associated trust quantities. Under such shifts, $\tau(x)$ decays exponentially in the distance from $x$ to the memory support, ensuring the optimal gate contracts toward baseline model reliance in off-support or adversarial regions.
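One simple way to realize such a trust weight is an exponential decay in the mean $k$-NN distance to the memory keys. The kernel form, scale, and data below are assumptions for illustration, not the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(1)
memory = rng.normal(size=(200, 2))  # memory-bank keys (in-distribution region)

def trust(x, memory, k=10, scale=1.0):
    """Trust weight decaying exponentially in the mean k-NN distance to memory."""
    d = np.sort(np.linalg.norm(memory - x, axis=1))[:k].mean()
    return np.exp(-d / scale)

tau_in  = trust(np.zeros(2), memory)       # query inside the memory support
tau_out = trust(np.full(2, 10.0), memory)  # far off-support query
```

The off-support query receives near-zero trust, so under the penalized gating objective it is routed almost entirely to the base LM, matching the contraction behavior described above.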
4. Information-Theoretic Objectives and Guarantees
Bayes-optimal per-query gates often optimize information-theoretic objectives:
- In entropy pursuit (Pallone et al., 2017), the posterior differential entropy is minimized; the expected one-step entropy reduction equals the mutual information between observation and latent parameter.
- The maximal per-step entropy reduction is bounded by the "channel capacity" $C$ determined by the predictive distribution and the noise channel. If query alternatives can be constructed from a continuum, greedy entropy pursuit attains a linear rate of entropy decrease, reducing posterior entropy by the capacity $C$ per query. Sensitivity results ensure robust performance even when the attained predictive distribution only approximates the global optimum.
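A greedy entropy-pursuit step can be sketched as scoring candidate queries by the mutual information $I(\text{response}; \theta) = H(\text{response}) - H(\text{response} \mid \theta)$ under the current posterior. The discretized grid, noise channel, and candidate pool below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
thetas = np.asarray(rng.normal(size=(50, 3)))  # discretized parameter grid
post = np.full(50, 1.0 / 50)                   # current posterior over the grid
m, eps = 4, 0.1
channel = np.full((m, m), eps / (m - 1))       # noise channel: correct w.p. 1 - eps
np.fill_diagonal(channel, 1.0 - eps)

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def mutual_information(post, thetas, items, channel):
    """I(response; theta) for one m-way query under the current posterior."""
    best = np.argmax(thetas @ items.T, axis=1)  # argmax item under each theta
    p_resp_given_theta = channel[best]          # (n_theta, m) response likelihoods
    p_resp = post @ p_resp_given_theta          # marginal response distribution
    cond_ent = np.sum(post * np.array([entropy(r) for r in p_resp_given_theta]))
    return entropy(p_resp) - cond_ent           # I = H(resp) - H(resp | theta)

# Greedy step: evaluate a pool of candidate queries, keep the most informative.
candidates = [rng.normal(size=(m, 3)) for _ in range(20)]
scores = [mutual_information(post, thetas, q, channel) for q in candidates]
best_query = candidates[int(np.argmax(scores))]
```

Each score is nonnegative and capped by $\log m$, the entropy of a uniform response; the capacity bound above says the same cap applies, per step, to the achievable posterior-entropy reduction.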
Misclassification error is fundamentally lower-bounded in terms of posterior entropy via Fano's inequality: for a finite hypothesis class $\Theta$ with error probability $P_e$ and binary entropy function $H_b$,

$$H(\theta \mid \text{observations}) \le H_b(P_e) + P_e \log(|\Theta| - 1),$$

indicating that per-query Bayesian entropy control directly governs error rates.
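The bound can be inverted numerically to see how residual posterior entropy forces an error floor. This is a generic illustration of Fano's inequality, not a computation from either paper; the entropy values are arbitrary:

```python
import numpy as np

def binary_entropy(p):
    return 0.0 if p in (0.0, 1.0) else -p * np.log(p) - (1 - p) * np.log(1 - p)

def fano_error_lower_bound(h_post, n_hypotheses, grid=10_000):
    """Smallest P_e satisfying h_post <= H_b(P_e) + P_e * log(|Theta| - 1)."""
    for pe in np.linspace(0.0, 1.0, grid):
        if binary_entropy(pe) + pe * np.log(n_hypotheses - 1) >= h_post:
            return pe
    return 1.0

# A near-uniform posterior over 8 hypotheses forces a large error floor;
# a sharply peaked posterior permits a tiny one.
lb_diffuse = fano_error_lower_bound(0.9 * np.log(8), 8)
lb_peaked  = fano_error_lower_bound(0.05, 8)
```

Driving posterior entropy down (as entropy pursuit does per query) is thus not just a proxy objective: it directly relaxes the only obstruction Fano places on achievable error.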
5. Statistical Hallucination, Discordance, and Large-Sample Limits
A discordance-based hallucination criterion quantifies local disagreement between LM predictions and retrieval evidence, weighted by retrieval trust (Biau et al., 20 Jan 2026):

$$D = \mathbb{E}\big[\tau(X)\,\mathbf{1}\{\arg\max_{y} p_{\mathrm{LM}}(y \mid X) \ne \hat{y}_{k\mathrm{NN}}(X)\}\big],$$

where $\hat{y}_{k\mathrm{NN}}(X)$ is the modal retriever label. The optimal gating solution reduces this discordance only if retrieval meaningfully improves over the LM in well-trusted regions. Asymptotically, in the aligned regime with $k \to \infty$ and $k/n \to 0$, the empirical retriever converges to the true conditional distribution, and the gate is nontrivial only at points where the Bayes error is strictly less than the LM error.
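An empirical version of this criterion replaces the expectation by a trust-weighted average over a batch of queries; the batch and trust values below are illustrative:

```python
import numpy as np

def discordance(p_lm_batch, knn_modal_labels, tau_batch):
    """Trust-weighted rate of disagreement between LM argmax and modal k-NN label."""
    lm_labels = np.argmax(p_lm_batch, axis=1)
    disagree = (lm_labels != knn_modal_labels).astype(float)
    return float(np.mean(tau_batch * disagree))

# Three queries over a 3-class label space (illustrative values).
p_lm_batch = np.array([[0.7, 0.2, 0.1],
                       [0.3, 0.6, 0.1],
                       [0.1, 0.2, 0.7]])
knn_modal  = np.array([0, 2, 2])        # modal retriever labels
tau_batch  = np.array([0.9, 0.5, 0.8])  # per-query trust weights
d = discordance(p_lm_batch, knn_modal, tau_batch)
```

Only the second query disagrees, and its moderate trust weight damps its contribution; high-trust disagreements are exactly the events the criterion flags as candidate hallucinations.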
A plausible implication is that capacity for hallucination mitigation via retrieval is structurally limited by the agreement between the LM and the true Bayes rule in high-density regions.
6. Empirical Performance and Computational Considerations
In choice-based preference learning (Pallone et al., 2017), empirical evaluation on large document sets demonstrates that entropy pursuit outperforms the knowledge-gradient (KG) policy on posterior entropy and is significantly more computationally efficient. For misclassification error, KG may yield marginal gains in low-noise, weak-prior regimes, but the two policies are nearly indistinguishable under moderate noise or a strong prior. The computational advantage of entropy pursuit stems from evaluating far fewer candidate query sets per step than KG requires.
7. Scope, Limitations, and Generalizations
The Bayes-optimal per-query gating formalism is broadly applicable wherever two or more sources of predictive evidence must be reconciled at inference time under uncertainty. In both the preference learning and RAG contexts, generalizations to hybrid models—combining geometric and semantic corruption—are mathematically natural via the proposed trust and reliability terms. However, performance depends on adequate estimation of the underlying densities, calibration of penalty hyperparameters (such as the retrieval-trust penalty weight), and availability of high-quality memory support.
In sum, the Bayes-optimal per-query gate provides a principled, risk-minimizing mechanism for balancing model fluency, retrieval grounding, and statistical reliability on a per-query basis, with strong information-theoretic and statistical guarantees in a variety of learning and inference frameworks (Pallone et al., 2017, Biau et al., 20 Jan 2026).