
RLCT-Aware Correction for Singular Bayesian Models

Updated 10 January 2026
  • The paper introduces RLCT-aware correction that replaces the classical d/2 penalty with an RLCT-based term, ensuring asymptotically unbiased evidence estimation in singular models.
  • The approach leverages algebraic geometry to compute the effective model dimension, yielding corrections that are invariant under reparameterizations.
  • Empirical validations in linear-Gaussian rank and subspace models demonstrate that the RLCT correction eliminates the systematic over-penalization observed with traditional Laplace approximations.

RLCT-aware correction is a principled modification to Bayesian model selection criteria in the context of singular models, particularly those with overparameterization or rank-deficiency. Standard techniques like the Laplace approximation and Bayesian Information Criterion (BIC) apply a penalty based on the ambient parameter count, which leads to systematic errors in evidence evaluation when the effective model complexity is strictly lower. The RLCT-aware correction replaces the classical penalty with a term involving the real log canonical threshold (RLCT), yielding an evidence estimate that accurately tracks the true marginal likelihood asymptotics, achieves invariance under reparameterization, and rectifies the asymptotic drift observed under traditional approximations (Rao, 3 Jan 2026).

1. Real Log Canonical Threshold and Effective Dimension

In regular parametric models, where the Fisher information matrix is full rank and of dimension $d$, the Laplace approximation and BIC expansion of the marginal likelihood take the form

$$\log p(D_n) = \log p(D_n \mid \hat\theta_n) - \frac{d}{2} \log n + O_p(1).$$

The BIC or Laplace penalty is thus $\frac{d}{2}\log n$, treating $d$ as the number of effective "directions" or parameters. Singular learning theory, however, demonstrates that for singular models, such as low-rank or overparameterized linear-Gaussian regression, the correct coefficient is not $d/2$ but rather the RLCT $\lambda$, a rational number quantifying the curvature of the likelihood near a Kullback-Leibler minimizer $\theta^\star$.

For these models, the precise asymptotic expansion of the marginal likelihood becomes

$$\log p(D_n) = \log p(D_n \mid \theta^\star) - \lambda \log n + (m-1) \log \log n + O(1),$$

where $\lambda > 0$ is the RLCT and $m \in \mathbb{N}$ is a multiplicity factor ($m = 1$ in simple linear settings). In regular models, $\lambda = d/2$; in singular models, typically $\lambda < d/2$, signifying that only $2\lambda$ directions each induce a $\frac{1}{2}\log n$ penalty.

2. Limitations of the Laplace Approximation and BIC in Singular Models

The Laplace approximation and BIC, derived under the assumption of regularity, prescribe the penalty $\frac{d}{2}\log n$. When applied to singular models, they impose an excessive penalty, and the error in the estimated marginal likelihood is

$$\text{Laplace error:}\quad \log p(D_n)^{\text{Lap}} - \log p(D_n) = \left( \frac{d}{2} - \lambda \right) \log n + O_p(1).$$

This excess penalty manifests as a drift in the BIC score that grows linearly in $\log n$ whenever $\lambda < d/2$, causing systematic over-penalization and divergence from the true marginal likelihood asymptotics as the sample size increases.
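As a numeric illustration of this drift (the specific values $d = 10$ and $\lambda = 2.5$ are hypothetical, chosen only to make the effect concrete):

```python
import math

# Hypothetical singular model: 10 ambient parameters, but RLCT lambda = 2.5
d, lam = 10, 2.5

# Laplace/BIC over-penalization: (d/2 - lambda) * log n, growing without bound
for n in [10**2, 10**3, 10**4, 10**5]:
    drift = (d / 2 - lam) * math.log(n)
    print(f"n = {n:>6}: over-penalization = {drift:.1f} nats")
```

The drift roughly doubles as $n$ goes from $10^2$ to $10^4$, so model comparisons based on BIC become increasingly biased against the singular model as data accumulate.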

3. RLCT-Aware Correction: Formulation and Properties

The RLCT-aware correction directly amends the penalty term in the evidence estimate. The RLCT-corrected log-evidence is defined as

$$\log p(D_n)^{\text{RLCT}} := \log p(D_n \mid \hat\theta_n) - \lambda \log n.$$

Unlike the classical Laplace/BIC formula, this correction exactly cancels the leading $\log n$ slope when compared to the true expansion:

$$\log p(D_n)^{\text{RLCT}} - \log p(D_n) = O_p(1),$$

so the RLCT error remains bounded as $n$ grows. In effect, the RLCT-aware correction yields an evidence estimate whose leading $\log n$ asymptotics align precisely with those of the true marginal likelihood, eliminating the asymptotic drift.
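A minimal sketch of the two criteria side by side (the function names and the example numbers are illustrative, not from the paper):

```python
import math

def rlct_evidence(loglik_at_mle: float, lam: float, n: int) -> float:
    """RLCT-corrected log-evidence: penalize by lambda * log n."""
    return loglik_at_mle - lam * math.log(n)

def bic_evidence(loglik_at_mle: float, d: int, n: int) -> float:
    """Classical BIC-style log-evidence: penalize by (d/2) * log n."""
    return loglik_at_mle - (d / 2) * math.log(n)

# For a singular model with lambda < d/2 the two criteria diverge as n grows:
n, loglik = 10_000, -1234.5
gap = rlct_evidence(loglik, lam=1.0, n=n) - bic_evidence(loglik, d=4, n=n)
print(gap)  # (d/2 - lambda) * log n = 1.0 * log(10000)
```

The only change relative to BIC is the coefficient of $\log n$; computing $\lambda$ itself is the model-specific step, done in closed form for the linear-Gaussian families discussed below.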

4. Invariance under Reparameterization

A robust feature of the RLCT penalty is its invariance under reparameterization. The RLCT, grounded in algebraic geometry, is a birational invariant: it depends solely on the intrinsic structure of the model family and not on how it is parametrized. For instance, in Gaussian dictionary (subspace) models, both minimal and overcomplete representations of the same $r$-dimensional subspace (e.g., $D \in \mathbb{R}^{p \times r}$ versus $D' \in \mathbb{R}^{p \times d'}$ with $d' > r$) share the same RLCT $\lambda = r/2$, and their RLCT-corrected evidences agree up to $O_p(1)$:

$$\log p(D_n)^{\text{RLCT}}(D) = \log p(D_n \mid \hat D) - \tfrac{r}{2} \log n,$$

$$\log p(D_n)^{\text{RLCT}}(D') = \log p(D_n \mid \hat D') - \tfrac{r}{2} \log n.$$

By contrast, the BIC approximation would use $\tfrac{d}{2}\log n$ for $D$ and $\tfrac{d'}{2}\log n$ for $D'$, favoring the smaller ambient dimension despite both parametrizations defining the same model family.
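The invariance can be checked numerically. In the sketch below (all sizes, seeds, and the penalty bookkeeping are illustrative assumptions), a minimal dictionary $D$ and an overcomplete $D' = DR$ with semi-orthogonal $R$ (so $D'D'^\top = DD^\top$) induce the exact same Gaussian likelihood, hence identical RLCT-corrected scores, while ambient-dimension penalties differ:

```python
import numpy as np

rng = np.random.default_rng(0)
p, r, d_over, n = 5, 2, 4, 200

# Minimal dictionary D (p x r) and overcomplete D' = D @ R with R R^T = I_r,
# so both parametrizations yield the same model covariance D D^T + sigma^2 I.
D = rng.standard_normal((p, r))
R = np.linalg.qr(rng.standard_normal((d_over, r)))[0].T  # r x d_over, rows orthonormal
D_over = D @ R
sigma2 = 1.0

X = rng.multivariate_normal(np.zeros(p), D @ D.T + sigma2 * np.eye(p), size=n)

def gauss_loglik(X, Dmat, sigma2):
    """Log-likelihood of rows of X under x ~ N(0, D D^T + sigma^2 I)."""
    p = Dmat.shape[0]
    S = Dmat @ Dmat.T + sigma2 * np.eye(p)
    logdet = np.linalg.slogdet(S)[1]
    quad = np.einsum('ni,ij,nj->', X, np.linalg.inv(S), X)
    return -0.5 * (X.shape[0] * (p * np.log(2 * np.pi) + logdet) + quad)

ll_min, ll_over = gauss_loglik(X, D, sigma2), gauss_loglik(X, D_over, sigma2)
lam = r / 2                               # RLCT per the text: set by the subspace rank only
rlct_min = ll_min - lam * np.log(n)
rlct_over = ll_over - lam * np.log(n)     # identical, since the likelihoods coincide
bic_gap = (p * d_over - p * r) / 2 * np.log(n)  # extra ambient-count penalty on D'
print(rlct_min - rlct_over, bic_gap)
```

The RLCT-corrected scores agree exactly here (the $O_p(1)$ slack in the text covers cases where the two fits are merely equivalent, not identical), whereas the BIC-style gap grows with $\log n$.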

5. Empirical Validation in Linear-Gaussian Rank and Dictionary Models

Closed-form analytic marginal likelihoods can be derived for linear-Gaussian rank and subspace models. For rank-$r$ regression with design matrix $A_n \in \mathbb{R}^{n \times d}$ and a Gaussian prior $\theta \sim \mathcal{N}(0, \tau^2 I_d)$, the marginal log-likelihood is

$$\log p(D_n) = -\frac{1}{2}\left[\, n\log(2\pi) + n\log\sigma^2 + \log\det\!\Big(I_d + \frac{\tau^2}{\sigma^2} A_n^\top A_n\Big) + \sigma^{-2}\Big( y^\top y - \frac{\tau^2}{\sigma^2}\, y^\top A_n \Big(I_d + \frac{\tau^2}{\sigma^2} A_n^\top A_n\Big)^{-1} A_n^\top y \Big) \right].$$

Empirical studies with both rank regression and subspace models follow these steps:

  • Generate synthetic data $D_n$ for increasing sample sizes.
  • Compute the exact log-evidence and two approximations (Laplace and RLCT-corrected).
  • Calculate the residual errors $\Delta_{\rm BIC}(n)$ and $\Delta_{\rm RLCT}(n)$.
  • Estimate their slopes via linear regression on $\log n$.

The results demonstrate:

  • In singular (rank-deficient) regression ($r < d$), the BIC error slope is empirically $-(d-r)/2$, matching the theoretical prediction.
  • The RLCT error slope remains near zero, independent of the rank.
  • In regular (full-rank) regression ($r = d$), both error slopes vanish.
  • In subspace models, different parametrizations of the same subspace yield log-evidence differences bounded by $O(1)$ under RLCT correction, while BIC penalizes the overcomplete representation excessively.

6. Implications and Practical Significance

The analytic and empirical results establish the necessity of replacing the conventional $\frac{d}{2}\log n$ penalty with $\lambda \log n$ whenever the model exhibits singularities. The RLCT correction ensures that evidence estimation:

  • Remains asymptotically unbiased in singular models;
  • Reflects the effective model dimension, not the nominal parameter count;
  • Is invariant to overcomplete reparameterizations that preserve the intrinsic model.

This analysis holds in settings where the marginal likelihood is tractable, but the conceptual framework is extensible to broader singular learning-theoretic contexts. A plausible implication is the need for RLCT-aware evidence criteria in any Bayesian model class with potential singularities or identifiability defects (Rao, 3 Jan 2026).
