diffHOD-IA: Differentiable HOD & IA Modeling

Updated 6 February 2026

diffHOD-IA is a fully differentiable implementation of the halo occupation distribution and intrinsic alignment framework, integrating continuous relaxations for discrete galaxy sampling.
It leverages methods like Gumbel-Softmax reparameterization in JAX to enable gradient-based inference techniques such as Hamiltonian Monte Carlo, improving simulation efficiency.
The framework achieves significant speedup and high-fidelity parameter estimation, facilitating robust weak-lensing cosmology analyses with integrated galaxy clustering and orientation statistics.

diffHOD-IA is a fully differentiable implementation of the stochastic Halo Occupation Distribution (HOD) framework incorporating galaxy intrinsic alignment (IA) statistics, developed to address parameter inference and simulation-based modeling requirements for next-generation weak gravitational lensing cosmology. Built on differentiable sampling methods (notably Gumbel-Softmax relaxation and reparameterization), and implemented in JAX, diffHOD-IA enables automatic differentiation with respect to both HOD and IA parameters through the full pipeline, allowing the application of gradient-based inference algorithms such as Hamiltonian Monte Carlo (HMC) not only for galaxy number statistics but also for alignment-dependent, orientation-sensitive observables (Pandya et al., 4 Feb 2026).

1. HOD and IA Model Structure

The base HOD follows the widely used Zheng et al. (2007) prescription, which parameterizes the mean number of central and satellite galaxies per halo as a function of halo mass $M$:

$\langle N_{\rm cen}(M)\rangle = \frac12\left[1 + \operatorname{erf}\left(\frac{\log M - \log M_{\min}}{\sigma_{\log M}}\right)\right]$

Central galaxy occupation is sampled via a (possibly relaxed) Bernoulli process. Satellites are assigned with mean

$\langle N_{\rm sat}(M)\rangle = \langle N_{\rm cen}(M)\rangle \left(\frac{M-M_0}{M_1}\right)^\alpha$

with samples drawn from a Poisson or (in the differentiable relaxation) a Binomial distribution. Central locations are set to halo centers; satellites are assigned to subhalos (ranked by $m_{\text{peak}}$ ), falling back to NFW halo profiles when necessary.
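The two mean-occupation formulas above can be sketched in JAX as follows. This is a minimal illustration, not the package's API: the function names, the base-10 logarithm convention, the clipping used below $M_0$, and all parameter values are assumptions made for the example.

```python
import jax
import jax.numpy as jnp
from jax.scipy.special import erf


def mean_ncen(log_m, log_m_min, sigma_log_m):
    """Mean central occupation; masses are passed as log10(M) (assumed convention)."""
    return 0.5 * (1.0 + erf((log_m - log_m_min) / sigma_log_m))


def mean_nsat(log_m, log_m_min, sigma_log_m, log_m0, log_m1, alpha):
    """Mean satellite occupation, set to zero for M <= M_0."""
    m = jnp.power(10.0, log_m)
    m0 = jnp.power(10.0, log_m0)
    m1 = jnp.power(10.0, log_m1)
    # Floor the ratio at a tiny positive value so the power law and its
    # gradient stay finite below M_0; those entries are masked out afterwards.
    ratio = jnp.maximum((m - m0) / m1, 1e-12)
    power_law = jnp.where(m > m0, jnp.power(ratio, alpha), 0.0)
    return mean_ncen(log_m, log_m_min, sigma_log_m) * power_law


# Hypothetical parameter values, for illustration only.
log_m = jnp.linspace(11.0, 15.0, 9)
ncen = mean_ncen(log_m, 12.0, 0.3)
nsat = mean_nsat(log_m, 12.0, 0.3, 11.5, 13.3, 1.0)

# Gradient of the summed mean satellite occupation with respect to alpha.
dnsat_dalpha = jax.grad(
    lambda a: jnp.sum(mean_nsat(log_m, 12.0, 0.3, 11.5, 13.3, a))
)(1.0)
```

Because both functions are smooth in their parameters, `jax.grad` applies directly; the stochastic sampling steps need the relaxations described in Section 2.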

The IA component models the alignment of galaxy shapes with halo properties using a Dimroth–Watson (“spin-2 symmetric maximum-entropy”) distribution. The misalignment angle $\theta$ has probability density

$P(\theta,\phi)\,d\theta\,d\phi = \frac{B(\kappa)}{2\pi} \exp[-\kappa \cos^2\theta] \sin\theta\,d\theta\,d\phi$

where $\kappa=-\tan\left(\frac{\pi}{2}\mu\right)$ and $(\mu_{\rm cen}, \mu_{\rm sat})$ control the degree of alignment for centrals and satellites. These parameters adjust the orientation statistics that are central to weak-lensing contaminant modeling.
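As a rough illustration of this distribution, the sketch below evaluates the density of $x=\cos\theta$ on $[-1,1]$, which is proportional to $\exp(-\kappa x^2)$ once $\phi$ is integrated out. The normalization $B(\kappa)$ is computed numerically with a trapezoid rule rather than in closed form; that choice, the grid size, and the function names are assumptions of this example.

```python
import jax
import jax.numpy as jnp


def kappa_from_mu(mu):
    """Map the alignment strength mu in (-1, 1) to the shape parameter kappa."""
    return -jnp.tan(0.5 * jnp.pi * mu)


def dimroth_watson_pdf(cos_theta, mu, n_grid=2001):
    """Density of x = cos(theta) on [-1, 1], proportional to exp(-kappa x^2)."""
    kappa = kappa_from_mu(mu)
    # Shift the exponent so the largest value is exp(0); the shift cancels
    # between the numerator and the normalization.
    shift = jnp.maximum(-kappa, 0.0)
    grid = jnp.linspace(-1.0, 1.0, n_grid)
    w = jnp.exp(-kappa * grid**2 - shift)
    dx = grid[1] - grid[0]
    norm = dx * (jnp.sum(w) - 0.5 * (w[0] + w[-1]))  # trapezoid rule
    return jnp.exp(-kappa * cos_theta**2 - shift) / norm


# mu = 0 gives kappa = 0 and a uniform density of 1/2 in cos(theta).
p_uniform = dimroth_watson_pdf(jnp.array(0.3), 0.0)

# Sensitivity of the density at perfect alignment (cos(theta) = 1) to mu.
dp_dmu = jax.grad(lambda mu: dimroth_watson_pdf(1.0, mu))(0.6)
```

With this sign convention, positive $\mu$ gives negative $\kappa$ and concentrates the density toward $|\cos\theta|=1$.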

2. Differentiable Stochastic Sampling and Implementation

diffHOD-IA leverages continuous relaxations to render discrete galaxy sampling differentiable. For central galaxies, the relaxed Bernoulli (a.k.a. BinConcrete) reparameterization is

$N_{\rm cen}^{\rm relax} = \sigma\left(\frac{\log(p/(1-p)) + \epsilon}{\tau}\right), \;\; \epsilon\sim\operatorname{Logistic}(0,1),\;\; \tau=0.1$

For satellites, Poisson draws are replaced by a binomial relaxation: each halo is given $N_{\max}$ satellite slots (fiducially $N_{\max}=48$), and each slot is filled via an independent relaxed Bernoulli with success probability $p = \langle N_{\rm sat}(M)\rangle / N_{\max}$. This makes every stochastic catalog realization differentiable with respect to the HOD parameters.
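A minimal sketch of the two relaxations in JAX is given below. The function names are hypothetical, and the per-slot probability $p = \langle N_{\rm sat}\rangle / N_{\max}$ is assumed here as the natural choice for a binomial whose mean equals $\langle N_{\rm sat}\rangle$; the clipping of $p$ is also an assumption made for numerical safety.

```python
import jax
import jax.numpy as jnp


def relaxed_bernoulli(key, p, tau=0.1, shape=()):
    """Relaxed Bernoulli sample in (0, 1), differentiable with respect to p."""
    eps = jax.random.logistic(key, shape)
    logit = jnp.log(p) - jnp.log1p(-p)
    return jax.nn.sigmoid((logit + eps) / tau)


def relaxed_satellite_count(key, mean_nsat, n_max=48, tau=0.1):
    """Sum of n_max relaxed Bernoulli slots with p = mean_nsat / n_max (assumed)."""
    p = jnp.clip(mean_nsat / n_max, 1e-6, 1.0 - 1e-6)
    slots = relaxed_bernoulli(key, p, tau, shape=(n_max,))
    return jnp.sum(slots)


key = jax.random.PRNGKey(0)

# One relaxed realization and its pathwise gradient with respect to the mean.
n_sat = relaxed_satellite_count(key, 3.0)
dn_dmean = jax.grad(lambda m: relaxed_satellite_count(key, m))(3.0)

# Averaged over many noise draws, the relaxed count stays close to the mean.
keys = jax.random.split(key, 2000)
avg_count = jnp.mean(jax.vmap(lambda k: relaxed_satellite_count(k, 3.0))(keys))
```

Fixing the key fixes the logistic noise, so the sampled count becomes a deterministic, smooth function of the mean occupation. This is the reparameterization that lets gradients pass through the draw.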

Sampling NFW radial positions and IA misalignment angles proceeds via differentiable inverse-CDF sampling, with the inverse evaluated by Newton's method. Assignment of satellites to subhalos employs a softmax over ranked subhalo lists. All computational components use JAX primitives, with jax.jit and vmap for efficient vectorization and parallelization.
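The inverse-CDF step for NFW radii can be sketched as below. This is one possible way to set up the Newton iteration, not necessarily the package's: it solves in $y=\ln x$ with $x = r/r_s$, starts from the bound $m(x) < x^2/2$ so that the first iterate lies below the root, and unrolls a fixed number of steps so that reverse-mode autodiff passes through the solver. The function names and iteration count are assumptions.

```python
import jax
import jax.numpy as jnp


def nfw_mass(x):
    """Dimensionless NFW enclosed mass m(x) = ln(1 + x) - x / (1 + x), x = r / r_s."""
    return jnp.log1p(x) - x / (1.0 + x)


def nfw_radius_from_uniform(u, conc, n_iter=12):
    """Solve m(conc * s) / m(conc) = u for s = r / r_vir by Newton steps in ln(x)."""
    target = u * nfw_mass(conc)
    # Since m(x) < x^2 / 2, x0 = sqrt(2 * target) lies at or below the root.
    y = 0.5 * jnp.log(2.0 * target)
    for _ in range(n_iter):  # unrolled loop, differentiable end to end
        x = jnp.exp(y)
        m = nfw_mass(x)
        g = jnp.log(m) - jnp.log(target)       # residual in log space
        dg_dy = x**2 / ((1.0 + x) ** 2 * m)    # d ln m / d ln x
        y = y - g / dg_dy
    return jnp.exp(y) / conc


# Map uniform deviates to scaled radii for a hypothetical concentration of 10.
u = jnp.linspace(0.05, 0.95, 10)
s = nfw_radius_from_uniform(u, 10.0)

# Sensitivity of the median radius to the concentration.
ds_dc = jax.grad(lambda c: nfw_radius_from_uniform(0.5, c))(10.0)
```

Working in log space keeps the iterates positive without clipping, which matters because a clipped Newton step in the plain radius can oscillate between the interval boundaries.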

3. Correlation Functions and Summary Statistics

diffHOD-IA computes both standard galaxy clustering and IA summary statistics in a differentiable fashion, enabling gradient-based optimization or sampling of IA parameters with respect to observed two-point functions. Let $(i,j)$ index galaxy pairs with spatial separation $r_{ij}$ within a radial bin centered on $r$. The following statistics are implemented:

Galaxy–galaxy autocorrelation:

$\xi(r) = \frac{DD(r)}{RR(r)} - 1$

with $DD(r)$ the weighted pair count from the simulated catalog and $RR(r)$ the analytic random pair count.

Position–orientation correlation (galaxy–shape, or $\omega(r)$):

$\omega(r) = \left\langle \left|\hat{e}_i \cdot \hat{r}_{ij}\right|^2 \right\rangle - \frac13$

Shape–shape (orientation–orientation, or $\eta(r)$):

$\eta(r) = \left\langle \left|\hat{e}_i \cdot \hat{e}_j\right|^2 \right\rangle - \frac13$

Here $\hat{e}_i$ is the unit orientation vector of galaxy $i$, $\hat{r}_{ij}$ is the unit separation vector of the pair, and the averages run over the weighted pairs in the bin.

All statistics are automatically differentiable with respect to the underlying HOD and IA parameters.
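To make the estimators concrete, here is a brute-force sketch of a weighted clustering statistic and a position–orientation statistic for a small periodic box. Several things are assumptions of this example rather than statements about the package: the all-pairs $O(N^2)$ evaluation (instead of a neighbor search), hard bin edges, the normalization of the analytic random count, and the specific form $\langle|\hat e_i\cdot\hat r_{ij}|^2\rangle - 1/3$ for the position–orientation statistic.

```python
import jax
import jax.numpy as jnp


def pair_statistics(pos, orient, weights, edges, box):
    """Weighted clustering and position-orientation statistics from all ordered pairs."""
    n = pos.shape[0]
    d = pos[:, None, :] - pos[None, :, :]
    d = d - box * jnp.round(d / box)                      # minimum-image convention
    r = jnp.sqrt(jnp.sum(d**2, axis=-1) + jnp.eye(n))     # eye avoids sqrt(0) on the diagonal
    ww = weights[:, None] * weights[None, :] * (1.0 - jnp.eye(n))  # drop self-pairs
    # Hard bin membership, shape (n_bins, n, n). Gradients flow through the
    # weights and orientations, not through the bin assignment.
    in_bin = (r[None] >= edges[:-1, None, None]) & (r[None] < edges[1:, None, None])
    dd = jnp.sum(in_bin * ww[None], axis=(1, 2))
    shell = 4.0 / 3.0 * jnp.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    rr = (jnp.sum(weights) ** 2 - jnp.sum(weights**2)) * shell / box**3
    xi = dd / rr - 1.0
    r_hat = d / r[..., None]
    cos2 = jnp.sum(orient[:, None, :] * r_hat, axis=-1) ** 2
    omega = jnp.sum(in_bin * (ww * cos2)[None], axis=(1, 2)) / dd - 1.0 / 3.0
    return xi, omega


# Toy catalog: uniform positions and isotropic orientations in a unit box.
k_pos, k_ori = jax.random.split(jax.random.PRNGKey(0))
n_gal, box = 300, 1.0
pos = jax.random.uniform(k_pos, (n_gal, 3)) * box
vec = jax.random.normal(k_ori, (n_gal, 3))
orient = vec / jnp.linalg.norm(vec, axis=1, keepdims=True)
weights = jnp.ones(n_gal)
edges = jnp.array([0.1, 0.2, 0.3])   # must stay below box / 2

xi, omega = pair_statistics(pos, orient, weights, edges, box)

# Gradient of the first clustering bin with respect to the per-galaxy weights.
dxi_dw = jax.grad(lambda w: pair_statistics(pos, orient, w, edges, box)[0][0])(weights)
```

For an unclustered, randomly oriented catalog both statistics should scatter around zero. The per-galaxy weights are where relaxed occupation values from the sampling step would enter.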

4. Gradient Validation and Inference

diffHOD-IA provides end-to-end differentiability not only through the HOD parameterization but also through IA orientation sampling, as validated by comparing autodiff gradients with finite-difference estimates. For both HOD and IA parameter partials (e.g., derivatives of the misalignment PDF with respect to the alignment strength), numerical (finite-difference) and autodiff (JAX) gradients were found to be in excellent agreement (Pandya et al., 4 Feb 2026). This supports the use of diffHOD-IA within gradient-based inference workflows.
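A check of this kind can be sketched on a toy objective: the relaxed total number of centrals over a small set of halos, differentiated with respect to $\log M_{\min}$. The objective, the halo masses, the step size, and the parameter values are all hypothetical; the point is only the pattern of holding the random key fixed and comparing `jax.grad` against a central difference.

```python
import jax

# Double precision, so that the finite-difference estimate is not dominated by round-off.
jax.config.update("jax_enable_x64", True)

import jax.numpy as jnp
from jax.scipy.special import erf


def relaxed_central_total(log_m_min, log_m, key, sigma_log_m=0.3, tau=0.1):
    """Sum of relaxed Bernoulli central occupations for a fixed noise draw."""
    p = 0.5 * (1.0 + erf((log_m - log_m_min) / sigma_log_m))
    eps = jax.random.logistic(key, log_m.shape)
    logit = jnp.log(p) - jnp.log1p(-p)
    return jnp.sum(jax.nn.sigmoid((logit + eps) / tau))


key = jax.random.PRNGKey(0)
log_m = jnp.linspace(11.5, 12.5, 50)   # hypothetical halo masses, log10(M)
theta0, h = 12.0, 1e-5

# Autodiff gradient with respect to log_m_min (the first argument).
grad_ad = jax.grad(relaxed_central_total)(theta0, log_m, key)

# Central finite difference using the same key, hence the same noise.
grad_fd = (
    relaxed_central_total(theta0 + h, log_m, key)
    - relaxed_central_total(theta0 - h, log_m, key)
) / (2.0 * h)

rel_err = jnp.abs(grad_ad - grad_fd) / jnp.abs(grad_fd)
```

Reusing the key in all three evaluations is essential: with fresh noise in each call, the finite difference would measure sampling scatter rather than the derivative.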

Gradient-based inference workflows enabled include:

Moment-matching optimization for IA parameters $(\mu_{\rm cen}, \mu_{\rm sat})$ to match target orientation distributions,
Two-point correlation function fitting by minimizing a loss over the measured correlation functions with inverse-variance weighting,
Hamiltonian Monte Carlo (HMC) using the NumPyro NUTS algorithm for posterior inference, leveraging JAX autodiff gradients for efficiency.
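The first of these workflows can be illustrated with a toy moment-matching fit. The sketch recovers a single alignment strength $\mu$ by matching $\langle\cos^2\theta\rangle$ of the misalignment distribution to a target value. The choice of moment, the quadrature grid, the plain gradient-descent update, and the learning rate are assumptions of this example and are not taken from the package.

```python
import jax
import jax.numpy as jnp

X = jnp.linspace(-1.0, 1.0, 2001)   # quadrature grid in x = cos(theta)


def mean_cos2(mu):
    """E[cos^2(theta)] for a density proportional to exp(-kappa x^2), kappa = -tan(pi mu / 2)."""
    kappa = -jnp.tan(0.5 * jnp.pi * mu)
    log_w = -kappa * X**2
    w = jnp.exp(log_w - jnp.max(log_w))   # the shift cancels in the ratio below
    # Trapezoid rule on a uniform grid: halve the endpoint weights; the
    # grid spacing cancels between numerator and denominator.
    w = w.at[0].multiply(0.5).at[-1].multiply(0.5)
    return jnp.sum(w * X**2) / jnp.sum(w)


def loss(mu, target):
    return (mean_cos2(mu) - target) ** 2


@jax.jit
def gd_step(mu, target, lr):
    return mu - lr * jax.grad(loss)(mu, target)


target = mean_cos2(0.6)      # synthetic target generated with mu = 0.6
mu_fit = jnp.asarray(0.0)    # start from no alignment
for _ in range(300):
    mu_fit = gd_step(mu_fit, target, 1.0)
```

The same pattern carries over to correlation-function fitting by replacing `loss` with an inverse-variance-weighted sum of squared residuals. For posterior sampling, the hand-written update would give way to a gradient-based sampler such as NUTS.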

In application to mock catalogs (Bolshoi-Planck, tng300-matched HOD), posteriors for IA parameters were in close agreement with reference implementations and neural-network emulators, but at orders-of-magnitude lower computational cost.

[Table: posterior constraints on the two IA parameters from halotools-IA, IAEmu, and diffHOD-IA; the numerical entries did not survive extraction.]

Wall-clock time for diffHOD-IA HMC on an NVIDIA A100 is $\sim$5 min (4 chains, 1,500 steps), compared to 1 day for halotools-IA MCMC on 150 CPU cores.

5. Comparison with Reference Implementations

Extensive validation was performed against halotools-IA (an industry-standard non-differentiable HOD+IA simulator) and a neural-network–based emulator, IAEmu. Across 100 realizations, diffHOD-IA reproduces not only the mean galaxy count but also the one-point and two-point statistics (the galaxy–galaxy, position–orientation, and orientation–orientation correlations) to within sample variance. Posterior inferences for IA parameters are consistent across methods (Pandya et al., 4 Feb 2026).

Unlike emulator-based approaches, diffHOD-IA enables differentiability of the full galaxy catalog realization, supporting integration with field-level inference pipelines and extension to arbitrary differentiable summary statistics.

6. Implementation, Applications, and Extensions

The diffHOD-IA package is open source (https://github.com/snehjp2/diffHOD-IA) and built in JAX for both research reproducibility and high-performance autodiff. Interfaces are included for

Bolshoi-Planck and tng300 halo/galaxy catalogs,
KD-tree–based neighbor search using SciPy,
NumPyro for gradient-based HMC.

Potential extensions—many under active development—include generalization to assembly bias HODs, alternative IA prescriptions (including radial and shape alignment), 2D projected statistics, and higher-order or field-level statistics via differentiable gravity solvers and halo finders.

A plausible implication is that the diffHOD-IA framework provides a scalable and extensible foundation for simulation-based inference in cosmological analyses where both galaxy-halo connection and orientation-dependent systematics are crucial. By exposing gradients for arbitrary summary statistics, it enables rapid, high-fidelity inference workflows unachievable with traditional MCMC-based catalog samplers (Pandya et al., 4 Feb 2026, Horowitz et al., 2022).

7. Significance and Context

diffHOD-IA demonstrates the feasibility and utility of making all components of a stochastic forward galaxy population model differentiable at the catalog level. This allows the use of powerful gradient-based inference techniques (e.g., HMC), resulting in 8–20$\times$ speedup in convergence and 8$\times$ greater effective sample size per gradient evaluation compared with Metropolis MCMC, as established in prior studies for HOD-only models (Horowitz et al., 2022). The inclusion of intrinsic alignments broadens its applicability to weak-lensing cosmology, where mitigation of IA systematics is essential. The framework lays groundwork for embedding physically motivated HOD and IA prescriptions within high-dimensional, fully differentiable simulation pipelines for joint inference over cosmological and galaxy formation parameters.


References

Pandya et al. (2026). Differentiable Stochastic Halo Occupation Distribution with Galaxy Intrinsic Alignments.

Horowitz et al. (2022). Differentiable Stochastic Halo Occupation Distribution.
