Hash Grid Feature Pruning

Updated 4 January 2026
  • The paper introduces a method that prunes unused hash grid entries using binary validity masks, reducing storage and transmission overhead.
  • The technique employs multi-resolution grids and trilinear interpolation to accurately encode 3D spatial features for applications like Gaussian splatting.
  • Empirical evaluations show up to 22.3% bitrate reduction while maintaining reconstruction quality, highlighting improved rate-distortion performance.

Hash grid feature pruning is a technique for reducing storage and transmission overhead in neural fields based on hash grids, particularly for applications involving Gaussian splatting and implicit scene representations. The method leverages the spatial sparsity and non-uniform distribution of target elements in the input domain to identify and eliminate unused or redundant features in the hash grid, thereby improving rate-distortion performance and computational efficiency without compromising reconstruction accuracy.

1. Hash Grid Structures for Gaussian Splatting

Hash grids encode features at spatial coordinates in $D = 3$ dimensions (typically for 3D scenes). A multi-resolution hierarchical grid is constructed with $L$ levels and per-level resolution $R_\ell = R_{min} \cdot 2^{\ell-1}$. Each level $\ell$ contains $T$ table entries, each representing a feature vector $f^\ell_i \in \mathbb{R}^F$.

Spatial coordinates $X = (x, y, z)$ are min-max normalized and scaled to the grid resolution:

$$X_R^\ell = \frac{(x, y, z) - X_{min}}{X_{max} - X_{min}} \cdot R_\ell$$

For interpolation, the surrounding $2^D = 8$ grid vertices $X_{V,k}^\ell$ are identified via component-wise flooring and offsetting by $\delta_x, \delta_y, \delta_z \in \{0, 1\}$. Feature values are obtained by trilinear interpolation across these vertices, using hashed indices:

$$f^\ell(X) = \sum_{k=1}^{8} w_k \cdot f^\ell_{h(X_{V,k}^\ell)}$$

Hashing employs fixed large primes:

$$h(X_V) = \big( (x_V \cdot \pi_x) \oplus (y_V \cdot \pi_y) \oplus (z_V \cdot \pi_z) \big) \bmod T$$

Outputs from all levels are summed or concatenated and input to a downstream neural decoder.
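The scaling, hashing, and interpolation steps above can be sketched in NumPy for a single level. This is an illustrative sketch, not the paper's implementation: the function names are invented here, and the prime multipliers follow common Instant-NGP practice (with $\pi_x = 1$).

```python
import numpy as np

# Illustrative prime multipliers (pi_x = 1 is a common Instant-NGP convention).
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

def hash_vertex(v, T):
    """h(X_V) = ((x*pi_x) XOR (y*pi_y) XOR (z*pi_z)) mod T."""
    return int(np.bitwise_xor.reduce(v.astype(np.uint64) * PRIMES) % np.uint64(T))

def encode_level(X, X_min, X_max, R, table):
    """Trilinearly interpolate one level's features for a point X at resolution R."""
    T = table.shape[0]
    x = (X - X_min) / (X_max - X_min) * R       # min-max normalize, scale to grid
    x0 = np.floor(x).astype(np.int64)           # base vertex
    w = x - x0                                  # fractional position in the cell
    feat = np.zeros(table.shape[1])
    for dx in (0, 1):                           # 2^D = 8 surrounding vertices
        for dy in (0, 1):
            for dz in (0, 1):
                d = np.array([dx, dy, dz])
                wk = np.prod(np.where(d == 1, w, 1.0 - w))  # trilinear weight w_k
                feat += wk * table[hash_vertex(x0 + d, T)]
    return feat
```

Per-level outputs would then be concatenated (or summed) across the $L$ resolutions before the decoder MLP.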

2. Identification of Valid and Invalid Hash Features

Due to the spatial sparsity and irregularity of Gaussian splats, many hash table entries (i.e., feature vectors) are never queried during inference. A binary validity mask $v^\ell_i$ for each hash entry, at each level $\ell$, is defined as:

$$v^\ell_i = \begin{cases} 1 & \text{if } \exists X \text{ such that } i \in \{h(X_{V,k}^\ell) \mid k = 1, \ldots, 2^D\} \\ 0 & \text{otherwise} \end{cases}$$

Features $f^\ell_i$ with $v^\ell_i = 0$ are redundant and subject to pruning.

3. Hash Grid Feature Pruning Algorithm

Pruning proceeds in five steps:

  1. Scan the input splat positions $X^{(n)}$, $n = 1, \ldots, N$, over all grid levels $\ell = 1, \ldots, L$.
  2. For each splat and level:
    • Scale the coordinates to $X_R^{(n,\ell)}$.
    • Compute the $2^D$ neighboring vertices $X_{V,k}^{(n,\ell)}$.
    • Hash each neighbor to obtain the index $i = h(X_{V,k}^{(n,\ell)})$.
    • Set $v^\ell_i \leftarrow 1$.
  3. Gather the valid indices $S^\ell = \{i \mid v^\ell_i = 1\}$.
  4. Retain and entropy-encode $\{f^\ell_i \mid i \in S^\ell\}$; discard the rest.
  5. At the decoder, reconstruct $S^\ell$ from the decoded positions and refill the pruned feature subset.

Pseudocode:

```
initialize v^ℓ[0..T-1] ← 0 for all ℓ
for n in 1..N:
    for ℓ in 1..L:
        X_R ← scale(X^(n), R_ℓ)
        for δ in {0,1}^D:
            X_V ← floor(X_R) + δ
            i ← h(X_V)
            v^ℓ[i] ← 1
for ℓ in 1..L:
    S^ℓ ← { i | v^ℓ[i] = 1 }
    encode { f^ℓ_i for i in S^ℓ }
```

Time complexity: $\mathcal{O}(N \cdot L \cdot 2^D)$. Space complexity: $\mathcal{O}(T \cdot L)$ for the masks and $\mathcal{O}(|S^\ell| \cdot L)$ for the pruned features (Ma et al., 28 Dec 2025).
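The pseudocode maps directly onto a short routine. A minimal NumPy sketch for one level follows; helper names and the prime constants are illustrative, mirroring the hash of Section 1, and are not from the paper's code.

```python
import numpy as np

# Illustrative prime multipliers, as in common Instant-NGP implementations.
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

def hash_vertex(v, T):
    return int(np.bitwise_xor.reduce(v.astype(np.uint64) * PRIMES) % np.uint64(T))

def prune_level(positions, X_min, X_max, R, table):
    """Mark every entry touched by some splat's 2^3 neighbor vertices; keep only those."""
    T = table.shape[0]
    valid = np.zeros(T, dtype=bool)                 # v^l[0..T-1] <- 0
    for X in positions:                             # scan splats, n = 1..N
        x0 = np.floor((X - X_min) / (X_max - X_min) * R).astype(np.int64)
        for dx in (0, 1):
            for dy in (0, 1):
                for dz in (0, 1):
                    i = hash_vertex(x0 + np.array([dx, dy, dz]), T)
                    valid[i] = True                 # v^l[i] <- 1
    S = np.flatnonzero(valid)                       # S^l = { i | v^l[i] = 1 }
    return S, table[S]                              # subset to entropy-encode
```

The decoder runs the same scan over the decoded splat positions to rebuild $S^\ell$ and scatter the transmitted features back into the table.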

4. Rate–Distortion Performance and Empirical Evaluation

The method demonstrates bitrate savings under the AVS-VRU i3DV v3.0 codec in the standard test conditions (CTC). Bitrate reduction is measured by:

$$\Delta R / R = \frac{R_{base} - R_{pruned}}{R_{base}}$$
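A quick numeric illustration of the metric; the rates below are hypothetical, not measurements from the paper:

```python
# Hypothetical per-frame rates in MB; Delta R / R = (R_base - R_pruned) / R_base.
R_base, R_pruned = 0.250, 0.210
delta = (R_base - R_pruned) / R_base
print(f"{100 * delta:.1f}% bitrate reduction")  # 16.0%
```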

Rate–distortion curves, plotting PSNR versus bitrate, show that pruning preserves PSNR at all operating points, shifting the curves toward lower rates without fidelity loss. Across diverse sequences (DanceDunhuang, ShowGroups, VRUgz, VRUdg4), pruning yields an average bitrate reduction of about 8%, with peak savings of 22.3% on DanceDunhuang.

| Sequence | Base rate (MB/frame) | Pruned rate (MB/frame) | $\Delta R / R$ |
|---|---|---|---|
| DanceDunhuang | 0.231 | 0.198 | 15.5% |
| ShowGroups | 0.066 | 0.062 | 4.6% |
| VRUgz | 0.104 | 0.099 | 4.8% |
| VRUdg4 | 0.118 | 0.109 | 7.4% |
| Average | | | 8.08% |

Encoding and mask-computation overhead is negligible ($\mathcal{O}(N \cdot L)$), with reconstruction quality, measured in PSNR and visually, identical to the unpruned baselines (Ma et al., 28 Dec 2025).

5. Trainable Hash Grid Sparsification: HollowNeRF

An alternative pruning methodology is presented in HollowNeRF, which leverages a trainable 3D saliency mask $G \in \mathbb{R}^{T \times T \times T}$ embedded in the hash-grid NeRF pipeline. For a spatial point $x$, a saliency $p(x) \in (0, 1)$ is computed via trilinear interpolation and sigmoid activation:

$$p(x) = \sigma\left( \sum_{i=1}^{8} w_i g_i \right)$$

Features are weighted as $v(x) = p(x) \, f(x)$ prior to MLP decoding. Sparsity is enforced via an ADMM-based augmented Lagrangian:

$$L_{aug}(W, H, G; \gamma) = L(W, H, G) + \gamma \left( \|\sigma(G)\|_1 - C \right) + \frac{\rho}{2} \big[ \max\left( \|\sigma(G)\|_1 - C,\, 0 \right) \big]^2$$

Here, the photometric loss $L$ is minimized under the constraint $\|\sigma(G)\|_1 \leq C$, driving many coarse-grid saliency parameters to zero and thereby pruning them.
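The gating and penalty terms can be sketched as follows; all names are illustrative, and the photometric loss $L(W, H, G)$ with the full rendering pipeline is omitted:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def saliency(w, g):
    """p(x) = sigmoid(sum_i w_i g_i): 8 trilinear weights w over corner saliencies g."""
    return sigmoid(np.dot(w, g))

def sparsity_penalty(G, C, gamma, rho):
    """Augmented-Lagrangian terms enforcing ||sigma(G)||_1 <= C on the saliency grid."""
    excess = sigmoid(G).sum() - C                  # ||sigma(G)||_1 - C (sigmoid > 0)
    return gamma * excess + 0.5 * rho * max(excess, 0.0) ** 2
```

During training, `sparsity_penalty` would be added to the photometric loss, with $\gamma$ and $\rho$ updated per the ADMM schedule.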

Hash collisions, inherent to hashed encoding, are mitigated by steering capacity away from empty space ($p(x) \to 0$ for empty voxels), focusing representational resources on visible surfaces. The SoftZeroGate $\hat{g}(v) = \tanh(\alpha \|v\|_2)$ ensures that pruned features yield exactly zero density.
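The SoftZeroGate itself is a one-liner; a sketch, where the sharpness constant $\alpha$ is a hyperparameter whose value here is illustrative:

```python
import numpy as np

def soft_zero_gate(v, alpha=10.0):
    """g_hat(v) = tanh(alpha * ||v||_2): exactly 0 for an all-zero (pruned) feature v,
    saturating toward 1 for surviving features, so pruned entries emit zero density."""
    return float(np.tanh(alpha * np.linalg.norm(v)))
```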

Quantitatively, HollowNeRF achieves PSNR parity or improvement compared to Instant-NGP while using only 31–56% of the parameters. For example, with hash-grid size $2^{17}$ and $C = 0.04$, PSNR is 32.53 dB at 31% of the parameters, versus 32.29 dB for Instant-NGP. At $2^{18}$ (56% of the parameters), the PSNR gain reaches +1 dB (Xie et al., 2023).

6. Limitations and Prospects

Hash grid feature pruning, when performed as a post-training step, is static: it does not adapt to changes in the distribution of splats at runtime. Only "hard" (binary) pruning is utilized; learning sparsity priors or soft-pruning during training remains for future work. Further integration into the entropy model, rather than via bitmap masking and post-filtering, may provide additional efficiency. Dynamic adaptation to changing grid resolution or hash size per frame is not addressed.

In trainable pruning schemes, exploration of optimal grid resolutions, sparsity budgets, and soft gating mechanisms remains ongoing. The use of end-to-end ADMM and differentiable gating (HollowNeRF) permits significantly reduced model size for high-fidelity volumetric rendering.

A plausible implication is that such pruning methodologies can be more broadly applied to learning-based compression and neural field representations wherever spatial sparsity is pronounced.
