Hash Grid Feature Pruning
- The paper introduces a method that prunes unused hash grid entries using binary validity masks, reducing storage and transmission overhead.
- The technique employs multi-resolution grids and trilinear interpolation to accurately encode 3D spatial features for applications like Gaussian splatting.
- Empirical evaluations show up to 22.3% bitrate reduction while maintaining reconstruction quality, highlighting improved rate-distortion performance.
Hash grid feature pruning is a technique for reducing storage and transmission overhead in neural fields based on hash grids, particularly for applications involving Gaussian splatting and implicit scene representations. The method leverages the spatial sparsity and non-uniform distribution of target elements in the input domain to identify and eliminate unused or redundant features in the hash grid, thereby improving rate-distortion performance and computational efficiency without compromising reconstruction accuracy.
1. Hash Grid Structures for Gaussian Splatting
Hash grids encode features indexed by spatial coordinates $X \in \mathbb{R}^D$ (typically $D = 3$ for 3D scenes). A multi-resolution hierarchical grid is constructed with $L$ levels and per-level resolution $R_\ell$ ($\ell = 1, \dots, L$). Each level contains $T$ table entries, with each entry $i$ holding a feature vector $f^\ell_i \in \mathbb{R}^F$.
Spatial coordinates are min-max normalized and scaled to the grid resolution:

$$X_R = R_\ell \cdot \frac{X - X_{\min}}{X_{\max} - X_{\min}}$$
For interpolation, the $2^D$ surrounding grid vertices are identified via component-wise flooring and offsetting by $\delta \in \{0,1\}^D$, i.e., $X_V = \lfloor X_R \rfloor + \delta$. Feature values are obtained by trilinear interpolation across these vertices, using hashed indices:

$$f^\ell(X) = \sum_{\delta \in \{0,1\}^D} w_\delta \, f^\ell_{h(\lfloor X_R \rfloor + \delta)}, \qquad w_\delta = \prod_{d=1}^{D} \bigl( \delta_d \{X_R\}_d + (1 - \delta_d)(1 - \{X_R\}_d) \bigr)$$

where $\{X_R\}_d$ denotes the fractional part of the $d$-th scaled coordinate.
Hashing employs fixed large primes (the standard Instant-NGP choices):

$$h(X_V) = \left( \bigoplus_{d=1}^{D} x_d \, \pi_d \right) \bmod T, \qquad \pi_1 = 1,\ \pi_2 = 2654435761,\ \pi_3 = 805459861$$

where $\oplus$ denotes bitwise XOR and $x_d$ are the integer vertex coordinates.
Outputs from all levels are summed or concatenated and input to a downstream neural decoder.
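As a concrete illustration, the multi-level encoding described above can be sketched in NumPy. This is a minimal sketch: the parameter values (`RESOLUTIONS`, `T`, `F`) are placeholders, and the hash primes are the widely used Instant-NGP constants, which the "fixed large primes" above presumably refer to.

```python
import numpy as np

# Illustrative (not from the paper): D dims, L levels, T entries, F features per entry.
D, L_LEVELS, T, F = 3, 4, 2**14, 2
RESOLUTIONS = [16, 32, 64, 128]          # per-level grid resolutions R_l
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)  # Instant-NGP primes

rng = np.random.default_rng(0)
tables = [rng.standard_normal((T, F)).astype(np.float32) for _ in range(L_LEVELS)]

def spatial_hash(verts):
    # XOR the coordinate-prime products, then reduce modulo the table size.
    v = verts.astype(np.uint64)
    return (np.bitwise_xor.reduce(v * PRIMES, axis=-1) % np.uint64(T)).astype(np.int64)

def encode(x):
    # Multi-resolution hash encoding of a point x in [0, 1)^3; per-level
    # features are concatenated before the downstream neural decoder.
    feats = []
    for lvl, R in enumerate(RESOLUTIONS):
        x_r = x * R                           # scale to this level's resolution
        base, frac = np.floor(x_r), x_r - np.floor(x_r)
        out = np.zeros(F, dtype=np.float32)
        for corner in range(2**D):            # the 2^D = 8 surrounding vertices
            delta = np.array([(corner >> d) & 1 for d in range(D)])
            w = np.prod(np.where(delta == 1, frac, 1.0 - frac))  # trilinear weight
            out += w * tables[lvl][spatial_hash(base + delta)]
        feats.append(out)
    return np.concatenate(feats)

feat = encode(np.array([0.3, 0.5, 0.7]))
print(feat.shape)  # (L_LEVELS * F,) = (8,)
```

Concatenation (shown here) preserves per-level information; summation is the cheaper alternative mentioned in the text.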
2. Identification of Valid and Invalid Hash Features
Due to the spatial sparsity and irregularity of Gaussian splats, many hash table entries (i.e., feature vectors) are never queried during inference. A binary validity mask $v^\ell_i$ for each hash entry $i$, at each level $\ell$, is defined as:

$$v^\ell_i = \begin{cases} 1, & \text{if entry } i \text{ at level } \ell \text{ is queried by at least one splat} \\ 0, & \text{otherwise} \end{cases}$$

Features with $v^\ell_i = 0$ are redundant and subject to pruning.
3. Hash Grid Feature Pruning Algorithm
Pruning proceeds in five steps:
- Scan input splat positions $X^{(n)}$, $n = 1, \dots, N$, and grid levels $\ell = 1, \dots, L$.
- For each splat and level:
  - Scale coordinates: $X_R = \mathrm{scale}(X^{(n)}, R_\ell)$.
  - Compute neighboring vertices: $X_V = \lfloor X_R \rfloor + \delta$ for each $\delta \in \{0,1\}^D$.
  - Hash each neighbor to obtain index $i = h(X_V)$.
  - Set $v^\ell_i \leftarrow 1$.
- Gather valid indices $S^\ell = \{\, i \mid v^\ell_i = 1 \,\}$.
- Retain and entropy-encode $\{ f^\ell_i : i \in S^\ell \}$; discard all other entries.
- At the decoder, reconstruct the masks from the decoded splat positions and refill the pruned feature subset.
Pseudocode:
```
initialize v^ℓ[0..T-1] ← 0 for all ℓ
for n in 1..N:
    for ℓ in 1..L:
        X_R ← scale(X^(n), R_ℓ)
        for δ in {0,1}^D:
            X_V ← floor(X_R) + δ
            i ← h(X_V)
            v^ℓ[i] ← 1
for ℓ in 1..L:
    S^ℓ ← { i | v^ℓ[i] = 1 }
    encode { f^ℓ_i for i in S^ℓ }
```
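The mask-building pass above can be written as a runnable NumPy sketch. Helper names and parameter values are illustrative, not from the paper; the hash is assumed to be the standard XOR-of-primes spatial hash.

```python
import numpy as np

# Illustrative parameters: D dims, T entries per level, per-level resolutions R_l.
D, T = 3, 2**12
RESOLUTIONS = [16, 32, 64]
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

def spatial_hash(verts):
    # XOR-of-primes spatial hash, vectorized over the leading axes.
    v = verts.astype(np.uint64)
    return (np.bitwise_xor.reduce(v * PRIMES, axis=-1) % np.uint64(T)).astype(np.int64)

def build_validity_masks(positions):
    """Mark every hash entry touched by any splat's 2^D interpolation corners."""
    masks = [np.zeros(T, dtype=bool) for _ in RESOLUTIONS]
    corners = np.array([[(c >> d) & 1 for d in range(D)] for c in range(2**D)])
    for lvl, R in enumerate(RESOLUTIONS):
        base = np.floor(positions * R)                 # (N, D) scaled, floored coords
        for delta in corners:                          # the 8 neighboring vertices
            masks[lvl][spatial_hash(base + delta)] = True   # v^l[i] <- 1
    return masks

rng = np.random.default_rng(1)
splats = rng.random((500, D))          # N normalized splat positions in [0, 1)^3
masks = build_validity_masks(splats)
for lvl, m in enumerate(masks):
    print(f"level {lvl}: retain {m.sum()} of {T} entries")
```

Only entries flagged `True` would then be entropy-encoded; the decoder repeats the same pass on decoded splat positions to recover the masks without transmitting them explicitly.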
Time complexity: $O(N L \, 2^D)$ hash evaluations. Space complexity: $O(LT)$ bits for the masks and $O\bigl(\sum_\ell |S^\ell| \cdot F\bigr)$ for the retained features (Ma et al., 28 Dec 2025).
4. Rate–Distortion Performance and Empirical Evaluation
The method demonstrates bitrate savings under the AVS-VRU i3DV v3.0 codec in the standard test conditions (CTC). Bitrate reduction is measured as:

$$\Delta R = \frac{R_{\text{base}} - R_{\text{pruned}}}{R_{\text{base}}} \times 100\%$$
Rate–distortion curves, plotting PSNR vs. bitrate, show that pruning preserves PSNR at all operating points, shifting rates downward without fidelity loss. Across diverse sequences (DanceDunhuang, ShowGroups, VRUgz, VRUdg4), pruning yields average bitrate reduction of 8%. Peak savings reach 22.3% for DanceDunhuang.
| Sequence | Base Rate (MB/f) | Pruned Rate (MB/f) | Savings |
|---|---|---|---|
| DanceDunhuang | 0.231 | 0.198 | 15.5% |
| ShowGroups | 0.066 | 0.062 | 4.6% |
| VRUgz | 0.104 | 0.099 | 4.8% |
| VRUdg4 | 0.118 | 0.109 | 7.4% |
| Average | — | — | 8.08% |
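As a sanity check, the single-point rate ratios implied by the table can be computed directly with the rate-reduction formula. Note that the tabulated savings percentages are presumably averaged across multiple operating points on the rate-distortion curve, so they differ slightly from these single-point figures:

```python
# Rates from the table, in MB per frame: (base, pruned).
rates = {
    "DanceDunhuang": (0.231, 0.198),
    "ShowGroups":    (0.066, 0.062),
    "VRUgz":         (0.104, 0.099),
    "VRUdg4":        (0.118, 0.109),
}
for name, (base, pruned) in rates.items():
    saving = 100.0 * (base - pruned) / base   # single-point rate reduction in %
    print(f"{name}: {saving:.1f}%")
```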
Encoding and mask-computation overhead is negligible relative to total encode time, and reconstruction quality, measured in PSNR and by visual inspection, is identical to the unpruned baselines (Ma et al., 28 Dec 2025).
5. Trainable Hash Grid Sparsification: HollowNeRF
An alternative pruning methodology is presented in HollowNeRF, which leverages a trainable 3D saliency mask embedded in the hash-grid NeRF pipeline. For a spatial point $\mathbf{x}$, saliency is computed from a coarse trainable grid $\Gamma$ via trilinear interpolation and sigmoid activation:

$$s(\mathbf{x}) = \sigma\bigl(\mathrm{interp}(\mathbf{x}; \Gamma)\bigr)$$

Features are weighted as $\tilde{f}(\mathbf{x}) = s(\mathbf{x}) \cdot f(\mathbf{x})$ prior to MLP decoding. Sparsity is enforced via an ADMM-based augmented Lagrangian for the constrained problem

$$\min_{\theta, \Gamma} \ \mathcal{L}_{\text{photo}}(\theta, \Gamma) \quad \text{subject to} \quad \|\Gamma\|_0 \le c$$

Here, the photometric loss is minimized under the sparsity constraint on $\Gamma$, driving many coarse-grid saliency parameters to zero (pruning them).
Hash collisions, inherent to hashed encoding, are mitigated by steering capacity away from empty space ($s(\mathbf{x}) \to 0$ for empty voxels), focusing representational resources on visible surfaces. The SoftZeroGate ensures pruned features yield exactly zero density.
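A minimal sketch of this kind of saliency gating follows. This is our illustrative code, not HollowNeRF's implementation: the grid size and the zero-gate threshold are arbitrary choices.

```python
import numpy as np

G = 8   # coarse saliency-grid resolution (arbitrary, for illustration)
saliency_grid = np.random.default_rng(2).standard_normal((G + 1, G + 1, G + 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def saliency(x, grid):
    """Trilinearly interpolate the coarse grid at x in [0, 1)^3, then squash."""
    x_r = x * G
    base = np.floor(x_r).astype(int)
    frac = x_r - np.floor(x_r)
    s = 0.0
    for corner in range(8):                    # 8 surrounding grid vertices
        delta = np.array([(corner >> d) & 1 for d in range(3)])
        w = np.prod(np.where(delta == 1, frac, 1.0 - frac))
        i, j, k = base + delta
        s += w * grid[i, j, k]
    return sigmoid(s)

def soft_zero_gate(feature, s, eps=1e-2):
    # Near-zero saliency is clamped to an exact zero, so pruned regions
    # contribute exactly zero density downstream (the SoftZeroGate idea).
    return feature * s if s > eps else np.zeros_like(feature)

s = saliency(np.array([0.4, 0.4, 0.4]), saliency_grid)
print(soft_zero_gate(np.ones(2), s))
```

In training, the ADMM penalty drives many entries of `saliency_grid` negative, so their post-sigmoid saliency approaches zero and the gate hard-prunes the corresponding features.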
Quantitatively, HollowNeRF achieves PSNR parity or improvement compared to Instant-NGP while using only 31–56% of the parameters. In one configuration, it reaches 32.53 dB PSNR with 31% of the parameters, versus 32.29 dB for Instant-NGP; at a larger budget (56% of the parameters), the PSNR gain reaches +1 dB (Xie et al., 2023).
6. Limitations and Prospects
Hash grid feature pruning, when performed as a post-training step, is static: it does not adapt to changes in the distribution of splats at runtime. Only "hard" (binary) pruning is utilized; learning sparsity priors or soft-pruning during training remains for future work. Further integration into the entropy model, rather than via bitmap masking and post-filtering, may provide additional efficiency. Dynamic adaptation to changing grid resolution or hash size per frame is not addressed.
In trainable pruning schemes, exploration of optimal grid resolutions, sparsity budgets, and soft gating mechanisms remains ongoing. The use of end-to-end ADMM and differentiable gating (HollowNeRF) permits significantly reduced model size for high-fidelity volumetric rendering.
A plausible implication is that such pruning methodologies can be more broadly applied to learning-based compression and neural field representations wherever spatial sparsity is pronounced.