Fast Compression of 3D Gaussian Splatting

Updated 21 January 2026

The paper demonstrates that fast compression of 3D Gaussian Splatting is achieved via noise-substituted vector quantization and attribute codebooks, reducing memory by up to 45× with minimal PSNR loss.
FCGS methods employ feedforward pipelines, multi-path entropy modules, and context models to compress millions of splats in seconds while preserving near-original visual fidelity.
The approach supports mobile and embedded deployments through mixed-precision quantization and optimization-free techniques, enabling practical AR/VR streaming and storage of large-scale 3D models.

Fast Compression of 3D Gaussian Splatting (FCGS) is a class of techniques addressing the acute storage and bandwidth challenges of 3D Gaussian Splatting (3DGS) models, which represent scenes with millions of high-dimensional, explicit Gaussian primitives. FCGS compresses these large-scale models rapidly—often in seconds—while preserving near-original visual fidelity and enabling real-time rendering on commodity or low-power hardware. Approaches to FCGS utilize diverse algorithmic strategies, including noise-substituted vector quantization, attribute codebooks, channel and context modeling, pyramid sampling, point cloud coding, and entropy coding schemes, to reduce memory footprints by over an order of magnitude across both static and dynamic scenes (Wang et al., 3 Apr 2025, Chen et al., 2024, Liu et al., 30 Nov 2025).

1. Background: 3D Gaussian Splatting Model and Compression Challenges

3D Gaussian Splatting (3DGS) encodes a scene using $N$ explicit, anisotropic Gaussian splats, each parameterized by floating-point vectors for spatial position, rotation, scale, opacity, color, and high-order spherical-harmonic (SH) coefficients. Each primitive commonly requires 59 float32 parameters: 3D position $(\mu_i\in\mathbb{R}^3)$, opacity $(o_i)$, scale $(s_i\in\mathbb{R}^3)$, rotation $(r_i\in\mathbb{R}^4)$, color $(c_i\in\mathbb{R}^3)$, and 45 SH coefficients $(c_i^{sh}\in\mathbb{R}^{45})$ (Wang et al., 3 Apr 2025). At 4 bytes per parameter, that is 236 bytes per splat, so models with millions of splats easily reach several hundred megabytes to gigabytes per scene, constraining their use on bandwidth-limited platforms and during network streaming.
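The per-splat footprint follows directly from the parameter counts listed above. The short sketch below reproduces that arithmetic; the dictionary keys and function names are illustrative choices, not identifiers from any 3DGS codebase, and the size estimate ignores file-format overhead.

```python
# Per-splat parameter layout as listed in the text (float32 = 4 bytes each).
FLOATS_PER_SPLAT = {
    "position": 3,
    "opacity": 1,
    "scale": 3,
    "rotation": 4,
    "color": 3,
    "sh": 45,
}
BYTES_PER_FLOAT32 = 4


def bytes_per_splat() -> int:
    """Uncompressed bytes needed to store one Gaussian primitive."""
    return sum(FLOATS_PER_SPLAT.values()) * BYTES_PER_FLOAT32


def model_size_mb(num_splats: int) -> float:
    """Uncompressed model size in megabytes (1 MB = 10**6 bytes)."""
    return num_splats * bytes_per_splat() / 1e6


if __name__ == "__main__":
    print(bytes_per_splat())         # 236
    print(model_size_mb(1_000_000))  # 236.0
```

Under these assumptions, one million splats occupy 236 MB before any compression, which is why scenes with several million primitives reach the sizes quoted above.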

Traditional signal-processing compression methods (e.g., vector quantization, pruning) and ML-based approaches have only partially resolved these constraints. Signal-processing methods risk severe fidelity loss or convergence collapse, while ML-based schemes may lose the explicit, fast-rendering properties critical to 3DGS deployment (Wang et al., 3 Apr 2025, Niedermayr et al., 2023).

2. Noise-Substituted Vector Quantization and Attribute Codebooks

A core FCGS approach introduces per-attribute vector codebooks, trained using a noise-substituted vector quantization (NSVQ) scheme (Wang et al., 3 Apr 2025). Four attribute types are quantized independently: scale $(s_i)$, rotation $(r_i)$, color $(c_i)$, and SH coefficients $(c_i^{sh})$, each with a dedicated codebook $C_{\star}$. Instead of storing 3+4+3+45 = 55 floats (220 bytes as float32) per splat, each Gaussian stores only discrete indices $(k_s, k_r, k_c, k_{sh})$, which reduces the storage for these attributes from 220 bytes to as few as 7–8 bytes per splat (not counting positions and opacity, which remain uncompressed for high fidelity).
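To make the index-storage figure concrete, the sketch below packs four codebook indices into a byte string and unpacks them again. It assumes four codebooks of 16,384 entries each (the size quoted later in this section) and a simple big-endian bit layout; the function names and the layout are illustrative assumptions, not the storage format of the cited work.

```python
def index_bits(codebook_size: int) -> int:
    """Bits needed to address one entry of a codebook."""
    return max(1, (codebook_size - 1).bit_length())


def packed_bytes_per_splat(codebook_sizes: dict) -> int:
    """Bytes needed to store one index per codebook, bit-packed."""
    total_bits = sum(index_bits(k) for k in codebook_sizes.values())
    return (total_bits + 7) // 8


def pack_indices(indices: dict, codebook_sizes: dict) -> bytes:
    """Concatenate the indices into one integer, then serialize it."""
    word = 0
    for name, size in codebook_sizes.items():
        idx = indices[name]
        if not 0 <= idx < size:
            raise ValueError(f"index for {name!r} out of range")
        word = (word << index_bits(size)) | idx
    return word.to_bytes(packed_bytes_per_splat(codebook_sizes), "big")


def unpack_indices(blob: bytes, codebook_sizes: dict) -> dict:
    """Inverse of pack_indices: peel indices off from the low bits upward."""
    word = int.from_bytes(blob, "big")
    out = {}
    for name, size in reversed(list(codebook_sizes.items())):
        bits = index_bits(size)
        out[name] = word & ((1 << bits) - 1)
        word >>= bits
    return out


if __name__ == "__main__":
    sizes = {"scale": 16384, "rotation": 16384, "color": 16384, "sh": 16384}
    print(packed_bytes_per_splat(sizes))  # 7
```

With 14 bits per index, the four indices occupy 56 bits, i.e. 7 bytes. At render time each index is resolved with a single table lookup into its codebook.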

Classical VQ is non-differentiable and creates “gradient collapse” in end-to-end training. NSVQ addresses this by using a differentiable surrogate in the forward pass,

$\tilde z_q = e_{k^*} + \|z - e_{k^*}\|_2 \frac{n}{\|n\|_2}, \quad n\sim \mathcal{N}(0, \sigma^2 I),$

so that gradients propagate through quantization, allowing for joint optimization of codebooks and model features.
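The surrogate in the equation above can be illustrated with a forward-pass-only sketch in plain Python. It follows the formula as written in this section; the function names are hypothetical, and an actual training loop would express the same computation in an automatic-differentiation framework so that gradients reach both $z$ and the codebook. Because the noise vector is normalized, the value of $\sigma$ does not affect the result, so unit variance is used here.

```python
import math
import random


def nearest_codeword(z, codebook):
    """Index k* of the codeword closest to z in Euclidean distance."""
    best_k, best_d = 0, float("inf")
    for k, e in enumerate(codebook):
        d = sum((a - b) ** 2 for a, b in zip(z, e))
        if d < best_d:
            best_k, best_d = k, d
    return best_k


def nsvq_surrogate(z, codebook, rng):
    """Return (k*, z_tilde) where z_tilde = e_k* + ||z - e_k*|| * n / ||n||."""
    k = nearest_codeword(z, codebook)
    e = codebook[k]
    err = math.sqrt(sum((a - b) ** 2 for a, b in zip(z, e)))
    n = [rng.gauss(0.0, 1.0) for _ in z]
    n_norm = math.sqrt(sum(x * x for x in n))
    if n_norm == 0.0:  # practically never happens; fall back to the codeword
        return k, list(e)
    return k, [ei + err * ni / n_norm for ei, ni in zip(e, n)]


if __name__ == "__main__":
    rng = random.Random(0)
    codebook = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    k, z_tilde = nsvq_surrogate([0.9, 0.1, 0.0], codebook, rng)
    print(k, z_tilde)
```

The substituted noise has exactly the magnitude of the true quantization error but a random direction, which is what lets the training signal reflect the distortion without relying on the non-differentiable nearest-neighbor assignment.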

A typical schedule includes: standard 3DGS densification and pruning; codebook initialization via $K$-means; end-to-end NSVQ training; and post-quantization fine-tuning with frozen code assignments (Wang et al., 3 Apr 2025).
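The codebook-initialization step of this schedule can be sketched with a few iterations of Lloyd's algorithm. This is a minimal pure-Python illustration, not the cited implementation: the function name is hypothetical, and the deterministic strided seeding is an assumption made for reproducibility (random or k-means++ seeding is more common in practice).

```python
def kmeans_codebook(vectors, k, iters=10):
    """Build a k-entry codebook from attribute vectors with Lloyd's algorithm."""
    n, dim = len(vectors), len(vectors[0])
    # Assumption: seed centers with evenly strided samples from the input.
    centers = [list(vectors[i * n // k]) for i in range(k)]
    for _ in range(iters):
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        for v in vectors:
            # Assign each vector to its nearest center.
            j = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centers[c])),
            )
            counts[j] += 1
            for d, x in enumerate(v):
                sums[j][d] += x
        for j in range(k):
            if counts[j]:  # leave empty clusters where they are
                centers[j] = [s / counts[j] for s in sums[j]]
    return centers


if __name__ == "__main__":
    data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]]
    print(kmeans_codebook(data, k=2))
```

The resulting centers serve only as a starting point; the end-to-end NSVQ stage then adjusts them jointly with the scene parameters.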

With this regime, 16k-size codebooks yield a $\sim 45\times$ reduction in memory (e.g., 734 MB→16.4 MB for 30k splats), with only a minor PSNR drop (e.g., $<0.5$ dB) and negligible effect on SSIM or LPIPS. These compressed formats retain compatibility with existing 3DGS viewers and rendering pipelines, requiring only a lookup-table fetch per attribute group (Wang et al., 3 Apr 2025).

3. Feedforward and Optimization-Free Compression Pipelines

Emerging methods have shifted to optimization-free, generalizable compression pipelines that trade per-scene fine-tuning for fast, feedforward encoding—compressing millions of splats in seconds (Chen et al., 2024, Liu et al., 30 Nov 2025). Key modular components include:

Multi-Path Entropy Modules (MEM): Route each color attribute either through an analysis autoencoder (AE) or through direct quantization, depending on a learnable mask, balancing rate and fidelity without requiring path-specific tuning.
Context Models: Both inter-Gaussian (across splats) and intra-Gaussian (within attribute channels) context models predict local entropy distributions, employing grid-based feature aggregation and MLPs.
Hyperprior Branch: Auxiliary analysis and quantization of latent codes for accurate probability estimation, following deep image compression precedents.
Arithmetic Coding: Entropy-codes the optimized color representations and the geometry (the latter via an external G-PCC codec), yielding $>20\times$ reduction at sub-second encoding speeds with PSNR drop $<0.5$ dB (Chen et al., 2024).
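The role of the context models and hyperprior in the list above is to predict a probability distribution for each quantized value, which the arithmetic coder then turns into bits. A minimal sketch of the resulting rate estimate is given below, assuming (as in common learned image codecs) that the model outputs a Gaussian mean and scale per symbol and that values are rounded to integers. The function names and the Gaussian choice are assumptions for illustration, not details confirmed by the cited papers.

```python
import math


def _std_normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def estimated_bits(q: int, mu: float, sigma: float, floor: float = 1e-9) -> float:
    """Ideal code length of integer symbol q under a discretized Gaussian.

    The probability mass of q is the Gaussian mass on [q - 0.5, q + 0.5].
    """
    upper = _std_normal_cdf((q + 0.5 - mu) / sigma)
    lower = _std_normal_cdf((q - 0.5 - mu) / sigma)
    return -math.log2(max(upper - lower, floor))


def total_bits(symbols, params) -> float:
    """Sum of per-symbol code lengths; params holds one (mu, sigma) per symbol."""
    return sum(estimated_bits(q, mu, s) for q, (mu, s) in zip(symbols, params))


if __name__ == "__main__":
    # A well-predicted symbol is cheap; a mispredicted one is expensive.
    print(estimated_bits(3, mu=3.0, sigma=0.5))
    print(estimated_bits(3, mu=0.0, sigma=0.5))
```

The better the context model predicts a value, the more probability mass falls on the correct bin and the fewer bits the coder spends, which is why prediction quality translates directly into compression ratio.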

These methods consistently match or surpass the rate-distortion (RD) trade-offs of prior optimization-based codecs while being one to two orders of magnitude faster, and further boost total compression when coupled with aggressive spatial pruning or coarser attribute quantization.

4. Advanced Statistical and Context-Based Entropy Modeling

Accurate modeling of attribute statistics is central to high-ratio, fast-encoding FCGS. Channels are found to be largely independent, with the SH AC coefficients (the higher-order terms beyond the DC color) following Laplace distributions, and rotation, scaling, and opacity fitting low-component Gaussian mixture models (GMMs) (Huang et al., 13 Aug 2025).

A factorized, parametric entropy-coding pipeline then applies:

Per-channel parameter estimation (Laplace for SH AC coefficients, EM-trained GMM for the other attributes).
Adaptive min-max quantization per group, with precision derived empirically from bit-sensitivity and visual distortion curves.
Probability mass function (PMF)–based arithmetic encoding.
External point cloud codecs (e.g., G-PCC) for geometry and low-dimensional DC color.
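The first three steps of this pipeline can be sketched for a single Laplace-distributed channel as follows. The sketch assumes a maximum-likelihood Laplace fit (median and mean absolute deviation), uniform min-max quantization, and a PMF renormalized over the finite quantization range; all function names are hypothetical, and the GMM path and the arithmetic coder itself are omitted.

```python
import math
import statistics


def minmax_quantize(values, bits):
    """Uniform quantization of a channel onto 2**bits levels over [min, max]."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    step = (hi - lo) / levels if hi > lo else 1.0
    return [round((v - lo) / step) for v in values], lo, step


def fit_laplace(values):
    """Maximum-likelihood Laplace fit: location = median, scale = mean |v - mu|."""
    mu = statistics.median(values)
    b = sum(abs(v - mu) for v in values) / len(values)
    return mu, max(b, 1e-9)


def laplace_cdf(x, mu, b):
    if x < mu:
        return 0.5 * math.exp((x - mu) / b)
    return 1.0 - 0.5 * math.exp(-(x - mu) / b)


def laplace_pmf(num_levels, lo, step, mu, b):
    """Probability of each quantization bin, renormalized to sum to one."""
    raw = []
    for i in range(num_levels):
        left, right = lo + (i - 0.5) * step, lo + (i + 0.5) * step
        raw.append(max(laplace_cdf(right, mu, b) - laplace_cdf(left, mu, b), 1e-12))
    total = sum(raw)
    return [p / total for p in raw]


def ideal_code_length_bits(codes, pmf):
    """Bits an ideal arithmetic coder would spend on the codes under this PMF."""
    return sum(-math.log2(pmf[c]) for c in codes)


if __name__ == "__main__":
    channel = [-1.0, -0.2, 0.0, 0.0, 0.1, 0.3, 1.0]
    codes, lo, step = minmax_quantize(channel, bits=4)
    mu, b = fit_laplace(channel)
    pmf = laplace_pmf(16, lo, step, mu, b)
    print(ideal_code_length_bits(codes, pmf))
```

Because only a handful of parameters per channel (range, location, scale) must be transmitted alongside the bitstream, the side information is negligible, and each channel can be processed independently and in parallel.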

Encoding and decoding are highly parallel: encoding takes 3–4 s per million splats and decoding completes in under a second on modern hardware, enabling online compression for streaming or large-scene management (Huang et al., 13 Aug 2025).

5. Voxelization, Transform Coding, and Point Cloud Approaches

Transform-coding approaches exploit point cloud compression tools and adaptive voxelization tailored to Gaussian geometry statistics (Wang et al., 30 May 2025, Wang et al., 21 May 2025). Key elements include:

Adaptive voxelization: Gaussian centers $\mu_i$ are partitioned adaptively by per-object scale, local density, and dispersion, minimizing the proxy 2-Wasserstein error for merged Gaussians and halving the leaf set compared to uniform grids.
Attribute initialization and fine-tuning: After merging/culling, the per-voxel attributes (covariance, color, opacity) are quickly retrained with positions held fixed, recovering the original PSNR in fewer than 10k iterations (Wang et al., 30 May 2025).
Transform and entropy coding: Positions, SH, and opacity are quantized using G-PCC octree and RAHT transforms, with per-channel quantization for bitrate targeting.
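The merging step behind voxelization can be illustrated with a simplified sketch. It makes several assumptions that depart from the adaptive scheme described above: a uniform grid instead of adaptive partitioning, axis-aligned (diagonal) covariances, opacity-weighted moment matching as the merge rule, and a clamped opacity sum as a placeholder. For diagonal covariances, the squared 2-Wasserstein distance between two Gaussians reduces to the squared distance between means plus the squared distance between standard-deviation vectors, which is what `w2_squared` computes. All names are hypothetical.

```python
import math
from collections import defaultdict


def voxel_key(center, voxel_size):
    """Integer grid cell containing a Gaussian center (uniform grid assumed)."""
    return tuple(math.floor(c / voxel_size) for c in center)


def merge_group(splats):
    """Opacity-weighted moment matching of diagonal-covariance Gaussians."""
    weights = [s["opacity"] for s in splats]
    total = sum(weights)
    mean = [
        sum(w * s["mu"][d] for w, s in zip(weights, splats)) / total
        for d in range(3)
    ]
    var = [
        sum(w * (s["sigma"][d] ** 2 + (s["mu"][d] - mean[d]) ** 2)
            for w, s in zip(weights, splats)) / total
        for d in range(3)
    ]
    # Placeholder opacity rule (an assumption): clamp the summed opacity to 1.
    return {"mu": mean, "sigma": [math.sqrt(v) for v in var],
            "opacity": min(1.0, total)}


def w2_squared(a, b):
    """Squared 2-Wasserstein distance between two diagonal Gaussians."""
    return (sum((x - y) ** 2 for x, y in zip(a["mu"], b["mu"]))
            + sum((x - y) ** 2 for x, y in zip(a["sigma"], b["sigma"])))


def voxelize(splats, voxel_size):
    """Merge all Gaussians whose centers fall in the same grid cell."""
    groups = defaultdict(list)
    for s in splats:
        groups[voxel_key(s["mu"], voxel_size)].append(s)
    return [merge_group(g) for g in groups.values()]
```

A proxy error for a candidate merge can then be taken as the opacity-weighted sum of `w2_squared` between each original Gaussian and its merged replacement; an adaptive scheme would split voxels where that error is large and coarsen them where it is small.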

Such methods achieve 10–15% bitrate reduction over naïve voxelization and 2–3 dB PSNR gains over generic post-training codecs (Wang et al., 30 May 2025).

AI-based point cloud geometry codecs (e.g., GausPcgc, trained on Gaussian distributions) reduce geometry size by $\sim 8\%$, with encoding/decoding latency under 1 s, and are plug-and-play with standard attribute codecs (Wang et al., 21 May 2025).

6. Specialized and Deployment-Focused Fast Compression Approaches

Variants focus on extreme speed, training-free operation (no retraining or per-scene optimization), and flexible user-specified RD constraints—targeted at mobile and edge deployment. Representative methods include (Tian et al., 9 Jul 2025):

Attribute-Discriminative Pruning (ADP): Prunes least-significant Gaussians and/or high-dimensional SH coefficient attributes using a global significance score derived from opacity and spatial scale.
Mixed-Precision Quantization (MPQ): Each attribute channel is assigned INT4 or INT8 depending on sensitivity to PSNR drop.
Fast Online Adaptation (FOA): A design-space search matches user targets (PSNR or bitrate), requiring only seconds per scene.
No training or large-scale datasets required: All steps operate on the pre-trained model, typically compressing a 5.8M-Gaussian scene in ~20 s to under 41 MB (95% reduction, $<1$ dB PSNR loss), with similar performance on mobile devices.
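The pruning and mixed-precision steps in the list above can be sketched as follows. The exact significance score and sensitivity test of the cited method are not given here, so this sketch substitutes simple stand-ins: opacity times the product of the three scale values as the score, and reconstruction MSE (rather than rendered PSNR) as the sensitivity measure. All names and thresholds are hypothetical.

```python
def significance(splat):
    """Stand-in score (an assumption): opacity times the scale product."""
    sx, sy, sz = splat["scale"]
    return splat["opacity"] * sx * sy * sz


def prune(splats, keep_ratio):
    """Keep the most significant fraction of Gaussians."""
    keep = max(1, round(len(splats) * keep_ratio))
    return sorted(splats, key=significance, reverse=True)[:keep]


def quantize_channel(values, bits):
    """Uniform min-max quantization; returns integer codes and reconstruction."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / step) for v in values]
    return codes, [lo + c * step for c in codes]


def choose_bits(values, max_mse):
    """Pick INT4 when its reconstruction error fits the budget, else INT8."""
    _, recon = quantize_channel(values, 4)
    mse = sum((v - r) ** 2 for v, r in zip(values, recon)) / len(values)
    return 4 if mse <= max_mse else 8


if __name__ == "__main__":
    ramp = [i / 100 for i in range(101)]
    print(choose_bits(ramp, max_mse=1e-2))  # 4: a loose budget tolerates INT4
    print(choose_bits(ramp, max_mse=1e-6))  # 8: a tight budget forces INT8
```

An online adaptation loop in the spirit of FOA would sweep `keep_ratio` and the per-channel error budgets until a user-specified size or quality target is met. Since every step is a closed-form pass over the pre-trained parameters, no gradient computation is involved.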

Deployments on embedded systems demonstrate this approach is up to $2\times$ faster than previous training-free methods and does not require PyTorch/TF backends (Tian et al., 9 Jul 2025).

7. Applications, Extensions, and Future Directions

FCGS methods underpin practical broadcasting, storage, and streaming of large-scale 3DGS models for AR/VR, web-based visualization, and mobile rendering. Compatibility with existing viewers is preserved via minimal pipeline changes, often requiring only code index lookups or simple de-quantization at inference (Wang et al., 3 Apr 2025, Lee et al., 6 Jan 2025).

Challenges and frontiers for future development include:

Attribute quantization for spatial coordinates and opacity, which remain in full precision in most FCGS schemes to avoid visual artifacts.
Handling dynamic and deformable scenes: Early work on temporal prediction modules, hash-table motion caching, and group-of-frame compressors has shown initial success, with over $40\times$ compression and real-time decoding on 3DGS video/sequence data (Zhang et al., 8 Jul 2025, Xie et al., 2 Sep 2025, Wang et al., 11 Oct 2025).
Improved context and entropy priors: Search for 3DGS-specific hyperpriors and spatial-grid designs to capture residual redundancies beyond current spatial or channel contexts (Chen et al., 2024).
Standardization and interoperability: Increased focus on alignment with established codecs (HEVC, G-PCC) and integration with conventional video processing pipelines (Lee et al., 6 Jan 2025, Xie et al., 2 Sep 2025).

In summary, FCGS embodies a broad family of algorithmic innovations allowing large-scale 3D Gaussian Splatting to be compressed by $20$–$100\times$ at negligible or gracefully controlled quality loss, all with compute and memory requirements suitable for real-time, resource-constrained, or web- and mobile-based deployments (Wang et al., 3 Apr 2025, Chen et al., 2024, Liu et al., 30 Nov 2025, Tian et al., 9 Jul 2025).