
SqueezeNet Fire Module: Efficient CNN Block

Updated 4 January 2026
  • SqueezeNet Fire Module is a convolutional block that uses a squeeze-and-expand strategy to drastically reduce parameters while preserving accuracy.
  • It balances 1×1 and 3×3 convolutions to minimize computational cost and memory usage, making it ideal for mobile and embedded devices.
  • Variants like Fire SSD and Wide Fire Module further optimize grouping and parameter scaling, achieving competitive performance with significantly fewer resources.

The SqueezeNet Fire Module is a convolutional building block that enables AlexNet-level accuracy with 50× fewer parameters via a channel-efficient microarchitecture. First introduced in SqueezeNet (Iandola et al., 2016) and later adapted in variants such as Fire SSD (Liau et al., 2018), it achieves drastic reductions in parameter count and memory footprint by aggressively replacing expensive 3×3 convolutions with 1×1 convolutions and minimizing the channels flowing into any remaining 3×3 filters. The design combines expressivity with compressibility and is especially well-suited for resource-constrained devices and efficient deployment scenarios.

1. Motivation and Design Principles

Modern CNNs place the majority of their parameters in 3×3 convolutional layers, where a single 3×3 convolution with $C_{in}$ input and $C_{out}$ output channels yields $9 \cdot C_{in} \cdot C_{out}$ weights. SqueezeNet targets model efficiency via two explicit strategies: (1) replacing 3×3 filters with 1×1 filters where feasible (a 1×1 filter has 9× fewer weights than a 3×3 filter), and (2) reducing the number of input channels to the remaining 3×3 convolutions. The Fire module operationalizes these strategies by first "squeezing" input channels via 1×1 convolutions, then "expanding" with a parallel combination of 1×1 and 3×3 filters, maintaining spatial coverage with tightly controlled parameter budgets (Iandola et al., 2016).
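The two strategies above reduce to simple weight arithmetic. A minimal sketch (the channel counts are hypothetical, chosen only for illustration):

```python
# Parameter count of a single k x k convolution layer (no bias term),
# illustrating SqueezeNet strategy (1): a 1x1 filter carries 9x fewer
# weights than a 3x3 filter at the same channel widths.
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of a k x k convolution: c_in * c_out * k * k."""
    return c_in * c_out * k * k

c_in, c_out = 256, 256                 # hypothetical channel widths
p3 = conv_params(c_in, c_out, 3)       # 589_824 weights
p1 = conv_params(c_in, c_out, 1)       # 65_536 weights
print(p3 // p1)                        # 9
```

Strategy (2) attacks the same product from the other side: shrinking `c_in` into the 3×3 layer scales its cost linearly.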

2. Formal Architecture of the Fire Module

A Fire module consists of two ordered stages:

  • Squeeze Layer: Performs a 1×1 convolution across $C_{in}$ input channels, producing $s_1$ output channels. This layer compresses the representation.
    • Parameters: $C_{in} \cdot s_1$
  • Expand Layer: Feeds the full $s_1$-channel output into two parallel branches:
    • (i) 1×1 convolution with $e_1$ filters: $s_1 \cdot e_1$ parameters.
    • (ii) 3×3 convolution with $e_3$ filters (padding = 1): $9 \cdot s_1 \cdot e_3$ parameters.
    • The outputs of both branches are concatenated channel-wise, yielding $e_1 + e_3$ output channels.

The total parameter count per module is:

$P = C_{in} \cdot s_1 + s_1 \cdot e_1 + 9 \cdot s_1 \cdot e_3 = s_1 \cdot (C_{in} + e_1 + 9 e_3)$

Scaling $s_1$ linearly increases all three terms. Shifting expand filters from 3×3 ($e_3$) to 1×1 ($e_1$) substantially reduces $P$ due to the factor-of-9 savings (Iandola et al., 2016).
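The per-module formula can be checked directly. A minimal sketch (the input width $C_{in} = 128$ here is a hypothetical value, not taken from SqueezeNet):

```python
# Parameter count of one Fire module, following
#   P = C_in*s1 + s1*e1 + 9*s1*e3 = s1 * (C_in + e1 + 9*e3).
def fire_params(c_in: int, s1: int, e1: int, e3: int) -> int:
    squeeze = c_in * s1      # 1x1 squeeze layer
    expand1 = s1 * e1        # 1x1 expand branch
    expand3 = 9 * s1 * e3    # 3x3 expand branch
    return squeeze + expand1 + expand3

# Hypothetical module: 128 input channels, s1=16, e1=e3=64.
print(fire_params(128, 16, 64, 64))   # 12288 = 16 * (128 + 64 + 9*64)
```

Note how the 3×3 branch dominates the total even at an even $e_1$:$e_3$ split, which is why shifting filters toward the 1×1 branch pays off so quickly.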

3. Hyperparameters and Instantiation in SqueezeNet and Fire SSD

SqueezeNet deploys eight Fire modules (fire2–fire9), each with preset values for $s_1$, $e_1$, $e_3$, and output channels, as shown below:

Module   squeeze $s_1$   expand $e_1$   expand $e_3$   output channels
fire2    16              64             64             128
fire3    16              64             64             128
fire4    32              128            128            256
fire5    32              128            128            256
fire6    48              192            192            384
fire7    48              192            192            384
fire8    64              256            256            512
fire9    64              256            256            512

Increasing the squeeze ratio ($SR = s_1/(e_1 + e_3)$, the number of squeeze filters relative to expand filters) grows the parameter count and model size, with accuracy gains saturating near $SR = 0.75$. Distributing expand filters equally between $e_1$ and $e_3$ (an approximate 50:50 split) is empirically near-optimal for accuracy, but adding more 3×3 filters gives diminishing returns due to their 9× parameter cost (Iandola et al., 2016).

Fire SSD adapts the Fire module with $S = C_{in}/4$ and $E_1 = E_3 = C_{in}/2$, adding group convolutions in the expand branches in its Wide Fire Module variant (see Section 4). Without grouping, this instantiation has $P_{Fire} = 1.5\,C_{in}^2$ parameters, $1/6$ the parameter and FLOP cost of a plain $C_{in} \to C_{in}$ 3×3 convolution (Liau et al., 2018):

  • Original Fire: $1.5\,C_{in}^2$ parameters, and the same count in MACs per spatial position.
  • 3×3 Conv: $9\,C_{in}^2$ parameters and MACs.
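The $1/6$ ratio follows directly from the widths. A quick check of the arithmetic, using $C_{in} = 512$ as an example width:

```python
# Fire SSD instantiation: S = C_in/4, E1 = E3 = C_in/2, so
#   P_Fire = C_in*S + S*E1 + 9*S*E3
#          = C_in^2 * (1/4 + 1/8 + 9/8) = 1.5 * C_in^2,
# versus 9 * C_in^2 for a plain C_in -> C_in 3x3 convolution.
def fire_ssd_params(c_in: int) -> int:
    s = c_in // 4
    e1 = e3 = c_in // 2
    return c_in * s + s * e1 + 9 * s * e3

c_in = 512
p_fire = fire_ssd_params(c_in)   # 393_216 = 1.5 * 512**2
p_3x3 = 9 * c_in * c_in          # 2_359_296
print(p_3x3 // p_fire)           # 6
```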

4. Wide Fire Module (WFM) Variant and Computational Analysis

Fire SSD introduced the Wide Fire Module, further improving efficiency by replacing both expand branches with group convolutions:

  • Architecture:
    • Squeeze: $1\times 1$ conv, $S = C_{in}/4$.
    • Expand 1×1: group conv ($G_1 = 2$ groups), $E_1 = C_{in}/2$.
    • Expand 3×3: group conv ($G_3 = 16$ groups), $E_3 = C_{in}/2$, padding = 1.
    • Concatenation yields $C_{in}$ output channels.
  • Parameter Formula:

    $P_{WFM} = C_{in} \cdot S + \dfrac{S \cdot E_1}{G_1} + \dfrac{9 \, S \cdot E_3}{G_3}$

  • Efficiency Example ($C_{in} = 512$):
    • WFM: 100,352 parameters vs. classic Fire: 393,216 (a 74.5% reduction).

Group convolution in the expand branches prevents over-fragmentation (maintaining $G_1 = 2$ in the 1×1 branch) and balances receptive field against grouping ($G_3 = 16$ in the 3×3 branch), yielding parameter and MAC reductions while preserving accuracy (Liau et al., 2018).
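The WFM formula and the $C_{in} = 512$ example above can be verified numerically; a group convolution divides a branch's weight count by its group count:

```python
# Wide Fire Module parameter count:
#   P_WFM = C_in*S + (S*E1)/G1 + 9*(S*E3)/G3
# with S = C_in/4, E1 = E3 = C_in/2, G1 = 2, G3 = 16.
def wfm_params(c_in: int, g1: int = 2, g3: int = 16) -> int:
    s = c_in // 4
    e1 = e3 = c_in // 2
    return c_in * s + (s * e1) // g1 + (9 * s * e3) // g3

classic = fire = 512 * 128 + 128 * 256 + 9 * 128 * 256   # 393_216 (ungrouped Fire)
wide = wfm_params(512)                                   # 100_352
print(round(100 * (1 - wide / classic), 1))              # 74.5 (% reduction)
```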

5. Empirical and Quantitative Performance Assessment

SqueezeNet achieves an uncompressed model size of ~4.8 MB (≈50× smaller than AlexNet), compressible to 0.47 MB via pruning and quantization (Deep Compression), and can be stored entirely on-chip in FPGA deployments, removing off-chip bandwidth constraints (Iandola et al., 2016). Reported performance trade-offs include:

  • Fewer parameters yield less DRAM traffic and faster inference on CPUs/GPUs.
  • Squeeze ratio sweeps ($SR = 0.125 \to 0.75$): top-5 ImageNet accuracy rises from 80.3% to 86.0%, and model size from 4.8 MB to 19 MB.
  • 3×3 filter fraction sweeps show accuracy plateaus near 50% split.
  • Bypass connections around Fire modules increase top-1 accuracy from 57.5% to 60.4% at zero extra parameters.

In Fire SSD, quantitative results include:

  • Fire SSD: 2.67 G MACs, 7.13 M params, 70.5 mAP (Pascal VOC 2007).
  • SSD+SqueezeNet: 1.18 G MACs, 5.53 M params, 64.3 mAP.
  • SSD+MobileNet: 1.15 G MACs, 5.77 M params, 68.0 mAP.
  • YOLO v2: 8.36 G MACs, 67.1 M params, 69.0 mAP.
  • Tiny YOLO v2: 3.49 G MACs, 15.9 M params, 57.1 mAP.

Inference speed (Intel NUC, batch=1, Fire SSD): 31.7 FPS (CPU, OpenVINO), 39.8 FPS (GPU, FP16), with model size ≈28 MB (Liau et al., 2018).

6. Applications and Guidelines for Resource-Constrained Deployment

Fire modules are exposed via three principal hyperparameters ($s_1$, $e_1$, $e_3$), allowing transparent accuracy–parameter trade-offs. For different deployment scenarios:

  • Mobile/embedded:
    • Low squeeze ratio ($SR = 0.125$–$0.25$) to minimize model size.
    • $e_3 \approx 50\%$ of expand filters: enough spatial coverage, minimal parameter overhead.
    • Optional residual bypass to recover accuracy at no parameter cost.
  • FPGA/ASIC:
    • Keep all weights on chip; target models under 8 MB.
    • Favor 1×1 filters to lower multiplier resources and power; aggressively quantize/prune 3×3 branch.
  • Low-latency inference:
    • Maximize 1×1 convolutions for matrix kernel efficiency.
    • Empirically, total expand size ($e_1 + e_3$) saturates accuracy at 256–512 in middle layers (Iandola et al., 2016).

Peripheral enhancements in Fire SSD include Dynamic Residual Multi-box Detection (stacking and connecting WFMs for gradient flow) and a Normalization and Dropout Module (batch normalization and dropout after each branch), which restore and exceed the accuracy lost to aggressive grouping (Liau et al., 2018).

7. Context and Extensions

The SqueezeNet Fire module has served as a foundation for subsequent "lightweight" architectures, notably Fire SSD, which adapts the module with grouped convolutions to bolster model cardinality and further reduce computational burden on edge devices. This adaptation preserves the central squeeze–expand motif while implementing advancements grounded in efficient design rules (e.g., balancing group counts and receptive fields, optimizing macroarchitecture), demonstrating the module’s versatility and extensibility across disparate CV pipelines (Iandola et al., 2016, Liau et al., 2018).

A plausible implication is that the parametrically transparent, efficiency-tuned framework of Fire modules sets a precedent for ongoing innovations in memory- and compute-constrained deep learning, fostering compact, high-performance models that remain amenable to both compression and hardware specialization.
