
Spatial Continuous Feature Extractor (SCFE)

Updated 29 January 2026
  • SCFE is a neural module that creates context-rich, spatially coherent features for volumetric and surface data, ensuring continuity across geometric structures.
  • It converts 3D feature tensors into 1D sequences via resolution-aware flattening, enabling efficient long-range dependency modeling in segmentation tasks.
  • SCFE’s adaptable design—including state-space blocks and field convolution—improves segmentation metrics and intrinsic surface learning across diverse datasets.

A Spatial Continuous Feature Extractor (SCFE) is a neural module designed to produce context-rich, spatially coherent feature embeddings for high-dimensional geometric or volumetric data. SCFE architectures emphasize continuity: every location in the input domain (voxel grid or surface manifold) receives a descriptor encoding both local and global structure, while preserving geometric relationships. SCFE instances appear in volumetric segmentation frameworks, such as NeuroMamba's global context module, and in intrinsic surface learning via field convolution. This article details the principles, architectural design, empirical results, and implementation specifics of representative SCFE mechanisms in both 3D volume and surface settings.

1. SCFE in Volumetric Segmentation Architecture

In the context of neuron segmentation using 3D electron microscopy (EM) data, SCFE is implemented within the NeuroMamba framework (Jiang et al., 22 Jan 2026). Its main function is to enable long-range voxel dependency modeling—essential for reconstructing neuronal arbors—without decomposing the volume into disconnected patches, a process that often disrupts continuity. Within the Multi-Perspective Feature Interaction (MPFI) block, the SCFE is paired with the Boundary Discriminative Feature Extractor (BDFE) for a dual approach: BDFE sharpens membrane boundaries and local details, while SCFE ensures global structural coherence.

2. Architectural and Computational Design

Within NeuroMamba, SCFE receives a 3D feature tensor of shape $[D \times H \times W \times C]$ and performs resolution-aware flattening to convert the block into four distinct 1D sequences:

  • Transverse-First Scans: flattened in $(x \rightarrow y \rightarrow z)$ and $(y \rightarrow x \rightarrow z)$ order
  • Axial-First Scans: flattened in $(z \rightarrow x \rightarrow y)$ and $(z \rightarrow y \rightarrow x)$ order

Each sequence is fed into a Visual-Mamba block, a state-space-model-based layer with linear time complexity. The outputs are linearly combined using data-driven weights $\lambda_1$ (transverse scans) and $\lambda_2$ (axial scans):

$$X_{\mathrm{global}} = \lambda_{1}\,M(z^{t}_{1}) + \lambda_{1}\,M(z^{t}_{2}) + \lambda_{2}\,M(z^{a}_{1}) + \lambda_{2}\,M(z^{a}_{2})$$

The weights satisfy $\lambda_1 + \lambda_2 = 2$, with $\lambda_2 = \alpha D_{\mathrm{ani}} + \beta$ and anisotropy degree $D_{\mathrm{ani}} = R_a / R_t$ (ratio of axial to transverse voxel spacing). After summation, the 1D embedding is reshaped back to $[D \times H \times W \times C]$. This workflow preserves voxel-to-voxel relationships across the entire volume.
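The flatten-scan-combine-reshape pipeline can be sketched as below. This is an illustrative sketch, not the authors' released code: `scfe_global` is a hypothetical name, and the shared Visual-Mamba block $M$ is stubbed out as an identity function.

```python
import numpy as np

def scfe_global(x, lam1, lam2, mamba=lambda seq: seq):
    """Sketch of SCFE's resolution-aware flatten -> scan -> combine -> reshape.

    x: feature block of shape (D, H, W, C), axes (z, y, x, channels).
    mamba: stand-in for the shared Visual-Mamba block (identity here).
    """
    D, H, W, C = x.shape

    def scan(perm):
        # Flatten so the last spatial axis in `perm` varies fastest,
        # run the (stand-in) Mamba block, then restore (D, H, W, C).
        seq = x.transpose(*perm, 3).reshape(-1, C)
        seq = mamba(seq)
        shape = tuple(x.shape[p] for p in perm) + (C,)
        inv = np.argsort(perm)
        return seq.reshape(shape).transpose(*inv, 3)

    # Transverse-first scans (x->y->z and y->x->z) and
    # axial-first scans (z->x->y and z->y->x).
    z_t1 = scan((0, 1, 2))   # x fastest, then y, then z
    z_t2 = scan((0, 2, 1))   # y fastest, then x, then z
    z_a1 = scan((1, 2, 0))   # z fastest, then x, then y
    z_a2 = scan((2, 1, 0))   # z fastest, then y, then x

    # X_global = lam1*(M(z_t1) + M(z_t2)) + lam2*(M(z_a1) + M(z_a2))
    return lam1 * (z_t1 + z_t2) + lam2 * (z_a1 + z_a2)
```

With the identity stand-in, each scan round-trips exactly, so the output reduces to $2(\lambda_1 + \lambda_2)\,x$; in practice the state-space block makes each scan order contribute distinct long-range context.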

3. Resolution-Aware Scanning and Adaptivity

SCFE’s scanning strategy directly addresses the anisotropy of EM imaging (e.g., $z$-spacing $\neq$ $x,y$-spacing). By computing $D_{\mathrm{ani}}$ and adjusting $\lambda_2$, the module emphasizes the lower-resolution axis when necessary. For instance, in highly anisotropic datasets (AC3/AC4: $z \approx 29$ nm, $x,y \approx 6$ nm), SCFE shifts weighting toward the axial scans. In isotropic datasets (FIB25), weights are split evenly. This mechanism allows SCFE to adapt flexibly to arbitrary voxel resolutions, yielding robust performance across diverse datasets.
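A minimal sketch of the resolution-aware weighting, assuming only the linear rule stated above ($\lambda_2 = \alpha D_{\mathrm{ani}} + \beta$, $\lambda_1 + \lambda_2 = 2$); `scan_weights` is a hypothetical helper name:

```python
def scan_weights(r_axial, r_transverse, alpha=0.04, beta=0.6):
    # Anisotropy degree D_ani = R_a / R_t (axial vs. transverse voxel spacing).
    d_ani = r_axial / r_transverse
    # lambda_2 = alpha * D_ani + beta, constrained so lambda_1 + lambda_2 = 2.
    lam2 = alpha * d_ani + beta
    lam1 = 2.0 - lam2
    return lam1, lam2
```

For AC3/AC4-like spacings ($R_a = 29$ nm, $R_t = 6$ nm) this yields a larger axial weight than in the isotropic case, matching the adaptive behavior described above.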

4. Integration with State-Space Models and Network Fusion

The 1D sequences produced by SCFE are passed through shared Visual-Mamba layers. These layers employ selective scan, state-space filtering, and a small feed-forward network with residual connections. This design enables patch-free global modeling. In the full MPFI architecture, SCFE’s global features ($X_{\mathrm{global}}$) are fused with BDFE’s local features ($X_{\mathrm{local}}$) in the Cross Feature Interaction (CFI) fusion stage; both feature sets are computed in parallel.

5. SCFE on Surfaces: Intrinsic Feature Extraction via Field Convolution

On curved surfaces, SCFE functionality is realized via field convolution (Mitchel et al., 2021). The operator produces continuous feature descriptors at every point $p$ of a 2D Riemannian manifold $M$. Rather than gathering neighbor features into one local frame, every neighbor $q$ uses its own tangent-plane chart to “scatter” information toward $p$. The process involves parallel transport, filter weighting using geodesic coordinates, and summation over the local geodesic ball $B_\varepsilon(p)$. The field convolution is defined by:

$$(X * f)(p) = \int_M \rho_q\, e^{i(\phi_q + \phi_{pq})}\, f\!\left(r_{qp}\, e^{i(\theta_{qp} - \phi_q)}\right)\, dq$$

This approach—intrinsic and isometry-invariant—can be discretized on triangle meshes for practical implementation, using band-limited filter parameterizations and complex-valued descriptors.
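On a triangle mesh, the integral becomes a sum over neighbors in the geodesic ball, with per-vertex area weights. The sketch below is a direct discretization of the formula above for a single evaluation point; all argument names are illustrative, and a real implementation would use band-limited filter parameterizations rather than an arbitrary callable:

```python
import numpy as np

def field_conv_at_p(rho, phi, phi_pq, r_qp, theta_qp, areas, filt):
    """Discrete field convolution at one point p (illustrative sketch).

    rho, phi: magnitude and angle of the tangent-vector feature at each
        neighbor q, expressed in q's own frame (X(q) = rho_q * e^{i phi_q}).
    phi_pq: frame-transport angle from q's chart to p's chart.
    r_qp, theta_qp: log (geodesic polar) coordinates of p in q's chart.
    areas: per-neighbor integration weights (discretized dq).
    filt: complex-valued filter evaluated at z = r * e^{i theta}.
    """
    out = 0.0 + 0.0j
    for k in range(len(rho)):
        # rho_q e^{i(phi_q + phi_pq)}: feature transported into p's frame.
        transported = rho[k] * np.exp(1j * (phi[k] + phi_pq[k]))
        # f(r_qp e^{i(theta_qp - phi_q)}): filter rotated by the feature angle.
        weight = filt(r_qp[k] * np.exp(1j * (theta_qp[k] - phi[k])))
        out += transported * weight * areas[k]
    return out
```

With unit features, zero angles, and a constant filter, the sum reduces to the total neighborhood area, which is a convenient sanity check for a mesh implementation.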

6. Empirical Results and Hyperparameter Choices

Empirical testing of SCFE in NeuroMamba demonstrates quantitative gains in segmentation quality. Ablation studies indicate:

  • Adding SCFE to a baseline reduces Variation of Information (VI) from $0.895 \rightarrow 0.892$ and adapted Rand error (ARAND) from $0.163 \rightarrow 0.158$
  • Combined SCFE+BDFE yields VI $0.865$ and ARAND $0.142$
  • Including anisotropy priors ($\alpha$, $\beta$) reduces ARAND by $0.002$–$0.004$ on several datasets

Recommended hyperparameters are $\alpha = 0.04$, $\beta = 0.6$; typical block shapes are $[18, 160, 160]$, and Mamba channel width is $\approx 256$. Computational overhead remains linear in $N = D \cdot H \cdot W$; full NeuroMamba (including SCFE) reports $\sim 2.1$M parameters, $220.8$G FLOPs, and $0.059$s latency on a V100.

For field convolution, recommended parameters include support radius $\varepsilon = 0.2$ (segmentation), radial bins $R = 6$–$12$, and band limit $B = 1$–$2$. The implementation leverages area-normalized weights, complex ReLU, and backpropagation through sum-based convolutions. Downsampling and vector-heat-based geodesic computations improve scalability on large meshes.

7. Practical Considerations and Reproducibility

SCFE modules avoid patch-based tokenization, relying instead on a global flatten-and-reshape per block. Memory usage scales predictably with $D \cdot H \cdot W \cdot C$ and can be controlled via block size. Tiling with $50\%$ overlap, the Adam optimizer ($\mathrm{lr} = 10^{-4}$), and accurate axis calibration ($R_a$, $R_t$) are recommended for reproducibility (Jiang et al., 22 Jan 2026). For surfaces, efficient neighborhood search and tangent-vector lifting are critical (Mitchel et al., 2021). The patch-free design and intrinsic convolutional structure ensure robust continuity across varied geometric and resolution scenarios.
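For the $50\%$-overlap tiling, the start offsets along one axis can be computed with a small helper. This is a sketch under the stated overlap convention; `tile_starts` is a hypothetical name, not part of the released code:

```python
def tile_starts(length, tile, overlap=0.5):
    # Start offsets for overlapping tiles along one axis; the final
    # tile is shifted back so it ends exactly at the volume boundary.
    stride = max(1, int(tile * (1.0 - overlap)))
    starts = list(range(0, max(length - tile, 0) + 1, stride))
    if starts[-1] + tile < length:
        starts.append(length - tile)
    return starts
```

Applied per axis (e.g., to a $[18, 160, 160]$ block over a larger volume), the resulting Cartesian product of offsets covers the volume with half-tile overlap, and predictions in overlapping regions can be blended or stitched.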

Table: SCFE Mechanisms in Volume vs. Surface Settings

  Domain         | SCFE Mechanism         | Continuity Model
  3D EM volumes  | Resolution-aware Mamba | Voxel global context
  2D surfaces    | Field convolution      | Geodesic-intrinsic

SCFE modules are central to state-of-the-art architectures in both volumetric neuron segmentation and geometric surface learning, advancing continuity and adaptivity in neural feature modeling.
