
CloudMamba: SSM Architecture for Point Clouds

Updated 12 November 2025
  • CloudMamba is a state space model (SSM)-based architecture that serializes unordered 3D point clouds to enable enhanced geometric perception and long-range dependency modeling.
  • By chaining forward and backward SSM scans via the ChainedMamba block, it overcomes limitations of standard Bi-Mamba, providing richer context-aware feature aggregation.
  • Integrating the grouped selective state space model (GS6) allows for parameter-efficient processing that mitigates overfitting while maintaining linear computational complexity.

CloudMamba is a state space model (SSM)-based architecture designed for point cloud analysis, specifically addressing limitations in serialization of unordered data, geometric perception, and model overfitting common in vanilla Mamba-based pipelines. By introducing novel grouping, serialization, and scan-chaining methods—most notably the ChainedMamba block and grouped selective state space model (GS6)—CloudMamba achieves state-of-the-art performance on standard benchmarks with substantially improved long-range geometric modeling and efficiency (Qu et al., 11 Nov 2025).

1. Motivations and Background

Mamba and its selective state space model core (S6) have been adopted for point cloud tasks due to their linear complexity and capacity for long-range dependency modeling. However, directly applying vanilla Mamba networks to point clouds introduces critical shortcomings:

  • Imperfect point cloud serialization: Point clouds, inherently unordered, do not map naturally onto the 1D sequential format required for SSMs.
  • Insufficient high-level geometric perception: Standard bidirectional SSMs (bi-Mamba) provide limited geometric aggregation.
  • Overfitting in S6: The parameterization of the selective SSM in high-capacity settings can induce overfitting.

CloudMamba addresses these deficits through algorithmic innovations in serialization, feature merging, state-space scan chaining, and parameter sharing.

2. Sequence Expanding and Merging in Point Cloud Serialization

To accommodate the intrinsic permutation invariance of point clouds, CloudMamba employs sequence expanding and sequence merging mechanisms:

  • Sequence Expanding: Each 3D point cloud is serialized along each coordinate axis (x, y, z) independently, generating three parallel ordered sequences per cloud. This preserves axis-based proximity information and imposes a pseudo-causal order suitable for SSM scans.
  • Sequence Merging: After SSMs process these axis-ordered sequences, CloudMamba merges the resultant high-order features from the three channels. Fusion is accomplished through parameter-free operations that causally integrate features, enabling the model to construct global, permutation-invariant representations from axis-local sequences.

This strategy ensures stable adaptation of unordered point sets to the causal inductive bias of the Mamba SSM without introducing spurious ordering artifacts or dependence on learnable serialization.
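The expand-and-merge pipeline can be sketched as follows. This is a minimal illustration, not the paper's implementation: the SSM scan is replaced by a cumulative-sum stand-in, and a parameter-free element-wise maximum is assumed as the fusion operator (the paper does not specify the exact fusion rule here).

```python
import numpy as np

def expand_and_merge(points, features):
    """Illustrative sketch of sequence expanding/merging (assumptions:
    cumsum stands in for the SSM scan, max is the parameter-free fusion).

    points:   (N, 3) array of xyz coordinates
    features: (N, D) array of per-point features
    """
    merged = np.full_like(features, -np.inf)
    for axis in range(3):                      # serialize along x, y, z
        order = np.argsort(points[:, axis])    # pseudo-causal axis order
        seq = features[order]                  # axis-ordered sequence
        processed = seq.cumsum(axis=0)         # stand-in for the SSM scan
        inverse = np.argsort(order)            # map back to original points
        restored = processed[inverse]
        merged = np.maximum(merged, restored)  # parameter-free fusion
    return merged
```

Because the ordering is derived from coordinates alone, the result is the same (up to the input permutation) no matter how the points are initially listed, which is the permutation-invariance property the serialization is meant to preserve.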

3. ChainedMamba: Chained Bi-Directional SSM Scans

A central innovation in CloudMamba is ChainedMamba, which redefines bidirectional SSM processing for point cloud analysis:

  • Standard Parallel Bi-Mamba: Processes a sequence in forward and backward directions independently, fusing the two outputs by summation or concatenation.
  • ChainedMamba: The forward SSM scan produces a sequence of high-order embeddings, which are then reversed and input as the "raw features" to the backward SSM scan. Thus, the backward pass reasons not over raw inputs but over contextually aggregated representations.

The state-space equations for ChainedMamba are as follows:

Forward pass for $t = 1, \dots, L$:

$$h^{\mathrm{f}}_t = \overline{A}^{\mathrm{f}}_t h^{\mathrm{f}}_{t-1} + \overline{B}^{\mathrm{f}}_t x_t, \qquad \hat{y}^{\mathrm{f}}_t = \overline{C}^{\mathrm{f}}_t h^{\mathrm{f}}_t$$

Backward pass on the reversed forward outputs $\hat{Y}^{\mathrm{f}}$:

$$r_t = \hat{y}^{\mathrm{f}}_{L-t+1}, \qquad h^{\mathrm{b}}_t = \overline{A}^{\mathrm{b}}_t h^{\mathrm{b}}_{t-1} + \overline{B}^{\mathrm{b}}_t r_t, \qquad \hat{y}^{\mathrm{b}}_t = \overline{C}^{\mathrm{b}}_t h^{\mathrm{b}}_t$$

Output, restored to the original order: $y_i = \hat{y}^{\mathrm{b}}_{L-i+1}$

By chaining in this way, ChainedMamba enables the backward SSM to mix and propagate information from contextually enhanced geometric features. For example, points scanned in the order $a < b < c < d$ along an axis will have the backward pass operate on forward-aggregated representations, producing richer descriptors.

The following pseudocode summarizes the computation for one axis:

def ChainedMamba(X):         # X: L × D
    Hf = SSM_forward(X)      # Forward SSM scan, Hf[i] = hat_y^f_i
    R = reverse(Hf)          # Reverse the sequence
    Hb = SSM_backward(R)     # Backward SSM scan, Hb[t] = hat_y^b_t
    Y = reverse(Hb)          # Restore original order
    return Y
Notably, both SSM passes use either S6 or GS6 updates but do not share hidden state; each scan is freshly initialized.
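The recurrences above can be made concrete with a minimal NumPy sketch. For readability it uses fixed matrices A, B, C per scan; in actual S6/GS6 these are input-dependent (selective), so this only illustrates the chaining structure, not the selective mechanism.

```python
import numpy as np

def ssm_scan(X, A, B, C):
    """One causal SSM scan: h_t = A h_{t-1} + B x_t,  y_t = C h_t.
    A, B, C are fixed here for simplicity; S6/GS6 make them
    input-dependent."""
    L = X.shape[0]
    h = np.zeros(A.shape[0])
    Y = np.empty((L, C.shape[0]))
    for t in range(L):
        h = A @ h + B @ X[t]
        Y[t] = C @ h
    return Y

def chained_mamba(X, fwd, bwd):
    """Chained bi-directional scan: the backward scan consumes the
    reversed *outputs* of the forward scan, not the raw inputs.
    Each scan is freshly initialized (no shared hidden state)."""
    Yf = ssm_scan(X, *fwd)   # forward pass over raw features
    R = Yf[::-1]             # reverse the forward outputs
    Yb = ssm_scan(R, *bwd)   # backward pass on aggregated features
    return Yb[::-1]          # restore original order
```

A useful sanity check is that, unlike a single causal scan, the chained output at the first position depends on the last input point, since the backward pass sees the full forward-aggregated sequence.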

4. Grouped Selective State Space Model (GS6)

To mitigate overfitting inherent to the selective state space model (S6), CloudMamba introduces GS6, which imposes parameter sharing strategies:

  • Grouped parameter sharing: Instead of fully independent S6 parameters at each position, groups of points share parameters, reducing effective capacity while preserving selective modeling. This approach maintains robustness against overfitting, especially in settings with limited labeled data or large model sizes.

The adoption of GS6 can be toggled as a drop-in replacement within the SSM backbone, retaining the linear complexity and causal computation characteristics of vanilla S6/Mamba.
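One way to realize grouped parameter sharing is sketched below. The grouping axis, pooling rule, and projections here are illustrative assumptions (contiguous point groups, mean pooling, linear maps W_B and W_C), not the paper's exact construction; the point is only that selective parameters are computed once per group rather than once per point.

```python
import numpy as np

def gs6_selective_params(X, W_B, W_C, group_size):
    """Hypothetical sketch of GS6-style sharing: instead of per-point
    selective projections B_t, C_t (as in S6), points in a group share
    parameters derived from the group's pooled feature.

    X: (L, D) sequence of point features; W_B, W_C: (H, D) projections.
    """
    L = X.shape[0]
    n_groups = (L + group_size - 1) // group_size
    Bs = np.empty((L, W_B.shape[0]))
    Cs = np.empty((L, W_C.shape[0]))
    for g in range(n_groups):
        sl = slice(g * group_size, min((g + 1) * group_size, L))
        pooled = X[sl].mean(axis=0)  # one descriptor per group
        Bs[sl] = W_B @ pooled        # shared across the group's points
        Cs[sl] = W_C @ pooled
    return Bs, Cs
```

Sharing one set of selective parameters across `group_size` points shrinks the effective per-position capacity by that factor while keeping the scan itself unchanged, which is consistent with the overfitting-mitigation role described above.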

5. Empirical Evaluation and Complexity

Empirical results on ModelNet40 reveal the benefits of CloudMamba’s architectural refinements:

| Architecture | ModelNet40 OA (%) | Parameters added by chaining | Complexity |
|---|---|---|---|
| Parallel Bi-Mamba (GS6) | 92.69 | 0 | Linear in $L$ |
| ChainedMamba (GS6, drop-in) | 93.65 | 0 | Linear in $L$ |

A gain of +0.96 percentage points in overall accuracy is observed solely by switching from parallel to chained bi-directional SSMs, with identical model size and computational profile. This demonstrates that high-order forward feature feedback in the backward pass measurably improves global geometric perception and aggregation.

A further advantage is that CloudMamba’s operations remain parameter-efficient due to sequence-expanding/merging, GS6 sharing, and chaining strategies. The method scales to large point clouds due to its inherent linear complexity in sequence length.

6. Relation to Broader SSM and Mamba-Based Tasks

The ChainedMamba construction is directly analogous to the "chaining" of bi-directional SSM modules in temporal sequence modeling, as found in Mamba-based multi-object tracking systems for complex nonlinear motion (Xiao et al., 2024). In both spatial (point cloud) and temporal (MOT) modalities, chaining SSM modules enables:

  • Enhanced modeling of higher-order dependencies
  • Context-aware backward reasoning
  • Measurable gains in downstream performance metrics with no increase in parameter count

A plausible implication is that such scan-chaining architectures—integrating forward and backward context at higher levels—are generally applicable whenever the base domain exhibits nontrivial structure and order must be imposed for causality.

7. Practical Considerations and Limitations

CloudMamba is a drop-in replacement for SSM/Mamba backbones in point cloud pipelines, requiring no extra parameters or specialized serialization functions. When deploying ChainedMamba, memory usage is consistent with parallel bi-Mamba, and no additional regularization or auxiliary losses are necessary beyond standard procedures (e.g., Adam optimizer with weight decay).

Potential limitations include:

  • The method relies on axis-wise sorting for sequence expanding, which assumes some correlation between axes and relevant geometry. Pathological point distributions could, in principle, weaken pseudo-causal ordering.
  • While GS6 prevents overfitting in high-capacity models, aggressive grouping could restrict local specialization for highly heterogeneous clouds.

Despite these caveats, CloudMamba attains state-of-the-art performance across benchmarks with demonstrable quantitative gains resulting solely from chaining and grouping, validating its design for scalable, geometry-aware point cloud processing.
