PointMixup: Augmenting Point Clouds
- PointMixup is a data augmentation technique that optimally interpolates unordered point clouds using Earth Mover’s Distance for improved regularization.
- The method employs a mathematically principled assignment mechanism, often using the Hungarian algorithm or its approximations, to mix point sets effectively.
- Experimental results on datasets like ModelNet40 and ScanObjectNN demonstrate enhanced classification accuracy, increased robustness to perturbations, and better performance under data scarcity.
PointMixup is a data augmentation technique designed to address the challenge of interpolation-based regularization for unstructured point clouds. Whereas standard mixup paradigms leverage simple convex combinations in Euclidean spaces, PointMixup extends this notion to the domain of unordered point sets by introducing a mathematically principled interpolation and assignment mechanism under the Earth Mover’s Distance (EMD). This enables the creation of mixed training examples for point cloud classification models, achieving improved generalization and robustness, especially under data scarcity and geometric perturbations (Chen et al., 2020).
1. Mathematical Formalism of PointMixup
Let $S_1 = \{x_i\}_{i=1}^N$ and $S_2 = \{y_j\}_{j=1}^N$ denote two point clouds in $\mathbb{R}^3$, each of cardinality $N$. The metric space $(\mathcal{S}, d_{\mathrm{EMD}})$ is defined, where $d_{\mathrm{EMD}}$ is the Earth Mover's Distance:

$$d_{\mathrm{EMD}}(S_1, S_2) = \min_{\phi\,:\,S_1 \to S_2} \sum_{x \in S_1} \lVert x - \phi(x) \rVert_2,$$

with the minimum taken over bijections $\phi$. A linear interpolation along the shortest path in $(\mathcal{S}, d_{\mathrm{EMD}})$ is then defined:

$$S_\lambda = \{(1-\lambda)\,x + \lambda\,\phi^*(x) : x \in S_1\}, \qquad \lambda \in [0,1],$$

where $\phi^*$ is the optimal bijection realizing $d_{\mathrm{EMD}}(S_1, S_2)$.
Properties
- Shortest-path: $d_{\mathrm{EMD}}(S_1, S_\lambda) = \lambda\, d_{\mathrm{EMD}}(S_1, S_2)$ for any $\lambda \in [0,1]$.
- Assignment invariance: The same optimal bijection $\phi^*$ that aligns $S_1$ and $S_2$ is also optimal between $S_1$ and $S_\lambda$ (and between $S_\lambda$ and $S_2$).
- Linearity: $d_{\mathrm{EMD}}(S_{\lambda_1}, S_{\lambda_2}) = |\lambda_1 - \lambda_2|\, d_{\mathrm{EMD}}(S_1, S_2)$.
These results guarantee that the path constructed is a true geodesic (in EMD) and that mixed samples interpolate between the original clouds with controlled distances.
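The geodesic construction and its linearity property can be checked numerically. Below is a minimal sketch using `scipy.optimize.linear_sum_assignment` as the exact assignment solver; the function names (`emd_align`, `interpolate`) are illustrative, not taken from the original implementation:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def emd_align(S1, S2):
    """Optimal bijection phi* and the EMD value between equal-size clouds."""
    cost = np.linalg.norm(S1[:, None, :] - S2[None, :, :], axis=-1)  # (N, N) pairwise distances
    rows, cols = linear_sum_assignment(cost)   # exact O(N^3) assignment
    return cols, cost[rows, cols].sum()

def interpolate(S1, S2, lam):
    """Point on the EMD shortest path from S1 to S2."""
    phi, _ = emd_align(S1, S2)
    return (1.0 - lam) * S1 + lam * S2[phi]

rng = np.random.default_rng(0)
S1, S2 = rng.normal(size=(64, 3)), rng.normal(size=(64, 3))

# Linearity: d(S_0.2, S_0.7) should equal 0.5 * d(S_1, S_2).
_, d12 = emd_align(S1, S2)
_, dab = emd_align(interpolate(S1, S2, 0.2), interpolate(S1, S2, 0.7))
print(np.isclose(dab, 0.5 * d12))  # True
```

The shortest-path property can be checked the same way by comparing $d_{\mathrm{EMD}}(S_1, S_\lambda)$ against $\lambda\,d_{\mathrm{EMD}}(S_1, S_2)$.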
2. Algorithmic Implementation
The PointMixup algorithm for a single mixup pair operates as follows:
```
Input: point clouds S1 = {x_i}, S2 = {y_j}, labels c1, c2,
       mix-parameter λ ∈ [0, 1]

1. Compute φ* = argmin_φ Σ_i ||x_i − y_{φ(i)}||_2 over bijections φ
   (e.g., Hungarian algorithm, O(N³), or faster approximate solvers)
2. Form mixed cloud  S_mix = { (1−λ) x_i + λ y_{φ*(i)} }_{i=1}^N
3. Form mixed label  c_mix = (1−λ) c1 + λ c2
4. Return (S_mix, c_mix)
```
The computational bottleneck is the optimal assignment via the Hungarian algorithm with $O(N^3)$ complexity. In practice, approximate EMD solvers (such as Sinkhorn or auction algorithms) are adopted for efficiency.
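For large clouds, an entropic-regularized transport plan can stand in for the exact assignment, with a hard matching recovered by rounding. The following is a minimal numpy sketch of Sinkhorn iterations under assumed parameter values (`eps`, `iters`); the original work may use different solvers and settings, and the row-wise argmax rounding shown here does not guarantee a bijection in general:

```python
import numpy as np

def sinkhorn_assignment(S1, S2, eps=0.1, iters=500):
    """Approximate EMD assignment via entropic-regularized Sinkhorn iterations."""
    N = len(S1)
    C = np.linalg.norm(S1[:, None, :] - S2[None, :, :], axis=-1)  # pairwise cost
    K = np.exp(-C / eps)                  # Gibbs kernel
    a = np.full(N, 1.0 / N)               # uniform marginals
    u, v = np.ones(N), np.ones(N)
    for _ in range(iters):
        u = a / (K @ v)                   # scale rows toward marginal a
        v = a / (K.T @ u)                 # scale columns toward marginal a
    P = u[:, None] * K * v[None, :]       # approximate transport plan
    # Round the soft plan to a hard matching; greedy or auction-style
    # rounding is needed in general to guarantee a bijection.
    return P.argmax(axis=1)

# demo: recover a known shuffle of well-separated points
S1 = np.array([[0., 0, 0], [3, 0, 0], [0, 3, 0], [0, 0, 3]])
S2 = S1[[2, 0, 3, 1]]                     # S2 is a shuffled copy of S1
phi = sinkhorn_assignment(S1, S2)
print(np.array_equal(S2[phi], S1))        # the matching undoes the shuffle
```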
3. Integration With Mixup and Manifold Mixup Paradigms
3.1 Input Mixup
PointMixup at the input level is performed by randomly sampling two point clouds and a mixing parameter $\lambda$ (from a Beta distribution). The interpolation constructs $S_{\mathrm{mix}}$ and $c_{\mathrm{mix}}$ as in the algorithm above. The standard training objective becomes

$$\mathcal{L} = \ell\big(h(S_{\mathrm{mix}}),\, c_{\mathrm{mix}}\big),$$

with $h$ denoting the point cloud network and $\ell$ the cross-entropy loss.
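Because the mixed label is a convex combination of one-hot labels, cross-entropy against it decomposes into the familiar mixup form $(1-\lambda)\,\ell(h(S_{\mathrm{mix}}), c_1) + \lambda\,\ell(h(S_{\mathrm{mix}}), c_2)$. A small numpy sketch of this equivalence, with random logits standing in for the network output:

```python
import numpy as np

def log_softmax(z):
    z = z - z.max()                      # stabilize before exponentiating
    return z - np.log(np.exp(z).sum())

def soft_cross_entropy(logits, target):
    """Cross-entropy of predicted logits against a soft target distribution."""
    return -(target * log_softmax(logits)).sum()

lam = 0.3
c1, c2 = np.eye(10)[2], np.eye(10)[7]    # one-hot labels of the two clouds
c_mix = (1 - lam) * c1 + lam * c2        # mixed soft label
logits = np.random.default_rng(0).normal(size=10)  # stand-in for h(S_mix)

lhs = soft_cross_entropy(logits, c_mix)
rhs = (1 - lam) * soft_cross_entropy(logits, c1) + lam * soft_cross_entropy(logits, c2)
print(np.isclose(lhs, rhs))  # True: the soft-label loss is the mixup combination
```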
3.2 Manifold PointMixup
In manifold PointMixup, interpolation is applied in the latent feature space at a randomly chosen network layer (including the input). The latent representation of each point consists of a geometric component $g_i$ and a feature component $f_i$. The optimal assignment $\phi^*$ is computed using the coordinates $g$ alone; both coordinates and features are then interpolated:

$$g_{\mathrm{mix},i} = (1-\lambda)\,g_i + \lambda\,g_{\phi^*(i)}, \qquad f_{\mathrm{mix},i} = (1-\lambda)\,f_i + \lambda\,f_{\phi^*(i)}.$$

Training then proceeds forward from that layer using the mixed representations.
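The manifold-level step can be sketched as follows, assuming the layer exposes per-point coordinates `g` and features `f` (the function name and toy shapes are illustrative, not from the original code):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def manifold_point_mixup(g1, f1, g2, f2, lam):
    """Mix two latent point sets at one layer: assignment is computed on the
    coordinates only, then applied to both coordinates and features."""
    cost = np.linalg.norm(g1[:, None, :] - g2[None, :, :], axis=-1)
    _, phi = linear_sum_assignment(cost)       # optimal assignment on coords
    g_mix = (1 - lam) * g1 + lam * g2[phi]     # interpolate coordinates
    f_mix = (1 - lam) * f1 + lam * f2[phi]     # interpolate features with same phi
    return g_mix, f_mix

# toy latent sets: 32 points with 3-D coords and 64-D features
rng = np.random.default_rng(1)
g1, g2 = rng.normal(size=(32, 3)), rng.normal(size=(32, 3))
f1, f2 = rng.normal(size=(32, 64)), rng.normal(size=(32, 64))
g_mix, f_mix = manifold_point_mixup(g1, f1, g2, f2, lam=0.5)
```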
4. Experimental Evaluation
4.1 Datasets and Architectures
Experiments are conducted on ModelNet40 (CAD models: 9,843 train / 2,468 test, 40 classes; both pre-aligned and unaligned splits, plus an additional 20% subset for the data-scarcity setting) and ScanObjectNN (real-world scans, OBJ_ONLY and PB_T50_RS variants) using PointNet, PointNet++, and DGCNN. Each input cloud consists of 1,024 points with small random jitter, normalized to the unit sphere.
4.2 Training Setup
Models are trained with Adam (learning rate 1e-3, batch size 16, 300 epochs). The mixup parameter $\lambda$ is sampled on the fly from a Beta distribution whose shape parameter ranges over 0.4–2.0. Manifold mixup is applied at the input and at two set abstraction layers (for PointNet++).
4.3 Classification Results
Key performance metrics (PointNet++, ModelNet40):
| Split/Setting | w/o Mixup | PM Alone | PM+Manifold |
|---|---|---|---|
| Full-set, unaligned | 90.7% | 91.3% | 91.7% |
| Full-set, pre-aligned | 91.9% | - | 92.7% |
| 20% subset, pre-aligned | 86.1% | - | 88.6% |
On ScanObjectNN (standard), accuracy improves from 86.6% (baseline) to 88.5% (PM+manifold). For PointNet, accuracy rises from 89.2% to 89.9% with PM; for DGCNN, from 92.7% to 93.1% (PM+manifold).
4.4 Robustness and Data Efficiency
- Under added Gaussian jitter noise, the baseline achieves 35.1% accuracy, PM 51.5%, and PM+manifold 56.5%.
- Under random rotations (up to 30°), rescaling, and 20% point drop, PM+manifold consistently outperforms the other methods.
- In semi-supervised settings (ModelNet40, 400/600/800 labels), PM boosts accuracy from 69.4/72.6/73.5% (supervised) to 76.7/80.8/82.0%.
- In few-shot learning (ProtoNet, 10 unseen classes), PM improves 1-shot accuracy from 72.3% to 77.2% and 5-shot accuracy from 84.2% to 85.9%.
5. Benefits, Limitations, and Open Questions
Benefits
- Model-agnostic: Applicable to any point-based network architecture.
- Strong regularization: Demonstrated improvement in generalization, particularly when data is scarce.
- Robustness: Increased resilience to noise, geometric transformations (rotation, scaling), and missing data.
- Natural compatibility with semi-supervised and few-shot learning paradigms due to interpolation-based regularization.
Limitations
- Exact assignment (Hungarian algorithm) scales cubically in point count; large clouds require approximate EMD solvers.
- Point clouds must have equal cardinality (workarounds are possible but non-principled).
- For highly dissimilar shapes, interpolated samples may exhibit implausible geometry.
- Potential extensions leveraging local structural priors (e.g., normals, graph Laplacians) remain unexplored.
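Regarding the equal-cardinality limitation above, a common (if non-principled) workaround is to resample one cloud to the size of the other before matching. A minimal sketch; the subsample/duplicate strategy here is an assumption for illustration, not part of the original method:

```python
import numpy as np

def resample_to(S, n, rng):
    """Resample a point cloud to exactly n points: subsample without
    replacement when it is too large, duplicate random points when too small."""
    m = len(S)
    if m >= n:
        idx = rng.choice(m, size=n, replace=False)
    else:
        idx = np.concatenate([np.arange(m), rng.choice(m, size=n - m, replace=True)])
    return S[idx]

rng = np.random.default_rng(0)
small = rng.normal(size=(700, 3))
print(resample_to(small, 1024, rng).shape)  # (1024, 3)
```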
6. Context and Significance
PointMixup establishes the extension of mixup-style augmentations to unstructured, permutation-invariant point sets, addressing the absence of one-to-one point correspondences typical in Euclidean domains. Its formulation leverages an optimal transport perspective, imbuing the resultant interpolations with provable linearity and isometry in EMD. Empirical results indicate systematic gains across diverse point cloud learning scenarios—spanning input-level and manifold-level mixup, different network architectures, and challenging generalization regimes. This provides a foundation for mixup-style SSL and meta-learning directly in the point cloud domain (Chen et al., 2020).