PointMixup: Augmenting Point Clouds
- PointMixup is a data augmentation technique that optimally interpolates unordered point clouds using Earth Mover’s Distance for improved regularization.
- The method employs a mathematically principled assignment mechanism, often using the Hungarian algorithm or its approximations, to mix point sets effectively.
- Experimental results on datasets like ModelNet40 and ScanObjectNN demonstrate enhanced classification accuracy, increased robustness to perturbations, and better performance under data scarcity.
PointMixup is a data augmentation technique designed to address the challenge of interpolation-based regularization for unstructured point clouds. Whereas standard mixup paradigms leverage simple convex combinations in Euclidean spaces, PointMixup extends this notion to the domain of unordered point sets by introducing a mathematically principled interpolation and assignment mechanism under the Earth Mover’s Distance (EMD). This enables the creation of mixed training examples for point cloud classification models, achieving improved generalization and robustness, especially under data scarcity and geometric perturbations (Chen et al., 2020).
1. Mathematical Formalism of PointMixup
Let $S_1 = \{x_i\}_{i=1}^N$ and $S_2 = \{y_j\}_{j=1}^N$ denote two point clouds in $\mathbb{R}^3$, each of cardinality $N$. The metric space $(\mathcal{S}, d_{\mathrm{EMD}})$ is defined, where $d_{\mathrm{EMD}}$ is the Earth Mover's Distance:

$$d_{\mathrm{EMD}}(S_1, S_2) = \min_{\phi\,:\,S_1 \to S_2} \sum_{x \in S_1} \lVert x - \phi(x) \rVert_2,$$

with the minimum taken over bijections $\phi$. A linear interpolation along the shortest path in $(\mathcal{S}, d_{\mathrm{EMD}})$ is then defined:

$$S_\lambda = \{(1-\lambda)\,x + \lambda\,\phi^*(x) : x \in S_1\}, \qquad \lambda \in [0,1],$$

where $\phi^*$ is the optimal bijection realizing $d_{\mathrm{EMD}}(S_1, S_2)$.
Properties
- Shortest-path: $d_{\mathrm{EMD}}(S_1, S_\lambda) = \lambda\, d_{\mathrm{EMD}}(S_1, S_2)$ for any $\lambda \in [0,1]$.
- Assignment invariance: The same optimal bijection $\phi^*$ that aligns $S_1$ and $S_2$ is also optimal between $S_1$ and $S_\lambda$ (and between $S_\lambda$ and $S_2$).
- Linearity: $d_{\mathrm{EMD}}(S_{\lambda_1}, S_{\lambda_2}) = |\lambda_1 - \lambda_2|\, d_{\mathrm{EMD}}(S_1, S_2)$.
These results guarantee that the path constructed is a true geodesic (in EMD) and that mixed samples interpolate between the original clouds with controlled distances.
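The geodesic construction and its linearity property can be checked numerically. Below is a minimal sketch using `scipy.optimize.linear_sum_assignment` as the exact assignment solver; the function names (`emd_align`, `interpolate`) are illustrative, not taken from the original implementation:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def emd_align(S1, S2):
    """Optimal bijection phi* and the EMD value between equal-size clouds."""
    cost = np.linalg.norm(S1[:, None, :] - S2[None, :, :], axis=-1)  # (N, N) pairwise distances
    rows, cols = linear_sum_assignment(cost)   # exact O(N^3) assignment
    return cols, cost[rows, cols].sum()

def interpolate(S1, S2, lam):
    """Point on the EMD shortest path from S1 to S2."""
    phi, _ = emd_align(S1, S2)
    return (1.0 - lam) * S1 + lam * S2[phi]

rng = np.random.default_rng(0)
S1, S2 = rng.normal(size=(64, 3)), rng.normal(size=(64, 3))

# Linearity: d(S_0.2, S_0.7) should equal 0.5 * d(S_1, S_2).
_, d12 = emd_align(S1, S2)
_, dab = emd_align(interpolate(S1, S2, 0.2), interpolate(S1, S2, 0.7))
print(np.isclose(dab, 0.5 * d12))  # True
```

The shortest-path property can be checked the same way by comparing $d_{\mathrm{EMD}}(S_1, S_\lambda)$ against $\lambda\,d_{\mathrm{EMD}}(S_1, S_2)$.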
2. Algorithmic Implementation
The PointMixup algorithm for a single mixup pair operates as follows:
```
Input: point clouds S1 = {x_i}, S2 = {y_j}, labels c1, c2,
       mix-parameter λ ∈ [0, 1]

1. Compute φ* = argmin_φ Σ_i ||x_i − y_{φ(i)}||_2 over bijections φ
   (e.g., Hungarian algorithm, O(N³), or faster approximate solvers)
2. Form mixed cloud  S_mix = { (1−λ) x_i + λ y_{φ*(i)} }_{i=1}^N
3. Form mixed label  c_mix = (1−λ) c1 + λ c2
4. Return (S_mix, c_mix)
```
The computational bottleneck is the optimal assignment via the Hungarian algorithm with $O(N^3)$ complexity. In practice, approximate EMD solvers (such as Sinkhorn or auction algorithms) are adopted for efficiency.
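For large clouds, an entropic-regularized transport plan can stand in for the exact assignment, with a hard matching recovered by rounding. The following is a minimal numpy sketch of Sinkhorn iterations under assumed parameter values (`eps`, `iters`); the original work may use different solvers and settings, and the row-wise argmax rounding shown here does not guarantee a bijection in general:

```python
import numpy as np

def sinkhorn_assignment(S1, S2, eps=0.1, iters=500):
    """Approximate EMD assignment via entropic-regularized Sinkhorn iterations."""
    N = len(S1)
    C = np.linalg.norm(S1[:, None, :] - S2[None, :, :], axis=-1)  # pairwise cost
    K = np.exp(-C / eps)                  # Gibbs kernel
    a = np.full(N, 1.0 / N)               # uniform marginals
    u, v = np.ones(N), np.ones(N)
    for _ in range(iters):
        u = a / (K @ v)                   # scale rows toward marginal a
        v = a / (K.T @ u)                 # scale columns toward marginal a
    P = u[:, None] * K * v[None, :]       # approximate transport plan
    # Round the soft plan to a hard matching; greedy or auction-style
    # rounding is needed in general to guarantee a bijection.
    return P.argmax(axis=1)

# demo: recover a known shuffle of well-separated points
S1 = np.array([[0., 0, 0], [3, 0, 0], [0, 3, 0], [0, 0, 3]])
S2 = S1[[2, 0, 3, 1]]                     # S2 is a shuffled copy of S1
phi = sinkhorn_assignment(S1, S2)
print(np.array_equal(S2[phi], S1))        # the matching undoes the shuffle
```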
3. Integration With Mixup and Manifold Mixup Paradigms
3.1 Input Mixup
PointMixup at the input level is performed by randomly sampling two point clouds and a mixing parameter $\lambda$ (from a Beta distribution). The interpolation constructs $S_{\mathrm{mix}}$ and $c_{\mathrm{mix}}$ as in the algorithm above. The standard training objective becomes

$$\mathcal{L} = \ell\big(h(S_{\mathrm{mix}}),\, c_{\mathrm{mix}}\big),$$

with $h$ denoting the point cloud network and $\ell$ the cross-entropy loss.
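Because the mixed label is a convex combination of one-hot labels, cross-entropy against it decomposes into the familiar mixup form $(1-\lambda)\,\ell(h(S_{\mathrm{mix}}), c_1) + \lambda\,\ell(h(S_{\mathrm{mix}}), c_2)$. A small numpy sketch of this equivalence, with random logits standing in for the network output:

```python
import numpy as np

def log_softmax(z):
    z = z - z.max()                      # stabilize before exponentiating
    return z - np.log(np.exp(z).sum())

def soft_cross_entropy(logits, target):
    """Cross-entropy of predicted logits against a soft target distribution."""
    return -(target * log_softmax(logits)).sum()

lam = 0.3
c1, c2 = np.eye(10)[2], np.eye(10)[7]    # one-hot labels of the two clouds
c_mix = (1 - lam) * c1 + lam * c2        # mixed soft label
logits = np.random.default_rng(0).normal(size=10)  # stand-in for h(S_mix)

lhs = soft_cross_entropy(logits, c_mix)
rhs = (1 - lam) * soft_cross_entropy(logits, c1) + lam * soft_cross_entropy(logits, c2)
print(np.isclose(lhs, rhs))  # True: the soft-label loss is the mixup combination
```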
3.2 Manifold PointMixup
In manifold PointMixup, interpolation is applied in the latent feature space at a randomly chosen network layer (including the input). The latent representation of each point consists of a geometric component $g_i$ and a feature component $f_i$. The optimal assignment $\phi^*$ is computed using the coordinates $g$ alone; both coordinates and features are then interpolated:

$$g_{\mathrm{mix},i} = (1-\lambda)\,g_i + \lambda\,g_{\phi^*(i)}, \qquad f_{\mathrm{mix},i} = (1-\lambda)\,f_i + \lambda\,f_{\phi^*(i)}.$$

Training then proceeds forward from that layer using the mixed representations.
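The manifold-level step can be sketched as follows, assuming the layer exposes per-point coordinates `g` and features `f` (the function name and toy shapes are illustrative, not from the original code):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def manifold_point_mixup(g1, f1, g2, f2, lam):
    """Mix two latent point sets at one layer: assignment is computed on the
    coordinates only, then applied to both coordinates and features."""
    cost = np.linalg.norm(g1[:, None, :] - g2[None, :, :], axis=-1)
    _, phi = linear_sum_assignment(cost)       # optimal assignment on coords
    g_mix = (1 - lam) * g1 + lam * g2[phi]     # interpolate coordinates
    f_mix = (1 - lam) * f1 + lam * f2[phi]     # interpolate features with same phi
    return g_mix, f_mix

# toy latent sets: 32 points with 3-D coords and 64-D features
rng = np.random.default_rng(1)
g1, g2 = rng.normal(size=(32, 3)), rng.normal(size=(32, 3))
f1, f2 = rng.normal(size=(32, 64)), rng.normal(size=(32, 64))
g_mix, f_mix = manifold_point_mixup(g1, f1, g2, f2, lam=0.5)
```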
4. Experimental Evaluation
4.1 Datasets and Architectures
Experiments are conducted on ModelNet40 (CAD models: 9,843 train / 2,468 test, 40 classes; both pre-aligned and unaligned splits, plus an additional 20% subset for the data-scarcity setting) and ScanObjectNN (real-world scans, OBJ_ONLY and PB_T50_RS variants) using PointNet, PointNet++, and DGCNN. Each input cloud consists of 1,024 points with small random jitter, normalized to the unit sphere.
4.2 Training Setup
Models are trained with Adam (learning rate 1e-3, batch size 16, 300 epochs). The mixup parameter $\lambda$ is sampled on the fly from a Beta distribution whose shape parameter ranges over 0.4–2.0. Manifold mixup is applied at the input and at two set abstraction layers (for PointNet++).
4.3 Classification Results
Key performance metrics (PointNet++, ModelNet40):
| Split/Setting | w/o Mixup | PM Alone | PM+Manifold |
|---|---|---|---|
| Full-set, unaligned | 90.7% | 91.3% | 91.7% |
| Full-set, pre-aligned | 91.9% | - | 92.7% |
| 20% subset, pre-aligned | 86.1% | - | 88.6% |
On ScanObjectNN (standard), accuracy improves from 86.6% (baseline) to 88.5% (PM+manifold). For PointNet, accuracy rises from 89.2% to 89.9% with PM; for DGCNN, from 92.7% to 93.1% (PM+manifold).
4.4 Robustness and Data Efficiency
- Under added Gaussian jitter noise, the baseline achieves 35.1% accuracy, PM 51.5%, and PM+manifold 56.5%.
- Under random rotations (up to 30°), rescaling, and 20% point drop, PM+manifold consistently outperforms the other methods.
- In semi-supervised settings (ModelNet40, 400/600/800 labels), PM boosts accuracy from 69.4/72.6/73.5% (supervised) to 76.7/80.8/82.0%.
- In few-shot learning (ProtoNet, 10 unseen classes), PM improves 1-shot accuracy from 72.3% to 77.2% and 5-shot accuracy from 84.2% to 85.9%.
5. Benefits, Limitations, and Open Questions
Benefits
- Model-agnostic: Applicable to any point-based network architecture.
- Strong regularization: Demonstrated improvement in generalization, particularly when data is scarce.
- Robustness: Increased resilience to noise, geometric transformations (rotation, scaling), and missing data.
- Natural compatibility with semi-supervised and few-shot learning paradigms due to interpolation-based regularization.
Limitations
- Exact assignment (Hungarian algorithm) scales cubically in point count; large clouds require approximate EMD solvers.
- Point clouds must have equal cardinality (workarounds are possible but non-principled).
- For highly dissimilar shapes, interpolated samples may exhibit implausible geometry.
- Potential extensions leveraging local structural priors (e.g., normals, graph Laplacians) remain unexplored.
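Regarding the equal-cardinality limitation above, a common (if non-principled) workaround is to resample one cloud to the size of the other before matching. A minimal sketch; the subsample/duplicate strategy here is an assumption for illustration, not part of the original method:

```python
import numpy as np

def resample_to(S, n, rng):
    """Resample a point cloud to exactly n points: subsample without
    replacement when it is too large, duplicate random points when too small."""
    m = len(S)
    if m >= n:
        idx = rng.choice(m, size=n, replace=False)
    else:
        idx = np.concatenate([np.arange(m), rng.choice(m, size=n - m, replace=True)])
    return S[idx]

rng = np.random.default_rng(0)
small = rng.normal(size=(700, 3))
print(resample_to(small, 1024, rng).shape)  # (1024, 3)
```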
6. Context and Significance
PointMixup establishes the extension of mixup-style augmentations to unstructured, permutation-invariant point sets, addressing the absence of one-to-one point correspondences typical in Euclidean domains. Its formulation leverages an optimal transport perspective, imbuing the resultant interpolations with provable linearity and isometry in EMD. Empirical results indicate systematic gains across diverse point cloud learning scenarios—spanning input-level and manifold-level mixup, different network architectures, and challenging generalization regimes. This provides a foundation for mixup-style SSL and meta-learning directly in the point cloud domain (Chen et al., 2020).