- The paper presents CLAMP, which reformulates self-supervised learning as a neural manifold packing problem using a novel repulsive loss function.
- The methodology employs ResNet architectures with augmentation-based sub-manifolds and physics-inspired dynamic adjustments to achieve optimal class separation.
- CLAMP demonstrates competitive image classification performance in both linear and semi-supervised settings, aligning its embeddings with biological neural coding principles.
Contrastive Self-Supervised Learning as Neural Manifold Packing
The paper "Contrastive Self-Supervised Learning As Neural Manifold Packing" introduces a novel self-supervised learning framework named Contrastive Learning As Manifold Packing (CLAMP). CLAMP aims to recast the task of representation learning in neural networks as a manifold packing problem, inspired by the geometric arrangements of neural manifolds observed in the visual cortex.
Introduction and Background
Contrastive self-supervised learning (SSL) has made significant advances in image representation learning by optimizing pairwise embedding loss functions. Yet even as SSL methods approach or surpass supervised baselines, the geometric structure of their embedding spaces remains underexplored. CLAMP addresses this gap by treating the embedding space as a collection of neural manifolds, one per class, and seeking their separability in analogy with particle packing problems.
The main contributions of the paper include the development of a new loss function based on repulsive particle systems, achieving competitive image classification accuracy, and drawing parallels between SSL dynamics and interacting particle systems.
Methodology
CLAMP formulates representation learning as an optimal packing of neural manifolds. The embeddings of the pooled augmented views of a single image form a sub-manifold; during training, these sub-manifolds are dynamically resized and pushed apart to reduce overlap, driving separation between classes.
Figure 1: Sub-manifold and visualization of the embedding space.
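As a concrete illustration of the sub-manifold view (a sketch, not the paper's exact parameterization), the snippet below pools the embeddings of several augmented views of one image and summarizes the resulting sub-manifold by its centroid and radius of gyration, a simple proxy for manifold "size" in embedding space:

```python
import numpy as np

def submanifold_stats(view_embeddings):
    """Summarize the sub-manifold formed by the embeddings of the
    augmented views of one image: its centroid and radius of gyration.

    view_embeddings: (n_views, dim) array of embedding vectors.
    """
    center = view_embeddings.mean(axis=0)            # manifold centroid
    deviations = view_embeddings - center
    # Radius of gyration: RMS distance of views from the centroid.
    radius = np.sqrt((deviations ** 2).sum(axis=1).mean())
    return center, radius

rng = np.random.default_rng(0)
# Toy example: 8 augmented views of one image, 16-dim embeddings.
views = rng.normal(loc=1.0, scale=0.1, size=(8, 16))
center, radius = submanifold_stats(views)
print(center.shape, round(float(radius), 3))
```

Summaries like these give each sub-manifold a center and a size, which is all a particle-style packing objective needs.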
Loss Function
The proposed loss function draws on the physics of particle systems, using short-range repulsive potentials to dynamically adjust manifold sizes and optimize their separation. This keeps augmented views of the same image close together while preventing representational collapse.
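The physics analogy can be sketched with a standard short-range harmonic (soft-sphere) repulsion between manifold centers: two manifolds pay an energy cost only when their separation falls below the sum of their radii. The potential form and the radius convention below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def repulsive_packing_loss(centers, radii):
    """Soft-sphere repulsive energy between sub-manifold centers.

    centers: (n, dim) array of sub-manifold centroids.
    radii:   (n,) array of sub-manifold sizes.
    A pair overlaps when its distance is below the sum of its radii;
    overlapping pairs contribute a harmonic penalty (1 - d/sigma)^2,
    non-overlapping pairs contribute nothing (short-range repulsion).
    """
    n = centers.shape[0]
    loss = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(centers[i] - centers[j])
            sigma = radii[i] + radii[j]      # contact distance
            if d < sigma:                    # only overlapping pairs repel
                loss += (1.0 - d / sigma) ** 2
    return loss

centers = np.array([[0.0, 0.0], [0.5, 0.0], [3.0, 0.0]])
radii = np.array([1.0, 1.0, 1.0])
# Only the first two manifolds overlap (d = 0.5 < 2.0).
print(round(repulsive_packing_loss(centers, radii), 4))  # 0.5625
```

Because the potential vanishes beyond contact, gradients act only on overlapping pairs, which is what pushes manifolds apart without shrinking the embedding space globally.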
Implementation Details
The framework utilizes ResNet architectures with MLP projection heads, employing standard practices such as batch normalization and distributed parallel training. Image augmentations are performed to increase variability, crucial for manifold packing.
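As a schematic of the projection-head idea (a generic SSL component; layer widths here are illustrative, not CLAMP's exact architecture), the sketch below applies a two-layer MLP with a batch-normalization step to pooled backbone features:

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature over the batch (no learned affine here).
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def projection_head(features, w1, w2):
    """Two-layer MLP projection head: linear -> batch norm -> ReLU -> linear."""
    h = features @ w1
    h = np.maximum(batch_norm(h), 0.0)       # BN then ReLU
    return h @ w2

rng = np.random.default_rng(1)
features = rng.normal(size=(32, 512))        # e.g. pooled ResNet-18 features
w1 = rng.normal(scale=0.05, size=(512, 256))
w2 = rng.normal(scale=0.05, size=(256, 128))
z = projection_head(features, w1, w2)
print(z.shape)
```

The packing loss is computed on the projected embeddings `z`, while downstream evaluation typically uses the backbone features directly.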
Evaluation
Linear and Semi-Supervised Learning
CLAMP achieves competitive results on linear evaluation benchmarks against existing SSL methods, including new state-of-the-art accuracy on ImageNet-100.
Figure 2: Linear evaluation accuracy as a function of the size scale factor r_s.
The model also performs robustly in semi-supervised settings with limited labels, further supporting its practical effectiveness.
Training Dynamics and Visualization
The training dynamics reveal emergent patterns reminiscent of physical systems, with structured manifolds developing progressively to enhance class separability.
Properties of Representations
The representation space learned under CLAMP aligns closely with biological observations, such as eigenspectra following power-law decay, indicative of neural coding differentiability.
Figure 3: The properties of sub-manifolds in the embedding space for the pretrained ResNet-18 network.
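The power-law claim can be checked in a few lines: diagonalize the covariance of the embeddings and fit the slope of log eigenvalue versus log rank. The sketch below uses synthetic data whose spectrum decays as 1/n by construction, so the fitted exponent comes out near 1, the value reported for visual cortex; real CLAMP embeddings would be substituted for the synthetic matrix:

```python
import numpy as np

def eigenspectrum_exponent(embeddings, n_ranks=50):
    """Fit the power-law exponent alpha of the covariance eigenspectrum,
    lambda_n ~ n^(-alpha), via least squares in log-log coordinates."""
    centered = embeddings - embeddings.mean(axis=0)
    cov = centered.T @ centered / (len(embeddings) - 1)
    eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1][:n_ranks]
    ranks = np.arange(1, len(eigvals) + 1)
    slope, _ = np.polyfit(np.log(ranks), np.log(eigvals), 1)
    return -slope  # alpha

rng = np.random.default_rng(2)
# Synthetic embeddings whose dimension-n variance scales as 1/n,
# giving an eigenspectrum lambda_n ~ 1/n (alpha = 1).
dims = 100
scales = 1.0 / np.sqrt(np.arange(1, dims + 1))
embeddings = rng.normal(size=(5000, dims)) * scales
alpha = eigenspectrum_exponent(embeddings)
print(round(float(alpha), 2))
```

An exponent close to 1 balances expressive, high-dimensional codes against smoothness, which is the sense in which the learned representations mirror biological neural coding.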
Conclusion and Future Directions
CLAMP offers an efficient self-supervised learning approach grounded in manifold packing principles, demonstrating competitive classification performance and close alignment with biological neural representations. Future work should explore its theoretical underpinnings and extend its biological plausibility by investigating alternative learning rules and the orientation of manifolds in the embedding space. Integrating these insights may further improve the efficacy and interpretability of neural network models.
The findings in CLAMP pave the way for cross-disciplinary research, bridging insights from physics, neuroscience, and machine learning, with implications for both understanding brain function and designing advanced AI systems.