
Lie Group-Based Neural Networks

Updated 3 February 2026
  • Lie group-based neural networks are architectures that integrate continuous symmetries from Lie groups into layer design to ensure equivariance or invariance to geometric transformations.
  • They leverage Lie algebra structures, exponential mappings, and group convolution techniques to craft efficient models with reduced parameters and enhanced generalization on tasks like rotated-MNIST and CIFAR-10.
  • These networks have practical applications in vision, physics, control, and quantum systems by delivering predictable outputs under rotations, translations, and scaling without extensive data augmentation.

Lie group-based neural networks are a principled class of architectures in which the symmetries of a Lie group $G$ are built directly into the network design, inducing equivariance or invariance to geometric transformations such as rotation, translation, scaling, or more general group actions. These methods leverage the smooth manifold and group structure of $G$, as well as its associated Lie algebra, to define neural layers, kernels, and nonlinearities with guaranteed analytic properties, resulting in models that are more data-efficient, generalize better across symmetry-related inputs, and are robust to group transformations. Their formal mathematical underpinnings, algorithmic constructions, and practical benefits span a wide range of domains, from vision and physics to generative modeling and control.

1. Mathematical Foundations and Core Definitions

A Lie group $G$ is both a smooth $n$-dimensional manifold and a group, such that the group operation $(g_1, g_2) \mapsto g_1 g_2$ and inversion $g \mapsto g^{-1}$ are smooth maps. Common examples are $(\mathbb{R}^n, +)$, the rotation group $SO(n)$, the special Euclidean group $SE(n) = \mathbb{R}^n \rtimes SO(n)$, and the similarity group $Sim(n)$ (Smets, 2024).

The Lie algebra $\mathfrak{g} = T_e G$ is the tangent space at the identity, equipped with a Lie bracket $[X, Y]$ capturing the group's infinitesimal structure. The exponential map $\exp: \mathfrak{g} \to G$ and its (local) inverse $\log: G \to \mathfrak{g}$ connect the local linear structure of $\mathfrak{g}$ to the global non-abelian geometry of $G$. Homogeneous spaces $M = G/H$ are coset spaces on which $G$ acts transitively.
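As a concrete illustration (not drawn from the cited papers), the exponential map on $\mathfrak{so}(2)$ can be evaluated by truncating its power series; for the generator $J$ with $J_{01} = -1$, $J_{10} = 1$ it reproduces the familiar rotation matrix. A minimal pure-Python sketch:

```python
import math

def expm_so2(theta, terms=30):
    """Approximate exp(theta * J) for the so(2) generator
    J = [[0, -1], [1, 0]] by a truncated power series."""
    A = [[0.0, -theta], [theta, 0.0]]
    result = [[1.0, 0.0], [0.0, 1.0]]  # identity = A^0 / 0!
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        # term <- term @ A / k, accumulating A^k / k!
        term = [[sum(term[i][m] * A[m][j] for m in range(2)) / k
                 for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

R = expm_so2(math.pi / 3)
# Closed form: exp(theta * J) is the rotation matrix by angle theta.
expected = [[math.cos(math.pi / 3), -math.sin(math.pi / 3)],
            [math.sin(math.pi / 3),  math.cos(math.pi / 3)]]
```

For higher-dimensional algebras such as $\mathfrak{so}(3)$, closed forms (Rodrigues' formula) or a library matrix exponential are typically used instead of a raw series.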

A group action $\rho: G \times M \to M$ satisfies $\rho(e, p) = p$ and $\rho(g_2, \rho(g_1, p)) = \rho(g_2 g_1, p)$. For functions $f: M \to \mathbb{R}$, the induced action is $(g \cdot f)(p) = f(g^{-1} \cdot p)$. A map $F: M \to N$ is $G$-equivariant if $F(g \cdot p) = g \cdot F(p)$ for all $g \in G$ (Smets, 2024).
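These definitions are easy to check numerically. The sketch below (illustrative only; the function names are ours) verifies that the Euclidean norm is $SO(2)$-invariant and that scalar multiplication is $SO(2)$-equivariant:

```python
import math

def rotate(theta, p):
    """Action of g = R(theta) in SO(2) on a point p in R^2."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def norm(p):
    """A G-invariant map: norm(g . p) = norm(p)."""
    return math.hypot(p[0], p[1])

def scale2(p):
    """A G-equivariant map: scale2(g . p) = g . scale2(p)."""
    return (2.0 * p[0], 2.0 * p[1])

theta, p = 0.8, (0.6, -1.3)
```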

2. Equivariant Architectures and Group Convolutions

2.1 Group Convolution

On $G$ itself, the left Haar measure $dg$ is invariant, and the group convolution (or correlation) of $f, k: G \to \mathbb{R}$ is

$$(k \star_G f)(h) = \int_G k(g^{-1} h)\, f(g)\, dg.$$

For homogeneous spaces $M = G/H$, a lift-and-project approach is used, integrating over $G$-covariant measures (Smets, 2024). In practice, these integrals are evaluated over discrete or sampled group elements.
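On a discretized group the integral becomes a sum. For the cyclic group $\mathbb{Z}_n$, the correlation above and its equivariance can be sketched in a few lines of Python (a toy illustration, not from the cited papers):

```python
def group_corr(k, f):
    """Group correlation on Z_n, matching (k * f)(h) = sum_g k(g^{-1} h) f(g),
    with g^{-1} h = (h - g) mod n in additive notation."""
    n = len(f)
    return [sum(k[(h - g) % n] * f[g] for g in range(n)) for h in range(n)]

def shift(f, s):
    """Left action of s in Z_n on a signal: (s . f)(h) = f(h - s)."""
    n = len(f)
    return [f[(h - s) % n] for h in range(n)]

f = [1.0, 2.0, 0.0, -1.0]
k = [0.5, 0.25, 0.0, 0.25]
# Equivariance: correlating a shifted signal equals shifting the result.
lhs = group_corr(k, shift(f, 2))
rhs = shift(group_corr(k, f), 2)
```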

2.2 Layer Design: Lifting, Equivariant Layers, Pooling

Modern Lie group-based architectures operate via:

  • Lifting layers: mapping functions on $M$ to $G$ by convolving with learnable "mother" kernels, e.g. $f^{(1)}(g) = \int_M \kappa(g^{-1} \cdot x)\, f(x)\, dx$.
  • Equivariant group-convolutional layers: stacking group convolutions with pointwise nonlinearities (e.g., ReLU, which commutes with group action), channel mixing, and weight-sharing enforced across group elements.
  • Projection or pooling: integrating out group variables, typically via max or average pooling along $H$-fibers, to return to $M$ while preserving equivariance (Smets, 2024).
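The lifting layer can be made concrete for the four 90-degree rotations ($C_4$ acting on $\mathbb{Z}^2$): correlate the input with each rotated copy of a single kernel, producing one orientation channel per group element. A hypothetical pure-Python sketch:

```python
def rot90(k):
    """Rotate a square array 90 degrees counter-clockwise."""
    n = len(k)
    return [[k[j][n - 1 - i] for j in range(n)] for i in range(n)]

def corr2d(img, k):
    """'Valid' 2-D cross-correlation."""
    m = len(k)
    out = len(img) - m + 1
    return [[sum(img[i + a][j + b] * k[a][b]
                 for a in range(m) for b in range(m))
             for j in range(out)] for i in range(out)]

def lift_p4(img, k):
    """Lifting layer for C4: one feature map per rotated kernel copy."""
    maps = []
    for _ in range(4):
        maps.append(corr2d(img, k))
        k = rot90(k)
    return maps

img = [[1, 2, 0, 3],
       [0, 1, 4, 1],
       [2, 0, 1, 0],
       [1, 3, 0, 2]]
ker = [[1, 0],
       [0, -1]]
stack = lift_p4(img, ker)
stack_rot = lift_p4(rot90(img), ker)
```

Rotating the input rotates each feature map and cyclically permutes the orientation channels, which is exactly the lifted group action on $G$.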

2.3 Representative Examples

  • Spherical CNNs: $G = SO(3)$ acting on $S^2$, implemented via spherical harmonics, FFTs, and Wigner D-matrix transforms.
  • $SE(2)$- and $Sim(2)$-equivariant models: stacking features over spatial, orientation, and scale dimensions for roto-translation and scale invariance (Smets, 2024, Qiao et al., 2023, Knigge et al., 2021).

3. Kernel Parameterizations, Weight-Tying, and Separability

3.1 Parametrizing Convolution Kernels

Bespoke parameterizations are central to computational tractability on continuous groups. Empirical studies show that most classical G-CNNs learn redundant filters across subgroup indices; the spatial patterns of, e.g., $SE(2)$ group kernels are highly correlated across orientations. Separable group convolution kernels factorize as $k(x, h) = k_{\mathbb{R}^2}(x)\, k_H(h)$, enabling dramatic reductions in parameter count and inference time (Knigge et al., 2021).
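The compute saving from separability can be seen directly: with a rank-1 kernel, the 2-D group correlation factorizes into a spatial pass followed by a subgroup pass. A toy sketch on $\mathbb{Z}_6 \times \mathbb{Z}_4$ (hypothetical values, not from the cited paper):

```python
n_x, n_h = 6, 4
k_spatial = [0.5, -1.0, 0.25, 0.0, 0.0, 0.0]   # k_{R^2}(x)
k_group = [1.0, 0.5, 0.0, 0.0]                 # k_H(h)
f = [[(x * 7 + h * 3) % 5 - 2.0 for h in range(n_h)] for x in range(n_x)]

# Unconstrained evaluation of the rank-1 kernel k(x, h) =
# k_spatial(x) * k_group(h): n_x * n_h multiplies per output element.
full = [[sum(f[(x + a) % n_x][(h + b) % n_h] * k_spatial[a] * k_group[b]
             for a in range(n_x) for b in range(n_h))
         for h in range(n_h)] for x in range(n_x)]

# Separable evaluation: spatial pass, then subgroup pass
# (n_x + n_h multiplies per output element).
tmp = [[sum(f[(x + a) % n_x][h] * k_spatial[a] for a in range(n_x))
        for h in range(n_h)] for x in range(n_x)]
sep = [[sum(tmp[x][(h + b) % n_h] * k_group[b] for b in range(n_h))
        for h in range(n_h)] for x in range(n_x)]
```

The parameter count drops the same way: $n_x + n_h$ learnable values instead of $n_x \cdot n_h$.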

3.2 Channel Mixing and Group Representation Theory

Construction of equivariant layers leverages:

  • Channel mixing via $1 \times 1$ convolutions or linear layers.
  • For general Lie groups, explicit representations (found via the FindRep algorithm) and Clebsch-Gordan coefficients enable construction of tensor-product nonlinearities and precise equivariant mappings (Shutty et al., 2020).

4. Benefits and Empirical Advantages

Parameter Efficiency

Lie group weight-tying (a single kernel shared across all group elements) reduces parameter counts by factors proportional to the group order or volume. For larger symmetry groups (e.g., $Sim(2)$), the difference can be orders of magnitude (Smets, 2024, Knigge et al., 2021).

Generalization and Stability

Imposing exact equivariance constrains the hypothesis space, reducing VC dimension and covering numbers. This tightens generalization bounds and empirically improves out-of-distribution robustness, evidenced in tasks such as rotated-MNIST, medical image analysis, and 3D object classification (Smets, 2024, Knigge et al., 2021).

Test-time augmentation is rendered unnecessary: the network's outputs transform predictably under group actions by construction, boosting stability and interpretability.

Performance Benchmarks

On rotation- and scale-augmented datasets (rotated-MNIST, Galaxy10, blood cell images), Lie group-based CNNs combine data efficiency with accuracy, often achieving state-of-the-art or near state-of-the-art results with strict parameter discipline (Qiao et al., 2023, Knigge et al., 2021). For example:

  • On blood cell images, a $Sim(2)$ Lie G-CNN achieved 97.5% accuracy vs. 93.7% for dilated convolutions (Qiao et al., 2023).
  • On rotated CIFAR-10, an $SO(2)$ Lie G-CNN attained 99.9% accuracy (Qiao et al., 2023).

5. Algorithmic and Practical Considerations

Algorithmic steps for an $SE(2)$-CNN (Smets, 2024):

  1. Fix $G$ and discretize (e.g., a grid in translation, rotations at $E$ angles).
  2. Lifting: rotate the rasterized planar kernels to each of the $E$ sampled orientations.
  3. Build equivariant group convolution layers: $1 \times 1$ in-channel mixing, $3 \times 3 \times E$ group convolution.
  4. Insert $G$-equivariant nonlinearities (pointwise ReLU) and orientation pooling as necessary.
  5. Projection: pool/integrate over the $H$-orbit (e.g., max over orientations).
  6. Initialize weights as in standard CNNs (Glorot, He).
  7. Train with SGD/Adam. Codebases: E2CNN, LieConv.

Sampling and Approximation

  • Non-compact or large groups require discretization and truncation for tractability.
  • Random sampling over compact subgroups yields equivariance in expectation (unbiased estimates of the group integrals) and improved empirical results.
  • Block-diagonal or localized kernels restrict support, reducing compute and overfitting (Bekkers, 2019, Knigge et al., 2021).
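Random sampling can be sketched for $SO(2)$: averaging a function over uniformly sampled rotation angles is a Monte Carlo estimate of the Haar integral, and the averaged result is invariant by construction (an illustrative example, not from the cited papers):

```python
import math
import random

random.seed(0)

def haar_average(f, p, n_samples=20000):
    """Monte Carlo estimate of the Haar integral over SO(2):
    average f over rotations at uniformly sampled angles."""
    total = 0.0
    for _ in range(n_samples):
        theta = random.uniform(0.0, 2.0 * math.pi)
        c, s = math.cos(theta), math.sin(theta)
        total += f((c * p[0] - s * p[1], s * p[0] + c * p[1]))
    return total / n_samples

# Averaging over the group makes any function invariant; for
# f(q) = q_x^2 and p = (1, 0) the exact Haar average is
# E[cos^2(theta)] = 1/2.
est = haar_average(lambda q: q[0] ** 2, (1.0, 0.0))
```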

6. Extensions: Nonlinear Activations, Spectral Methods, and Higher Structures

Nonlinearities

Pointwise activations (e.g., ReLU) are $G$-equivariant. To achieve richer nonlinearity for representation-valued features, tensor-product nonlinearities with Clebsch-Gordan projections are used, ensuring equivariance across all layers (Shutty et al., 2020).
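That pointwise activations commute with the group action (which merely relocates function values) can be verified directly; an illustrative snippet using cyclic shifts as the action:

```python
def relu(v):
    """Pointwise nonlinearity, applied value by value."""
    return [max(0.0, x) for x in v]

def shift(f, s):
    """Action of s in Z_n on a feature map: (s . f)(h) = f(h - s)."""
    n = len(f)
    return [f[(h - s) % n] for h in range(n)]

f = [0.3, -1.2, 2.0, -0.5]
# ReLU acts on values, the group acts on positions, so they commute:
lhs = relu(shift(f, 1))
rhs = shift(relu(f), 1)
```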

Spectral and Harmonic Methods

For groups such as $SO(2)$ and $SO(3)$, fast transforms (FFT, Wigner D-matrices) allow spectral convolution and efficient computation of group Fourier integrals (Smets, 2024).
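For $SO(2)$ discretized to $\mathbb{Z}_n$, the group Fourier transform reduces to the DFT, and the convolution theorem turns group convolution into pointwise multiplication of spectra. A self-contained sketch (direct $O(n^2)$ DFT for clarity; real implementations use an FFT):

```python
import cmath

def dft(v):
    """Direct discrete Fourier transform."""
    n = len(v)
    return [sum(v[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def idft(V):
    """Inverse DFT."""
    n = len(V)
    return [sum(V[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n for t in range(n)]

def group_conv(f, k):
    """Group convolution on Z_n: (k * f)(h) = sum_g k(h - g) f(g)."""
    n = len(f)
    return [sum(k[(h - g) % n] * f[g] for g in range(n)) for h in range(n)]

f = [1.0, 0.0, 2.0, -1.0]
k = [0.5, 0.25, 0.0, 0.25]
direct = group_conv(f, k)
# Convolution theorem: transform, multiply spectra pointwise, invert.
spectral = [z.real for z in idft([F * K for F, K in zip(dft(f), dft(k))])]
```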

Transformers and Attention

Group-equivariant self-attention (LieTransformer) generalizes convolutional equivariance by lifting inputs to $G/H$, applying attention on $G$, and stacking multiple equivariant layers before pooling. An explicit proof of equivariance was given for general unimodular Lie groups (Hutchinson et al., 2020).

7. Applications Beyond Vision: Control, Physics, Quantum, and Representation Learning

Control and Robotics

Tracking control on matrix Lie groups (e.g., $SE(3)$ for rigid-body formation) leverages left-invariance of error signals, avoiding coordinate singularities while maintaining a global search over network parameters. Robust stability is proven via Lyapunov techniques (Chhabra et al., 7 May 2025).

Quantum Circuits

Lie group dual representations ($SU(2^n)$ and $\mathfrak{su}(2^n)$) facilitate geometry-aware parameter pruning in quantum neural networks, yielding up to $10\times$ compression with provable error bounds (Shao et al., 10 Dec 2025).

Self-supervised and Generative Models

Manifold-based contrastive learning injects Lie group-generated feature augmentations, parametrized via global generator matrices and learned sparsity-promoting coefficients, providing geometric structure to contrastive SSL and semi-supervised learning (Fallah et al., 2023). Generative auto-encoders use exponential mapping layers on group manifolds (e.g., UTDAT for Gaussian distributions) (Gong et al., 2019).

Scientific Computing and Symmetry-informed PINNs

Physics-Informed Neural Networks (PINNs) incorporating Lie symmetry generators in loss function construction yield order-of-magnitude reductions in error and improved generalization for PDEs possessing continuous symmetries (Shah et al., 30 Sep 2025).


Summary Table: Key Features in Lie Group-Based Neural Network Design

| Component | Description | Citation |
|---|---|---|
| Group convolution | Generalizes translation equivariance to $G$-equivariance via a group integral | (Smets, 2024) |
| Weight-tying | All locations/orientations share a single group kernel | (Smets, 2024) |
| B-spline/Lie-algebra kernels | Flexible basis expansion and parameterization of continuous kernels via $\exp/\log$ | (Bekkers, 2019) |
| Lifting/Projection | Lift input to $G$ or $G/H$, then project back via $H$-invariant pooling | (Smets, 2024) |
| Separable kernels | Factorization $k(x, h) = k_{\mathbb{R}^2}(x)\, k_H(h)$ to reduce parameter and compute costs | (Knigge et al., 2021) |
| Nonlinearities | Pointwise ReLU, or Clebsch-Gordan/tensor-product constructions for representation-valued features | (Shutty et al., 2020) |
| Self-attention (LieTransformer) | Group-equivariant transformer leveraging lifted inputs | (Hutchinson et al., 2020) |

