
Lie Group-Based Neural Networks

Updated 3 February 2026
  • Lie group-based neural networks are architectures that integrate continuous symmetries from Lie groups into layer design to ensure equivariance or invariance to geometric transformations.
  • They leverage Lie algebra structures, exponential mappings, and group convolution techniques to craft efficient models with reduced parameters and enhanced generalization on tasks like rotated-MNIST and CIFAR-10.
  • These networks have practical applications in vision, physics, control, and quantum systems by delivering predictable outputs under rotations, translations, and scaling without extensive data augmentation.

Lie group-based neural networks are a principled class of architectures in which the symmetries of a Lie group $G$ are built directly into the network design, inducing equivariance or invariance to geometric transformations such as rotation, translation, scaling, or more general group actions. These methods leverage the smooth manifold and group structure of $G$, as well as its associated Lie algebra, to define neural layers, kernels, and nonlinearities with guaranteed analytic properties, resulting in models that are more data-efficient, generalize better across symmetry-related inputs, and are robust to group transformations. Their formal mathematical underpinnings, algorithmic constructions, and practical benefits span a wide range of domains, from vision and physics to generative modeling and control.

1. Mathematical Foundations and Core Definitions

A Lie group $G$ is both a smooth $n$-dimensional manifold and a group, such that the group operation $(g_1, g_2) \mapsto g_1 g_2$ and inversion $g \mapsto g^{-1}$ are smooth maps. Common examples are $(\mathbb{R}^n, +)$, the rotation group $SO(n)$, the special Euclidean group $SE(n) = \mathbb{R}^n \rtimes SO(n)$, and the similarity group $Sim(n)$ (Smets, 2024).

The Lie algebra $\mathfrak{g} = T_e G$ is the tangent space at the identity, equipped with a Lie bracket $[X, Y]$ capturing the group's infinitesimal structure. The exponential map $\exp: \mathfrak{g} \to G$ and its (local) inverse $\log: G \to \mathfrak{g}$ connect the local linear structure of $\mathfrak{g}$ to the global non-abelian geometry of $G$. Homogeneous spaces $M = G/H$ are coset spaces on which $G$ acts transitively.
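As a concrete illustration (not drawn from the cited papers), the exponential map on $\mathfrak{so}(2)$ can be evaluated by truncating its power series; for the generator $J$ with $J_{01} = -1$, $J_{10} = 1$ it reproduces the familiar rotation matrix. A minimal pure-Python sketch:

```python
import math

def expm_so2(theta, terms=30):
    """Approximate exp(theta * J) for the so(2) generator
    J = [[0, -1], [1, 0]] by a truncated power series."""
    A = [[0.0, -theta], [theta, 0.0]]
    result = [[1.0, 0.0], [0.0, 1.0]]  # identity = A^0 / 0!
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        # term <- term @ A / k, accumulating A^k / k!
        term = [[sum(term[i][m] * A[m][j] for m in range(2)) / k
                 for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

R = expm_so2(math.pi / 3)
# Closed form: exp(theta * J) is the rotation matrix by angle theta.
expected = [[math.cos(math.pi / 3), -math.sin(math.pi / 3)],
            [math.sin(math.pi / 3),  math.cos(math.pi / 3)]]
```

For higher-dimensional algebras such as $\mathfrak{so}(3)$, closed forms (Rodrigues' formula) or a library matrix exponential are typically used instead of a raw series.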

A group action $\rho: G \times M \to M$ satisfies $\rho(e, p) = p$ and $\rho(g_2, \rho(g_1, p)) = \rho(g_2 g_1, p)$. For functions $f: M \to \mathbb{R}$, the induced action is $(g \cdot f)(p) = f(g^{-1} \cdot p)$. A map $F: M \to N$ is $G$-equivariant if $F(g \cdot p) = g \cdot F(p)$ for all $g \in G$ (Smets, 2024).
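These definitions are easy to check numerically. The sketch below (illustrative only; the function names are ours) verifies that the Euclidean norm is $SO(2)$-invariant and that scalar multiplication is $SO(2)$-equivariant:

```python
import math

def rotate(theta, p):
    """Action of g = R(theta) in SO(2) on a point p in R^2."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def norm(p):
    """A G-invariant map: norm(g . p) = norm(p)."""
    return math.hypot(p[0], p[1])

def scale2(p):
    """A G-equivariant map: scale2(g . p) = g . scale2(p)."""
    return (2.0 * p[0], 2.0 * p[1])

theta, p = 0.8, (0.6, -1.3)
```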

2. Equivariant Architectures and Group Convolutions

2.1 Group Convolution

On $G$ itself, the left Haar measure $dg$ is invariant, and the group convolution (or correlation) of $f, k: G \to \mathbb{R}$ is

$$(k \star_G f)(h) = \int_G k(g^{-1} h)\, f(g)\, dg.$$

For homogeneous spaces $M = G/H$, a lift-and-project approach is used, integrating over $G$-covariant measures (Smets, 2024). In practice, these integrals are evaluated over discrete or sampled group elements.
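On a discretized group the integral becomes a sum. For the cyclic group $\mathbb{Z}_n$, the correlation above and its equivariance can be sketched in a few lines of Python (a toy illustration, not from the cited papers):

```python
def group_corr(k, f):
    """Group correlation on Z_n, matching (k * f)(h) = sum_g k(g^{-1} h) f(g),
    with g^{-1} h = (h - g) mod n in additive notation."""
    n = len(f)
    return [sum(k[(h - g) % n] * f[g] for g in range(n)) for h in range(n)]

def shift(f, s):
    """Left action of s in Z_n on a signal: (s . f)(h) = f(h - s)."""
    n = len(f)
    return [f[(h - s) % n] for h in range(n)]

f = [1.0, 2.0, 0.0, -1.0]
k = [0.5, 0.25, 0.0, 0.25]
# Equivariance: correlating a shifted signal equals shifting the result.
lhs = group_corr(k, shift(f, 2))
rhs = shift(group_corr(k, f), 2)
```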

2.2 Layer Design: Lifting, Equivariant Layers, Pooling

Modern Lie group-based architectures operate via:

  • Lifting layers: mapping functions on $M$ to $G$ by convolving with learnable "mother" kernels, e.g. $f^{(1)}(g) = \int_M \kappa(g^{-1} \cdot x)\, f(x)\, dx$.
  • Equivariant group-convolutional layers: stacking group convolutions with pointwise nonlinearities (e.g., ReLU, which commutes with group action), channel mixing, and weight-sharing enforced across group elements.
  • Projection or pooling: integrating out group variables, typically via max or average pooling along $H$-fibers, to return to $M$ while preserving equivariance (Smets, 2024).
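The lifting layer can be made concrete for the four 90-degree rotations ($C_4$ acting on $\mathbb{Z}^2$): correlate the input with each rotated copy of a single kernel, producing one orientation channel per group element. A hypothetical pure-Python sketch:

```python
def rot90(k):
    """Rotate a square array 90 degrees counter-clockwise."""
    n = len(k)
    return [[k[j][n - 1 - i] for j in range(n)] for i in range(n)]

def corr2d(img, k):
    """'Valid' 2-D cross-correlation."""
    m = len(k)
    out = len(img) - m + 1
    return [[sum(img[i + a][j + b] * k[a][b]
                 for a in range(m) for b in range(m))
             for j in range(out)] for i in range(out)]

def lift_p4(img, k):
    """Lifting layer for C4: one feature map per rotated kernel copy."""
    maps = []
    for _ in range(4):
        maps.append(corr2d(img, k))
        k = rot90(k)
    return maps

img = [[1, 2, 0, 3],
       [0, 1, 4, 1],
       [2, 0, 1, 0],
       [1, 3, 0, 2]]
ker = [[1, 0],
       [0, -1]]
stack = lift_p4(img, ker)
stack_rot = lift_p4(rot90(img), ker)
```

Rotating the input rotates each feature map and cyclically permutes the orientation channels, which is exactly the lifted group action on $G$.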

2.3 Representative Examples

  • Spherical CNNs: $G = SO(3)$ acting on $S^2$, implemented via spherical harmonics, FFTs, and Wigner D-matrix transforms.
  • $SE(2)$- and $Sim(2)$-equivariant models: stacking features over spatial, orientation, and scale dimensions for roto-translation and scale invariance (Smets, 2024, Qiao et al., 2023, Knigge et al., 2021).

3. Kernel Parameterizations, Weight-Tying, and Separability

3.1 Parametrizing Convolution Kernels

Bespoke parameterizations are central to computational tractability on continuous groups. Empirical studies show that most classical G-CNNs learn redundant filters across subgroup indices; the spatial patterns of, e.g., $SE(2)$ group kernels are highly correlated across orientations. Separable group convolution kernels factorize as $k(x, h) = k_{\mathbb{R}^2}(x)\, k_H(h)$, enabling dramatic reductions in parameter count and inference time (Knigge et al., 2021).
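The compute saving from separability can be seen directly: with a rank-1 kernel, the 2-D group correlation factorizes into a spatial pass followed by a subgroup pass. A toy sketch on $\mathbb{Z}_6 \times \mathbb{Z}_4$ (hypothetical values, not from the cited paper):

```python
n_x, n_h = 6, 4
k_spatial = [0.5, -1.0, 0.25, 0.0, 0.0, 0.0]   # k_{R^2}(x)
k_group = [1.0, 0.5, 0.0, 0.0]                 # k_H(h)
f = [[(x * 7 + h * 3) % 5 - 2.0 for h in range(n_h)] for x in range(n_x)]

# Unconstrained evaluation of the rank-1 kernel k(x, h) =
# k_spatial(x) * k_group(h): n_x * n_h multiplies per output element.
full = [[sum(f[(x + a) % n_x][(h + b) % n_h] * k_spatial[a] * k_group[b]
             for a in range(n_x) for b in range(n_h))
         for h in range(n_h)] for x in range(n_x)]

# Separable evaluation: spatial pass, then subgroup pass
# (n_x + n_h multiplies per output element).
tmp = [[sum(f[(x + a) % n_x][h] * k_spatial[a] for a in range(n_x))
        for h in range(n_h)] for x in range(n_x)]
sep = [[sum(tmp[x][(h + b) % n_h] * k_group[b] for b in range(n_h))
        for h in range(n_h)] for x in range(n_x)]
```

The parameter count drops the same way: $n_x + n_h$ learnable values instead of $n_x \cdot n_h$.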

3.2 Channel Mixing and Group Representation Theory

Construction of equivariant layers leverages:

  • Channel mixing via $1 \times 1$ convolutions or linear layers.
  • For general Lie groups, explicit representations (found via the FindRep algorithm) and Clebsch-Gordan coefficients enable construction of tensor-product nonlinearities and precise equivariant mappings (Shutty et al., 2020).

4. Benefits and Empirical Advantages

Parameter Efficiency

Lie group weight-tying (a single kernel shared across all group elements) reduces parameter counts by factors proportional to the group order or volume. For larger symmetry groups (e.g., $Sim(2)$), the difference can be orders of magnitude (Smets, 2024, Knigge et al., 2021).

Generalization and Stability

Imposing exact equivariance constrains the hypothesis space, reducing VC dimension and covering numbers. This tightens generalization bounds and empirically improves out-of-distribution robustness, evidenced in tasks such as rotated-MNIST, medical image analysis, and 3D object classification (Smets, 2024, Knigge et al., 2021).

Test-time augmentation is rendered unnecessary: the network's outputs transform predictably under group actions by construction, boosting stability and interpretability.

Performance Benchmarks

On rotation- and scale-augmented datasets (rotated-MNIST, Galaxy10, blood cell images), Lie group-based CNNs combine data efficiency with accuracy, often achieving state-of-the-art or near state-of-the-art results with strict parameter discipline (Qiao et al., 2023, Knigge et al., 2021). For example:

  • On blood cell images, a $Sim(2)$ Lie G-CNN achieved 97.5% accuracy vs. 93.7% for dilated convolutions (Qiao et al., 2023).
  • On rotated CIFAR-10, an $SO(2)$ Lie G-CNN attained 99.9% accuracy (Qiao et al., 2023).

5. Algorithmic and Practical Considerations

Algorithmic steps for an $SE(2)$-CNN (Smets, 2024):

  1. Fix $G$ and discretize (e.g., a grid in translation, rotations at $E$ angles).
  2. Lifting: rotate the rasterized planar kernels to each of the $E$ sampled orientations.
  3. Build equivariant group convolution layers: $1 \times 1$ in-channel mixing, $3 \times 3 \times E$ group convolution.
  4. Insert $G$-equivariant nonlinearities (pointwise ReLU) and orientation pooling as necessary.
  5. Projection: pool/integrate over the $H$-orbit (e.g., max over orientations).
  6. Initialize weights as in standard CNNs (Glorot, He).
  7. Train with SGD/Adam. Codebases: E2CNN, LieConv.

Sampling and Approximation

  • Non-compact or large groups require discretization and truncation for tractability.
  • Random sampling over compact subgroups yields equivariance in expectation (unbiased estimates of the group integrals) and improved empirical results.
  • Block-diagonal or localized kernels restrict support, reducing compute and overfitting (Bekkers, 2019, Knigge et al., 2021).
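Random sampling can be sketched for $SO(2)$: averaging a function over uniformly sampled rotation angles is a Monte Carlo estimate of the Haar integral, and the averaged result is invariant by construction (an illustrative example, not from the cited papers):

```python
import math
import random

random.seed(0)

def haar_average(f, p, n_samples=20000):
    """Monte Carlo estimate of the Haar integral over SO(2):
    average f over rotations at uniformly sampled angles."""
    total = 0.0
    for _ in range(n_samples):
        theta = random.uniform(0.0, 2.0 * math.pi)
        c, s = math.cos(theta), math.sin(theta)
        total += f((c * p[0] - s * p[1], s * p[0] + c * p[1]))
    return total / n_samples

# Averaging over the group makes any function invariant; for
# f(q) = q_x^2 and p = (1, 0) the exact Haar average is
# E[cos^2(theta)] = 1/2.
est = haar_average(lambda q: q[0] ** 2, (1.0, 0.0))
```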

6. Extensions: Nonlinear Activations, Spectral Methods, and Higher Structures

Nonlinearities

Pointwise activations (e.g., ReLU) are $G$-equivariant. To achieve richer nonlinearity for representation-valued features, tensor-product nonlinearities with Clebsch-Gordan projections are used, ensuring equivariance across all layers (Shutty et al., 2020).
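That pointwise activations commute with the group action (which merely relocates function values) can be verified directly; an illustrative snippet using cyclic shifts as the action:

```python
def relu(v):
    """Pointwise nonlinearity, applied value by value."""
    return [max(0.0, x) for x in v]

def shift(f, s):
    """Action of s in Z_n on a feature map: (s . f)(h) = f(h - s)."""
    n = len(f)
    return [f[(h - s) % n] for h in range(n)]

f = [0.3, -1.2, 2.0, -0.5]
# ReLU acts on values, the group acts on positions, so they commute:
lhs = relu(shift(f, 1))
rhs = shift(relu(f), 1)
```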

Spectral and Harmonic Methods

For groups such as $SO(2)$ and $SO(3)$, fast transforms (FFT, Wigner D-matrices) allow spectral convolution and efficient computation of group Fourier integrals (Smets, 2024).
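For $SO(2)$ discretized to $\mathbb{Z}_n$, the group Fourier transform reduces to the DFT, and the convolution theorem turns group convolution into pointwise multiplication of spectra. A self-contained sketch (direct $O(n^2)$ DFT for clarity; real implementations use an FFT):

```python
import cmath

def dft(v):
    """Direct discrete Fourier transform."""
    n = len(v)
    return [sum(v[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def idft(V):
    """Inverse DFT."""
    n = len(V)
    return [sum(V[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n for t in range(n)]

def group_conv(f, k):
    """Group convolution on Z_n: (k * f)(h) = sum_g k(h - g) f(g)."""
    n = len(f)
    return [sum(k[(h - g) % n] * f[g] for g in range(n)) for h in range(n)]

f = [1.0, 0.0, 2.0, -1.0]
k = [0.5, 0.25, 0.0, 0.25]
direct = group_conv(f, k)
# Convolution theorem: transform, multiply spectra pointwise, invert.
spectral = [z.real for z in idft([F * K for F, K in zip(dft(f), dft(k))])]
```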

Transformers and Attention

Group-equivariant self-attention (LieTransformer) generalizes convolutional equivariance by lifting inputs to $G/H$, applying attention on $G$, and stacking multiple equivariant layers before pooling. An explicit proof of equivariance was given for general unimodular Lie groups (Hutchinson et al., 2020).

7. Applications Beyond Vision: Control, Physics, Quantum, and Representation Learning

Control and Robotics

Tracking control on matrix Lie groups (e.g., $SE(3)$ for rigid-body formation) leverages left-invariance of error signals, avoiding coordinate singularities while maintaining a global search over network parameters. Robust stability is proven via Lyapunov techniques (Chhabra et al., 7 May 2025).

Quantum Circuits

Lie group dual representations ($SU(2^n)$ and $\mathfrak{su}(2^n)$) facilitate geometry-aware parameter pruning in quantum neural networks, yielding up to $10\times$ compression with provable error bounds (Shao et al., 10 Dec 2025).

Self-supervised and Generative Models

Manifold-based contrastive learning injects Lie group-generated feature augmentations, parametrized via global generator matrices and learned sparsity-promoting coefficients, providing geometric structure to contrastive SSL and semi-supervised learning (Fallah et al., 2023). Generative auto-encoders use exponential mapping layers on group manifolds (e.g., UTDAT for Gaussian distributions) (Gong et al., 2019).

Scientific Computing and Symmetry-informed PINNs

Physics-Informed Neural Networks (PINNs) incorporating Lie symmetry generators in loss function construction yield order-of-magnitude reductions in error and improved generalization for PDEs possessing continuous symmetries (Shah et al., 30 Sep 2025).


Summary Table: Key Features in Lie Group-Based Neural Network Design

| Component | Description | Citation |
|---|---|---|
| Group convolution | Generalizes translation equivariance to $G$-equivariance via a group integral | (Smets, 2024) |
| Weight-tying | All locations/orientations share a single group kernel | (Smets, 2024) |
| B-spline/Lie-algebra kernels | Flexible basis expansion and parameterization of continuous kernels via $\exp/\log$ | (Bekkers, 2019) |
| Lifting/Projection | Lift input to $G$ or $G/H$, then project back via $H$-invariant pooling | (Smets, 2024) |
| Separable kernels | Factorization $k(x, h) = k_{\mathbb{R}^2}(x)\, k_H(h)$ to reduce parameter and compute costs | (Knigge et al., 2021) |
| Nonlinearities | Pointwise ReLU, or Clebsch-Gordan/tensor-product constructions for representation-valued features | (Shutty et al., 2020) |
| Self-attention (LieTransformer) | Group-equivariant transformer leveraging lifted inputs | (Hutchinson et al., 2020) |

