
Symplectic Convolutional Modules

Updated 8 February 2026
  • Symplectic convolutional modules are convolutional layers that preserve canonical symplectic structure, ensuring accurate modeling of Hamiltonian dynamics.
  • They integrate tensor techniques and structured Toeplitz matrices with novel parameterization strategies to build scalable, structure-preserving autoencoders.
  • Empirical studies on wave, nonlinear Schrödinger, and sine-Gordon systems show that these modules significantly outperform linear PSD-based reductions.

Symplectic convolutional modules are architectural primitives that combine convolutional operations with symplectic structure preservation, specifically formulated for neural networks modeling Hamiltonian dynamics or conservative systems. These modules underpin the construction of symplectic convolutional neural networks (CNNs), wherein each convolutional and pooling layer is explicitly designed to respect the canonical symplectic form, thereby ensuring structure-preserving evolution throughout the network. The methodology integrates tensor techniques, proper symplectic decomposition, and novel parameterization strategies to yield efficient, scalable, and mathematically rigorous neural models for high-dimensional, Hamiltonian, and wave-like partial differential equations (Yıldız et al., 27 Aug 2025).

1. Mathematical Foundation of Symplectic Convolutional Layers

A fundamental goal is to recast standard discrete convolutions as linear symplectic maps. For a 1D multi-channel input $x \in \mathbb{R}^{C_{\rm in} \times N}$, flattened as $\vec{x} \in \mathbb{R}^{C_{\rm in} N}$, the convolutional output $y \in \mathbb{R}^{C_{\rm out} \times N}$ can be written as

$$\vec{y} = W \vec{x}$$

where $W$ is a block matrix constructed from (zero-padded) Toeplitz matrices $T_{i,j}$. The symplecticity constraint for a real $2n \times 2n$ matrix $M$ imposes $M^T J M = J$, with $J = \begin{bmatrix} 0 & I_n \\ -I_n & 0 \end{bmatrix}$.

A sufficient parametric form for a symplectic $W$ (with $C_{\rm in} = C_{\rm out} = 2$) is

$$W = \begin{bmatrix} I_N & T_{12} \\ 0 & I_N \end{bmatrix}$$

with $T_{12}$ symmetric, guaranteeing $W^T J_{2N} W = J_{2N}$. Stacking such blocks enables arbitrary (even non-square) channel manipulations while preserving symplecticity.
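This symplecticity condition can be verified numerically. The following sketch (illustrative shapes and kernel, not the paper's code) builds $W$ from a symmetric Toeplitz block and checks $W^T J_{2N} W = J_{2N}$:

```python
import numpy as np
from scipy.linalg import toeplitz

# Sketch: W = [[I, T12], [0, I]] with T12 a symmetric Toeplitz matrix
# satisfies W^T J W = J, the symplecticity condition from the text.
N = 6
rng = np.random.default_rng(1)
c = np.zeros(N)
c[:3] = rng.standard_normal(3)      # 3-tap kernel (illustrative choice)
T12 = toeplitz(c)                    # symmetric Toeplitz block: T12 = T12^T

I = np.eye(N)
Z = np.zeros((N, N))
W = np.block([[I, T12], [Z, I]])
J = np.block([[Z, I], [-I, Z]])      # canonical symplectic form

# Holds exactly because T12 is symmetric: W^T J W = [[0, I], [-I, T12^T - T12]].
assert np.allclose(W.T @ J @ W, J)
```

The assertion fails if `T12` is replaced by a non-symmetric matrix, since the lower-right block of $W^T J W$ becomes $T_{12}^T - T_{12} \neq 0$.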

Families of structured matrices are introduced for generalization:

  • $T(N)$: all $N \times N$ Toeplitz matrices,
  • $T_{\rm sym}(N)$: symmetric Toeplitz matrices,
  • $T(b,N)$: $b \times b$ block-Toeplitz matrices with $N \times N$ blocks,
  • $T_{\rm sym}^{\rm 1D}(b,N)$: block-symmetric variants,
  • analogous constructions for multi-dimensional cases.
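As a concrete instance of the $T_{\rm sym}(N)$ family, a symmetric Toeplitz matrix built from a symmetric kernel applies a zero-padded ("same") convolution. A minimal sketch, with an assumed 3-tap kernel:

```python
import numpy as np
from scipy.linalg import toeplitz

# Member of T_sym(N): T[i, j] = h[|i - j|] for a symmetric 3-tap kernel
# (h1, h0, h1). Kernel values are illustrative, not from the paper.
N = 8
h0, h1 = 0.5, 0.25
col = np.zeros(N)
col[0], col[1] = h0, h1
T = toeplitz(col)                    # symmetric Toeplitz matrix

assert np.allclose(T, T.T)

# Applying T equals a zero-padded convolution with the kernel.
x = np.random.default_rng(0).standard_normal(N)
y_conv = np.convolve(x, [h1, h0, h1], mode="same")
assert np.allclose(T @ x, y_conv)
```

This equivalence between matrix application and convolution is what lets the structured-matrix families above be implemented as ordinary convolutional layers.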

Symplectic convolutional lifting matrices are defined (Def. 7) to provide a formal foundation for the encoder and decoder blocks.

2. Parameterization via Symplectic Neural Networks (SympNets)

The SympNet approach composes deep stacks of elementary symplectic maps, extending beyond purely constrained convolutional blocks. The linear–activation SympNet (LA-SympNet) alternates two module types:

  • Linear modules: Employ symmetric Toeplitz (or block-Toeplitz) convolutional operators,
  • Activation modules: Use potential-based pointwise nonlinearities.

Linear modules take the form

$$\mathcal{L}^{\rm up}:(q,p) \mapsto \prod_{i=1}^m \begin{bmatrix} I & 0 \\ S_i & I \end{bmatrix} (q,p), \qquad S_i = S_i^T,$$

and similarly for $\mathcal{L}^{\rm low}$, with the symmetric block placed in the opposite off-diagonal position.

Nonlinear "activation modules" are

$$\mathcal{N}^{\rm up}:(q,p) \mapsto \begin{bmatrix} I & \nabla V \\ 0 & I \end{bmatrix} (q,p)$$

with $V$ a scalar potential and a diagonal (pointwise) block gradient. All such modules, by construction, satisfy symplecticity.

Symplectic convolutional layers replace dense identity blocks with identity-convolutions, and symmetric dense transformations SiS_i with symmetric Toeplitz convolutions. Activation modules use standard pointwise nonlinearities.
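The two module types can be sketched in a few lines. This is a hedged illustration with an assumed potential $V(p) = \sum_i a_i \log\cosh(p_i)$ (so $\nabla V(p) = a \tanh(p)$ acts pointwise); the paper's actual parameterization may differ:

```python
import numpy as np

# "Up" linear module: (q, p) -> (q, p + S q) with S symmetric.
def linear_up(q, p, S):
    assert np.allclose(S, S.T), "S must be symmetric for symplecticity"
    return q, p + S @ q

# "Up" activation module: (q, p) -> (q + grad V(p), p),
# here with V(p) = sum_i a_i * log cosh(p_i), so grad V(p) = a * tanh(p).
def activation_up(q, p, a):
    return q + a * np.tanh(p), p

# Jacobian check for the linear module: M = [[I, 0], [S, I]] satisfies
# M^T J M = J exactly when S = S^T.
n = 4
rng = np.random.default_rng(0)
S = rng.standard_normal((n, n))
S = 0.5 * (S + S.T)                          # symmetrize
I, Z = np.eye(n), np.zeros((n, n))
J = np.block([[Z, I], [-I, Z]])
M = np.block([[I, Z], [S, I]])               # Jacobian of linear_up
assert np.allclose(M.T @ J @ M, J)
```

Because each module is symplectic, any composition of them (the deep LA-SympNet stack) is symplectic as well, with no extra constraint needed at training time.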

3. Symplectic Pooling and Unpooling Layers

Standard max-pooling is nonlinear and, in general, not symplectic. The formulation introduces a symplectic pooling operation wherein, at fixed input $x \in \mathbb{R}^N$, the max-pooling Jacobian $\Phi(x) \in \{0,1\}^{N/k \times N}$ provides a symplectic linearization: $$\mathrm{pool}(x) = \Phi(x)\,x, \qquad \Phi(x)\Phi(x)^T = I_{N/k}.$$

Splitting $x = (q,p)$, the pooling operations are

$$p_{\rm up}(q,p) = (\Phi(q)q,\ \Phi(q)p), \qquad p_{\rm low}(q,p) = (\Phi(p)q,\ \Phi(p)p),$$

each providing symplectic inverses relative to appropriate lifts. The corresponding unpooling operation is analogous, further ensuring that latent representations can be symplectically decoded.
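A minimal sketch of the selection matrix $\Phi(x)$ (illustrative implementation; helper name is not from the paper): each row of $\Phi(x)$ picks the argmax of one pooling window, so $\Phi(x)\Phi(x)^T = I_{N/k}$ holds automatically because distinct rows select distinct entries.

```python
import numpy as np

def pooling_matrix(x, k):
    """0/1 selection matrix Phi(x): row i picks the argmax of window i."""
    N = x.size
    Phi = np.zeros((N // k, N))
    for i in range(N // k):
        j = i * k + np.argmax(x[i * k:(i + 1) * k])   # window argmax
        Phi[i, j] = 1.0
    return Phi

x = np.array([0.1, 0.9, 0.3, 0.7, 0.5, 0.2])
Phi = pooling_matrix(x, k=2)
assert np.allclose(Phi @ x, [0.9, 0.7, 0.5])          # = max-pooling of x
assert np.allclose(Phi @ Phi.T, np.eye(3))            # orthonormal rows

# p_up applies the SAME selection (built from q) to both q and p:
q, p = x, np.arange(6.0)
q_pooled, p_pooled = Phi @ q, Phi @ p
```

The key point is that the same $\Phi(q)$ acts on both $q$ and $p$, which is what makes the paired map a (linearized) symplectic operation rather than two independent poolings.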

4. Symplectic Autoencoder Architecture

A full symplectic autoencoder assembles the above modules into encoder and decoder compositions. The encoder $\psi_{\rm Enc}$ chains symplectic convolution–activation blocks, flattening, symplectic pooling, and a PSD-like projection:

$$\psi_{\rm Enc}:\ \underbrace{\text{Conv} \circ \text{Act} \circ \cdots \circ \text{Act}}_{\mathcal{M}_{\rm Conv}} \ \xrightarrow{\ \text{flatten}\ }\ \underbrace{\text{pool}}_{\mathcal{M}_{\rm P}} \ \longrightarrow\ \underbrace{\text{PSD-like projection}}_{\mathcal{M}_{\rm PSD}}$$

The decoder $\psi_{\rm Dec}$ is the mirror image, utilizing transpose-convolutions, unpooling, and PSD-like lifting. Each constituent layer is individually symplectic; therefore, the entire autoencoder $\Psi_{\rm AE} = \psi_{\rm Dec} \circ \psi_{\rm Enc}$ is symplectic by construction (Props. 3.1–3.4 and Def. 14) (Yıldız et al., 27 Aug 2025).

5. Linear Baseline: Proper Symplectic Decomposition (PSD)

For linear structure-preserving autoencoding, proper symplectic decomposition (PSD) provides a baseline. PSD, based on a symplectic SVD variant (Peng & Mohseni 2016), seeks a symplectic basis

$$A = \begin{bmatrix} \Phi & 0 \\ 0 & \Phi \end{bmatrix}, \qquad \Phi \in \mathbb{R}^{n \times k}, \quad \Phi^T \Phi = I_k,$$

such that $A^T J A = J_{2k}$. Full-state projections $z \mapsto A^{+} z$ yield low-rank, linear, symplectic reductions. This PSD autoencoder is employed as the main linear comparator in all numerical benchmarks.
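A sketch of this cotangent-lift construction (random snapshots stand in for actual simulation data): build $\Phi$ from an SVD of stacked snapshots, form the block-diagonal $A$, and verify $A^T J A = J_{2k}$.

```python
import numpy as np

# PSD-style cotangent lift: Phi from an SVD of stacked (q, p) snapshots.
# Snapshot data here is random, purely for illustration.
n, k, m = 20, 3, 50
rng = np.random.default_rng(0)
Q = rng.standard_normal((n, m))              # q-snapshots
P = rng.standard_normal((n, m))              # p-snapshots

U, _, _ = np.linalg.svd(np.hstack([Q, P]), full_matrices=False)
Phi = U[:, :k]                                # orthonormal: Phi^T Phi = I_k
Z = np.zeros((n, k))
A = np.block([[Phi, Z], [Z, Phi]])            # A = blkdiag(Phi, Phi)

In, Zn = np.eye(n), np.zeros((n, n))
Ik, Zk = np.eye(k), np.zeros((k, k))
J2n = np.block([[Zn, In], [-In, Zn]])
J2k = np.block([[Zk, Ik], [-Ik, Zk]])
assert np.allclose(A.T @ J2n @ A, J2k)        # A is a symplectic lift

# For this block-diagonal A, the symplectic inverse A^+ reduces to A^T.
z = rng.standard_normal(2 * n)
z_red = A.T @ z                               # low-rank symplectic reduction
```

For this cotangent-lift form the symplectic pseudoinverse $A^{+} = J_{2k}^T A^T J_{2n}$ simplifies to $A^T$, since $\Phi$ has orthonormal columns.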

6. Numerical Performance: Empirical Comparison

Empirical evaluation spans the 1D wave equation, 1D cubic nonlinear Schrödinger (NLS), and 2D sine-Gordon systems, each discretized (e.g., $N = 1024$ for 1D, a $100 \times 100$ grid for 2D) and evolved using symplectic time-steppers. The core metrics are relative Frobenius-norm error for reconstruction and time-trace error in latent prediction.
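The reconstruction metric is straightforward; a minimal sketch (the helper name is illustrative):

```python
import numpy as np

def rel_frobenius_error(X, X_hat):
    """Relative Frobenius-norm error between snapshots X and reconstruction X_hat."""
    return np.linalg.norm(X - X_hat, "fro") / np.linalg.norm(X, "fro")

# Sanity checks: perfect reconstruction gives 0, zero reconstruction gives 1.
X = np.eye(4)
assert np.isclose(rel_frobenius_error(X, X), 0.0)
assert np.isclose(rel_frobenius_error(X, np.zeros_like(X)), 1.0)
```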

A summary of the key results is presented below.

| Case | Latent dim $r$ | $\varepsilon_{\rm PSD}$ | $\varepsilon_{\rm SympCAE}$ |
|---|---|---|---|
| 1D wave | 1 | $7.28 \times 10^{-1}$ | $1.47 \times 10^{-2}$ |
| 1D wave | 2 | $3.60 \times 10^{-1}$ | $1.21 \times 10^{-2}$ |
| 1D wave | 3 | $7.20 \times 10^{-2}$ | $9.25 \times 10^{-3}$ |
| 1D NLS | 1 | $1.85 \times 10^{-1}$ | $4.44 \times 10^{-2}$ |
| 1D NLS | 2 | $1.04 \times 10^{-1}$ | $1.35 \times 10^{-2}$ |
| 1D NLS | 3 | $5.21 \times 10^{-2}$ | $1.82 \times 10^{-2}$ |
| 2D sine-Gordon | 1 | $3.74 \times 10^{-1}$ | $1.35 \times 10^{-1}$ |
| 2D sine-Gordon | 2 | $3.07 \times 10^{-1}$ | $5.15 \times 10^{-2}$ |
| 2D sine-Gordon | 3 | $2.55 \times 10^{-1}$ | $7.86 \times 10^{-2}$ |

Across all cases, the nonlinear symplectic convolutional autoencoder (SympCAE) consistently and significantly outperforms the linear PSD autoencoder, particularly for small latent dimension $r$. Time-propagation in latent space via a learned SympNet preserves low time-trace error ($\varepsilon(t) \lesssim 10^{-2}$) (Yıldız et al., 27 Aug 2025).

7. Context, Significance, and Outlook

Symplectic convolutional modules realize structure-preserving, high-capacity feature extraction for Hamiltonian and wave systems, enabling end-to-end learning architectures that do not violate physical invariants. The synthesis of convolutional parameterization, symplectic neural network composition, and pooling operations with symplectic inverses yields a general and extensible toolkit for modeling and forecasting conservative dynamics. The substantial superiority of nonlinear, symplectic autoencoders over linear PSD-based reductions for low-dimensional latent spaces demonstrates their efficacy for practical scientific computing scenarios. This framework is extensible to multi-dimensional and heterogeneous-physics settings, suggesting wide applicability in physics-informed machine learning (Yıldız et al., 27 Aug 2025).
