Symplectic Convolutional Modules
- Symplectic convolutional modules are convolutional layers that preserve canonical symplectic structure, ensuring accurate modeling of Hamiltonian dynamics.
- They integrate tensor techniques and structured Toeplitz matrices with novel parameterization strategies to build scalable, structure-preserving autoencoders.
- Empirical studies on wave, nonlinear Schrödinger, and sine-Gordon systems show that these modules significantly outperform linear PSD-based reductions.
Symplectic convolutional modules are architectural primitives that combine convolutional operations with symplectic structure preservation, specifically formulated for neural networks modeling Hamiltonian dynamics or conservative systems. These modules underpin the construction of symplectic convolutional neural networks (CNNs), wherein each convolutional and pooling layer is explicitly designed to respect the canonical symplectic form, thereby ensuring structure-preserving evolution throughout the network. The methodology integrates tensor techniques, proper symplectic decomposition, and novel parameterization strategies to yield efficient, scalable, and mathematically rigorous neural models for high-dimensional, Hamiltonian, and wave-like partial differential equations (Yıldız et al., 27 Aug 2025).
1. Mathematical Foundation of Symplectic Convolutional Layers
A fundamental goal is to recast standard discrete convolutions as linear symplectic maps. For a 1D multi-channel input, flattened into a single vector $x$, the convolutional output can be written as $y = Kx$, where $K$ is a block matrix constructed from (zero-padded) Toeplitz matrices. The symplecticity constraint for a real matrix $W$ imposes $W^\top J_{2n} W = J_{2n}$, with $J_{2n} = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}$.
A sufficient parametric form for a symplectic block $W \in \mathbb{R}^{2n \times 2n}$ is
$$W = \begin{pmatrix} I_n & S \\ 0 & I_n \end{pmatrix},$$
with $S$ symmetric, guaranteeing $W^\top J_{2n} W = J_{2n}$. Stacking such blocks enables arbitrary (even non-square) channel manipulations while preserving symplecticity.
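The unit-triangular construction above is easy to check numerically. The sketch below (NumPy/SciPy; helper names are ours, not the paper's) builds a symmetric Toeplitz block $S$ from a short kernel and verifies that the resulting $W$ preserves the canonical symplectic form:

```python
import numpy as np
from scipy.linalg import toeplitz

def J(n):
    """Canonical symplectic matrix J_{2n} = [[0, I], [-I, 0]]."""
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

def unit_upper_symplectic(S):
    """W = [[I, S], [0, I]] is symplectic whenever S is symmetric."""
    n = S.shape[0]
    assert np.allclose(S, S.T), "S must be symmetric"
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[I, S], [Z, I]])

# Symmetric Toeplitz S generated by a short convolution kernel.
n, kernel = 6, np.array([1.0, 0.5, 0.25])
col = np.zeros(n)
col[:kernel.size] = kernel
S = toeplitz(col)          # S[i, j] = col[|i - j|]

W = unit_upper_symplectic(S)
print(np.allclose(W.T @ J(n) @ W, J(n)))  # True: W preserves the symplectic form
```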
Families of structured matrices are introduced for generalization:
- all Toeplitz matrices,
- symmetric Toeplitz matrices,
- block-Toeplitz matrices with Toeplitz blocks,
- block-symmetric variants,
- analogous constructions for multi-dimensional cases.
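To make the Toeplitz viewpoint concrete, the following sketch (our own illustration; the padding convention is an assumption, not necessarily the paper's) builds the zero-padded Toeplitz matrix of a 1D convolution and checks it against `np.convolve`:

```python
import numpy as np
from scipy.linalg import toeplitz

def conv_toeplitz(w, n):
    """Full-convolution matrix T with T @ x == np.convolve(w, x) for len(x) == n."""
    m = len(w)
    col = np.concatenate([w, np.zeros(n - 1)])       # first column: kernel, zero-padded
    row = np.concatenate([[w[0]], np.zeros(n - 1)])  # first row
    return toeplitz(col, row)                        # shape (n + m - 1, n)

w = np.array([1.0, -2.0, 1.0])
x = np.array([3.0, 1.0, 4.0, 1.0, 5.0])
T = conv_toeplitz(w, len(x))
print(np.allclose(T @ x, np.convolve(w, x)))  # True
```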
Symplectic convolutional lifting matrices are defined (Def. 7) to provide a formal foundation for the encoder and decoder blocks.
2. Parameterization via Symplectic Neural Networks (SympNets)
The SympNet approach composes deep stacks of simple symplectic maps, extending beyond individually constrained convolutional blocks. The linear–activation SympNet (LA-SympNet) alternates two module types:
- Linear modules: Employ symmetric Toeplitz (or block-Toeplitz) convolutional operators,
- Activation modules: Use potential-based pointwise nonlinearities.
Linear modules take the form
$$\begin{pmatrix} p \\ q \end{pmatrix} \mapsto \begin{pmatrix} I & S \\ 0 & I \end{pmatrix} \begin{pmatrix} p \\ q \end{pmatrix},$$
with $S$ symmetric, and similarly for the lower-triangular variant, with the symmetric block in the opposite off-diagonal position.
Nonlinear "activation modules" act as
$$\begin{pmatrix} p \\ q \end{pmatrix} \mapsto \begin{pmatrix} p \\ q + a \odot \sigma(p) \end{pmatrix},$$
where $\sigma$ is the elementwise derivative of a scalar potential, so the off-diagonal Jacobian block $\mathrm{diag}(a \odot \sigma'(p))$ is diagonal. All such modules, by construction, satisfy symplecticity.
Symplectic convolutional layers replace dense identity blocks with identity-convolutions, and symmetric dense transformations with symmetric Toeplitz convolutions. Activation modules use standard pointwise nonlinearities.
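The symplecticity of an activation module can be verified numerically. This sketch (our illustration; `tanh` stands in for the potential gradient, and the helper names are ours) finite-differences the module's Jacobian and checks the symplectic condition:

```python
import numpy as np

def activation_module(p, q, a, sigma=np.tanh):
    """Lower activation module: (p, q) -> (p, q + a * sigma(p)).
    Its Jacobian [[I, 0], [diag(a * sigma'(p)), I]] is symplectic because
    the off-diagonal block is diagonal, hence symmetric."""
    return p, q + a * sigma(p)

def jacobian(f, z, eps=1e-6):
    """Central-difference Jacobian of f: R^m -> R^m."""
    m = z.size
    Jf = np.zeros((m, m))
    for i in range(m):
        e = np.zeros(m)
        e[i] = eps
        Jf[:, i] = (f(z + e) - f(z - e)) / (2 * eps)
    return Jf

n = 4
rng = np.random.default_rng(0)
a, z = rng.standard_normal(n), rng.standard_normal(2 * n)

def flat(z):
    p2, q2 = activation_module(z[:n], z[n:], a)
    return np.concatenate([p2, q2])

Jmat = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])
G = jacobian(flat, z)
print(np.allclose(G.T @ Jmat @ G, Jmat, atol=1e-6))  # True, up to finite-difference error
```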
3. Symplectic Pooling and Unpooling Layers
Standard max-pooling is nonlinear and, in general, not symplectic. The formulation therefore introduces a symplectic pooling operation: at a fixed input $x$, the Jacobian of max-pooling is a linear selection map, and this linearization is used to define a structure-preserving pooling step. Splitting $x = (p, q)$ into position and momentum components, pooling is applied blockwise to $p$ and $q$, with each operation admitting a symplectic inverse relative to an appropriate lift. The corresponding unpooling operation is defined analogously, ensuring that latent representations can be symplectically decoded.
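The basic ingredient of this linearization can be illustrated in isolation (our own sketch, not the paper's full symplectic pooling): at a fixed input with unique window maxima, max-pooling acts as a 0/1 selection matrix whose rows are orthonormal:

```python
import numpy as np

def maxpool_selection(x, w):
    """Selection matrix P of max-pooling with window size w at fixed input x:
    P @ x equals the max-pooled vector, and P is the pooling Jacobian
    wherever the argmax in each window is unique."""
    n_out = len(x) // w
    P = np.zeros((n_out, len(x)))
    for i in range(n_out):
        j = i * w + np.argmax(x[i * w:(i + 1) * w])
        P[i, j] = 1.0
    return P

x = np.array([0.3, 1.2, -0.5, 0.7, 2.0, -1.0])
P = maxpool_selection(x, 2)
print(P @ x)                             # [1.2 0.7 2. ] -- the pooled values
print(np.allclose(P @ P.T, np.eye(3)))   # True: rows are orthonormal
```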
4. Symplectic Autoencoder Architecture
A full symplectic autoencoder assembles the above modules into encoder and decoder compositions. The encoder is structured as a composition of symplectic convolutional layers, activation modules, and symplectic pooling layers; the decoder is the mirror image, utilizing transpose-convolutions, unpooling, and PSD-like lifting. Each constituent layer is individually symplectic; therefore, the entire autoencoder is symplectic by construction (Props. 3.1–3.4 and Def. 14) (Yıldız et al., 27 Aug 2025).
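The closure argument — a composition of symplectic maps is symplectic — can be checked on a toy stack of alternating unit-triangular linear modules (a minimal sketch using random symmetric blocks; names ours):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
I, Z = np.eye(n), np.zeros((n, n))
Jmat = np.block([[Z, I], [-I, Z]])

def random_symmetric(n):
    A = rng.standard_normal((n, n))
    return (A + A.T) / 2

# Alternating upper/lower unit-triangular symplectic blocks, LA-SympNet style.
layers = []
for k in range(4):
    S = random_symmetric(n)
    if k % 2 == 0:
        layers.append(np.block([[I, S], [Z, I]]))   # upper module
    else:
        layers.append(np.block([[I, Z], [S, I]]))   # lower module

E = np.linalg.multi_dot(layers)  # "encoder" as a composition of symplectic maps
print(np.allclose(E.T @ Jmat @ E, Jmat))  # True: the composition stays symplectic
```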
5. Linear Baseline: Proper Symplectic Decomposition (PSD)
For linear structure-preserving autoencoding, proper symplectic decomposition (PSD) provides a baseline. PSD, based on a symplectic SVD variant (Peng & Mohseni 2016), seeks a symplectic basis $A \in \mathbb{R}^{2n \times 2k}$ such that $A^\top J_{2n} A = J_{2k}$. Full-state projections yield low-rank, linear, symplectic reductions. This PSD autoencoder is employed as the main linear comparator in all numerical benchmarks.
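One standard way to build such a basis is the cotangent lift of Peng & Mohseni; the sketch below assumes that variant (function names ours) and verifies the symplecticity condition on random snapshot data:

```python
import numpy as np

def J(n):
    """Canonical symplectic matrix J_{2n}."""
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[Z, I], [-I, Z]])

def psd_cotangent_lift(Q, P, k):
    """Cotangent-lift PSD: take k leading left singular vectors Phi of the
    stacked (q, p) snapshots and lift to A = blkdiag(Phi, Phi),
    which satisfies A^T J_{2n} A = J_{2k}."""
    Phi, _, _ = np.linalg.svd(np.hstack([Q, P]), full_matrices=False)
    Phi = Phi[:, :k]
    n = Phi.shape[0]
    Z = np.zeros((n, k))
    return np.block([[Phi, Z], [Z, Phi]])

rng = np.random.default_rng(2)
n, m, k = 8, 20, 3
Q, P = rng.standard_normal((n, m)), rng.standard_normal((n, m))
A = psd_cotangent_lift(Q, P, k)
print(np.allclose(A.T @ J(n) @ A, J(k)))  # True: the lift is symplectic
```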
6. Numerical Performance: Empirical Comparison
Empirical evaluation spans the 1D wave equation, the 1D cubic nonlinear Schrödinger (NLS) equation, and the 2D sine-Gordon system, each discretized in space and evolved using symplectic time-steppers. The core metrics are the relative Frobenius-norm error for reconstruction and the time-trace error for latent prediction.
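The reconstruction metric is simply $\|X - \hat{X}\|_F / \|X\|_F$ over the snapshot matrix; a trivial helper (names ours) makes this explicit:

```python
import numpy as np

def relative_frobenius_error(X, X_hat):
    """Relative Frobenius-norm reconstruction error ||X - X_hat||_F / ||X||_F."""
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X)

X = np.array([[3.0, 4.0], [0.0, 0.0]])
print(relative_frobenius_error(X, np.zeros_like(X)))  # 1.0 (all signal lost)
```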
A summary of the key results is presented below.
| Case | Latent dim | PSD error | SympCAE error |
|---|---|---|---|
| 1D wave | 1 | | |
| 1D wave | 2 | | |
| 1D wave | 3 | | |
| 1D NLS | 1 | | |
| 1D NLS | 2 | | |
| 1D NLS | 3 | | |
| 2D sine-Gordon | 1 | | |
| 2D sine-Gordon | 2 | | |
| 2D sine-Gordon | 3 | | |
Across all cases, the nonlinear symplectic convolutional autoencoder (SympCAE) consistently and significantly outperforms the linear PSD autoencoder, particularly for small latent dimensions. Time-propagation in latent space via a learned SympNet preserves low time-trace error (Yıldız et al., 27 Aug 2025).
7. Context, Significance, and Outlook
Symplectic convolutional modules realize structure-preserving, high-capacity feature extraction for Hamiltonian and wave systems, enabling end-to-end learning architectures that do not violate physical invariants. The synthesis of convolutional parameterization, symplectic neural network composition, and pooling operations with symplectic inverses yields a general and extensible toolkit for modeling and forecasting conservative dynamics. The substantial superiority of nonlinear, symplectic autoencoders over linear PSD-based reductions for low-dimensional latent spaces demonstrates their efficacy for practical scientific computing scenarios. This framework is extensible to multi-dimensional and heterogeneous-physics settings, suggesting wide applicability in physics-informed machine learning (Yıldız et al., 27 Aug 2025).