- The paper introduces LAG-XAI, an affine-Lie framework that models paraphrasing as global affine transformations in Transformer embedding spaces.
- The paper decomposes semantic transitions into rotation, deformation, and translation, revealing invariant descriptors such as rotation angle and deformation index.
- The paper demonstrates robust performance in anomaly detection and cross-domain generalization, validated through empirical evaluation on multiple datasets.
Introduction
This work introduces LAG-XAI, an affine-Lie geometric framework for interpretable paraphrasing in Transformer latent spaces. By formulating paraphrasing as a global affine transformation in the embedding space, LAG-XAI provides a continuum-theoretic and parametric mechanism for understanding semantic transitions, grounded in Lie group theory. The approach decomposes semantic changes into rotation, deformation, and translation within a high-dimensional manifold, thereby yielding interpretable XAI descriptors. Empirical evaluation is carried out on social media data (PIT-2015, TURL) using SBERT embeddings, with additional validation on hallucination detection (HaluEval).
LAG-XAI posits that semantic paraphrasing is instantiated as an affine transformation T(x)=Ax+t, approximating the local action of the Lie group Aff(n) on sentence embedding manifolds. Under the manifold hypothesis, semantic transitions are modeled as integral flows along an invariant vector field, leveraging the spatial homogeneity of transformer-induced semantic subspaces.
Lie group-inspired techniques, including polar decomposition and principal component analysis (PCA) of drift vectors, provide empirical surrogates for infinitesimal generators (Lie algebra elements). The transformation matrix A and translation vector t are estimated through geometrically regularized least-squares, incorporating both isometry-promoting Procrustes alignment and constraint to empirical semantic drift directions.
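The estimation step can be illustrated with a minimal sketch. It assumes paired source and paraphrase embeddings stored as rows of X and Y, and it replaces the paper's geometric regularization with a simple ridge penalty that pulls A toward the identity. The function name fit_affine and the ridge parameter are hypothetical, chosen for illustration only.

```python
import numpy as np

def fit_affine(X, Y, ridge=1e-3):
    """Fit Y ~ X @ A.T + t by ridge-regularized least squares.

    X, Y: (m, n) arrays of source and paraphrase embeddings (one pair per row).
    The ridge term penalizes ||A - I||_F^2; this is an assumed stand-in for
    the geometric regularization described in the text, not the paper's method.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my              # centering removes the translation
    n = X.shape[1]
    eye = np.eye(n)
    # Normal equations of ||Xc B - Yc||^2 + ridge * ||B - I||^2 with B = A^T:
    #   (Xc^T Xc + ridge I) B = Xc^T Yc + ridge I
    B = np.linalg.solve(Xc.T @ Xc + ridge * eye, Xc.T @ Yc + ridge * eye)
    A = B.T
    t = my - A @ mx                      # optimal translation given A
    return A, t
```

With a small ridge value the fit approaches ordinary least squares; larger values keep A closer to the identity, which is one simple way to discourage spurious deformation.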
Mechanistic Interpretability and Invariant Extraction
Unlike static similarity metrics, the affine model supports the decomposition of paraphrase transitions, via polar decomposition of A, into orthogonal (rotation R), symmetric (deformation S), and translation (t) components. Four statistical invariants yield a structured XAI profile of each semantic transformation: rotation angle θ, deformation index (Def), translation norm (Shift), and determinant sign (chirality).
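A sketch of how such a profile might be computed from a fitted (A, t) follows. The polar decomposition is obtained from the SVD, which is standard. The specific summaries are assumptions, since the text does not give formulas: θ is taken as the mean absolute eigen-angle of R, and Def as the Frobenius distance of S from the identity scaled by √n. The function name affine_profile is hypothetical.

```python
import numpy as np

def affine_profile(A, t):
    """Polar-decompose A = R @ S and summarize the affine map (A, t)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    U, sigma, Vt = np.linalg.svd(A)
    R = U @ Vt                         # orthogonal factor (may include a reflection)
    S = Vt.T @ np.diag(sigma) @ Vt     # symmetric positive semi-definite factor
    # Assumed definition: mean absolute angle of R's unit-circle eigenvalues.
    angles = np.abs(np.angle(np.linalg.eigvals(R)))
    theta_deg = float(np.degrees(angles.mean()))
    # Assumed definition: scaled Frobenius distance of S from the identity.
    deformation = float(np.linalg.norm(S - np.eye(n)) / np.sqrt(n))
    shift = float(np.linalg.norm(t))
    sign, _ = np.linalg.slogdet(A)     # sign of det(A), stable in high dimensions
    return {"theta_deg": theta_deg, "def": deformation,
            "shift": shift, "chirality": int(sign)}
```

For a pure 2D rotation by 30° with t = (3, 4), this returns an angle of 30°, zero deformation, a shift of 5, and chirality +1; a reflection such as diag(1, -1) gives chirality -1.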
Key empirical phenomena include:
- Linear Transparency: Approximately 80% of SBERT’s effective paraphrase discrimination capacity arises from a single affine transformation, indicating a linearizable structure in transformer latent spaces.
- Local Isometry: The semantic deformation index approaches zero under optimal regularization, supporting the view that legitimate paraphrasing acts as a near-isometry (a distance-preserving, and therefore volume-preserving, transformation) on the semantic manifold.
- Geometric Constant: The mean structural reconfiguration angle (θ ≈ 27.84°) is nearly invariant across corpora and random seeds.
- Semantic Chirality: A negative determinant (det(A) < 0) indicates that logically inverting operations (e.g., active-passive transformations, argument inversion) correspond to mirror reflections in embedding space.
Empirical Results and Strong Numerical Findings
On the PIT-2015 dev set, LAG-XAI's affine operator achieves an AUC of 0.7713, compared with 0.8405 for the SBERT cosine baseline. Measured above random chance (AUC = 0.5), the affine operator captures approximately 80% of the baseline's margin. The deformation index is tightly bounded (Def on the order of 0.00025), substantiating the local isometry hypothesis.
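One reading of the 80% figure that is consistent with the two reported AUC values is a chance-normalized ratio: the affine operator's margin over AUC = 0.5 divided by the baseline's margin. The helper name below is hypothetical, and this interpretation is an assumption rather than a formula stated in the source.

```python
def chance_normalized_share(auc_model, auc_baseline, auc_chance=0.5):
    """Fraction of the baseline's above-chance AUC margin that the model retains."""
    return (auc_model - auc_chance) / (auc_baseline - auc_chance)

# Reported values: affine operator 0.7713, SBERT cosine baseline 0.8405.
share = chance_normalized_share(0.7713, 0.8405)
print(f"{share:.1%}")  # 79.7%
```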
The framework robustly generalizes cross-domain:
- Global consensus operators trained on PIT-2015 generalize to TURL with minimal performance degradation, indicating that identified invariants characterize intrinsic, architecture-dependent properties of the semantic manifold.
- Domain-specific (local) models exhibit severe overfitting and poor generalization, emphasizing the necessity of global invariant modeling.
Hallucination detection via geometric anomaly scoring is highly effective: on HaluEval, LAG-XAI identifies 95.3% of hallucinated samples (F1 = 92.8%) in a strict zero-shot regime, using geometric error thresholds alone.
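A geometric error check of this kind might look like the sketch below. It assumes the error is the Euclidean residual between the predicted image A x + t and the observed embedding y, and that the threshold is calibrated as a quantile of residuals on trusted pairs. Both choices, and all function names, are illustrative assumptions; the source does not specify the exact scoring rule.

```python
import numpy as np

def geometric_error(A, t, x, y):
    """Residual norm ||A x + t - y|| for each row pair of x and y."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    y = np.atleast_2d(np.asarray(y, dtype=float))
    return np.linalg.norm(x @ A.T + t - y, axis=1)

def calibrate_threshold(trusted_errors, q=0.95):
    """Assumed calibration: the q-quantile of residuals on known-good pairs."""
    return float(np.quantile(trusted_errors, q))

def flag_anomalies(A, t, x, y, threshold):
    """True where the pair deviates from the affine model by more than threshold."""
    return geometric_error(A, t, x, y) > threshold
```

Each check costs one matrix-vector product per pair, which is what makes this style of monitoring inexpensive relative to querying a generative model.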
Theoretical and Practical Implications
LAG-XAI’s geometric perspective treats semantic similarity as structured, decomposable motion rather than as an opaque pointwise distance. The explicit identification of invariant directions and geometric boundaries supports:
- Geometrically equivariant Transformer architectures: Loss functions can regularize toward learned Lie subspaces, facilitating inherent paraphrase-invariance and robustness.
- Automated anomaly and hallucination detection: The "cheap geometric check" is computationally lightweight and effective for online monitoring, obviating expensive autoregressive LLM querying.
- Controlled latent space augmentation: Embedding adjustment via (A, t) allows continuous, interpretable generation and stylization without post-hoc token manipulation.
The distinction between rotation (syntactic restructuring) and translation (pragmatic drift) additionally enables fine-grained control and auditing in XAI settings.
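As a rough illustration of controlled adjustment via (A, t), the sketch below moves an embedding a fraction s of the way toward its affine image. Straight-line blending is an assumption made for simplicity; a Lie-group treatment would more naturally interpolate along the group flow, which is not shown here. The function name is hypothetical.

```python
import numpy as np

def blend_toward_paraphrase(x, A, t, s):
    """Return (1 - s) * x + s * (A x + t) for a single embedding x.

    s = 0 leaves x unchanged; s = 1 applies the full affine map.
    """
    x = np.asarray(x, dtype=float)
    return (1.0 - s) * x + s * (A @ x + t)
```

Intermediate values of s give a continuous path between the original embedding and its transformed counterpart, which can then be inspected or decoded by whatever downstream component is in use.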
Limitations
The affine model is fundamentally a first-order local approximation; long-range or highly non-linear transformations (deep summarization, multi-paragraph restructuring) are not encompassed. PCA- and Procrustes-based generator estimation is an empirically tractable but mathematically approximate surrogate for a true Lie-algebraic mapping, particularly in high dimensions. Generality claims are currently supported only for SBERT-based architectures and English; further work is needed for decoder-only LLMs and morphologically rich languages.
Future Directions
Subsequent research should pursue:
- Integration of geometric invariants into LLM training, yielding paraphrase-equivariant or -invariant architectures.
- Extension to piecewise-affine or higher-order deformation models for capturing strong non-linearities.
- Validation across architectures and languages, potentially extracting universal geometric constants for semantic manifold modeling.
- Real-time integration into LLM pipelines for dynamic interruption of hallucinations during generation, enforcing semantic guardrails.
Conclusion
LAG-XAI demonstrates that structured affine Lie group actions can explain a substantial portion of semantic motion in Transformer embedding spaces. This approach yields explicit, interpretable geometric invariants, achieves high accuracy and robustness under resource constraints, and supports both mechanistic interpretability and practical anomaly detection. The framework lays the groundwork for a geometric theory of NLP tasks, facilitating the transition from opaque statistical measures to controllable, physics-informed semantics operating directly in latent space.
Reference:
“LAG-XAI: A Lie-Inspired Affine Geometric Framework for Interpretable Paraphrasing in Transformer Latent Spaces” (2604.06086)