
The Curvature Rate λ: A Scalar Measure of Input-Space Sharpness in Neural Networks

Published 3 Nov 2025 in cs.LG (arXiv:2511.01438v1)

Abstract: Curvature influences generalization, robustness, and how reliably neural networks respond to small input perturbations. Existing sharpness metrics are typically defined in parameter space (e.g., Hessian eigenvalues) and can be expensive, sensitive to reparameterization, and difficult to interpret in functional terms. We introduce a scalar curvature measure defined directly in input space: the curvature rate λ, given by the exponential growth rate of higher-order input derivatives. Empirically, λ is estimated as the slope of log ||Dn f|| versus n for small n. This growth-rate perspective unifies classical analytic quantities: for analytic functions, λ corresponds to the inverse radius of convergence, and for bandlimited signals, it reflects the spectral cutoff. The same principle extends to neural networks, where λ tracks the emergence of high-frequency structure in the decision boundary. Experiments on analytic functions and neural networks (Two Moons and MNIST) show that λ evolves predictably during training and can be directly shaped using a simple derivative-based regularizer, Curvature Rate Regularization (CRR). Compared to Sharpness-Aware Minimization (SAM), CRR achieves similar accuracy while yielding flatter input-space geometry and improved confidence calibration. By grounding curvature in differentiation dynamics, λ provides a compact, interpretable, and parameterization-invariant descriptor of functional smoothness in learned models.

Summary

  • The paper introduces the curvature rate λ as a scalar measure for input-space sharpness, shifting focus from parameter-space metrics.
  • It utilizes differentiation dynamics to quantify exponential growth of input derivatives, validated through experiments on Two Moons and MNIST.
  • The study shows that curvature rate regularization boosts confidence calibration without sacrificing accuracy, simplifying network tuning.


Introduction to Curvature Rate λ

The paper "The Curvature Rate λ: A Scalar Measure of Input-Space Sharpness in Neural Networks" presents a novel scalar measure of curvature in neural networks, shifting the focus from parameter-space metrics to input-space dynamics. Traditionally, curvature has been analyzed via Hessian eigenvalues in parameter space, which are computationally expensive and sensitive to reparameterization. The proposed curvature rate $\lambda$ offers a direct input-space alternative based on the exponential growth rate of higher-order input derivatives, providing an interpretable, parameterization-invariant way to quantify functional sharpness.

Theoretical Framework

Differentiation Dynamics

Curvature is characterized through differentiation dynamics, i.e., how a function evolves under repeated differentiation. The curvature rate $\lambda$ is mathematically formalized as

$$\lambda_X(f) = \limsup_{n\to\infty} \frac{1}{n} \log v_n,$$

where $v_n = \|D^n f\|_X$ is the norm of the $n$-th input derivative; in practice, $\lambda$ is estimated as the slope of $\log v_n$ versus $n$ for small $n$. A positive $\lambda$ indicates exponential growth of derivatives, suggesting high-frequency features in the decision boundary of neural networks.
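As a numerical sanity check of this slope-based estimator (a sketch under the assumptions stated here, not the paper's implementation), spectral differentiation of a bandlimited signal recovers $\lambda = \log \Omega$: for $f(x) = \sin(3x)$, the fitted slope of $\log v_n$ versus $n$ should be close to $\log 3 \approx 1.0986$.

```python
import numpy as np

# Sample a bandlimited signal with spectral cutoff Omega = 3 on a periodic grid.
N = 256
xs = 2 * np.pi * np.arange(N) / N
f = np.sin(3 * xs)

F = np.fft.fft(f)
k = np.fft.fftfreq(N, d=1.0 / N)  # integer wavenumbers 0, 1, ..., -2, -1

orders = np.arange(1, 9)
log_norms = []
for n in orders:
    Dn = np.real(np.fft.ifft((1j * k) ** n * F))  # n-th spectral derivative
    log_norms.append(np.log(np.max(np.abs(Dn))))  # log of sup-norm v_n on the grid

lam = np.polyfit(orders, log_norms, 1)[0]  # slope of log v_n versus n
print(f"{lam:.3f}")  # slope ≈ log 3
```

In this idealized setting the estimate is essentially exact; for learned networks, computing the higher-order input derivatives for small $n$ is the practical bottleneck.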

Classical Connections

This formulation unifies multiple classical concepts:

  • Analytic Functions: For functions such as $(1-x)^{-1}$, $\lambda$ corresponds to the inverse radius of convergence.
  • Bandlimited Signals: For signals with spectral cutoff $\Omega$, $\lambda$ equals $\log \Omega$.

These parallels support the hypothesis that $\lambda$ can effectively describe geometric properties across various function classes, extending to neural networks.
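The analytic-function connection can be made concrete via the classical Cauchy–Hadamard formula, $\limsup_n |a_n|^{1/n} = 1/R$, which links Taylor-coefficient growth to the radius of convergence $R$. A minimal numerical sketch (independent of the paper's exact normalization of $\lambda$) recovers $1/R = 1$ for $f(z) = 1/(1+z^2)$, whose poles at $z = \pm i$ give $R = 1$ around the origin:

```python
import numpy as np

# Taylor coefficients via the Cauchy integral: a_n * r^n is the n-th DFT
# coefficient of f sampled on the circle |z| = r (with r < R so f is analytic).
N, r = 1024, 0.9
z = r * np.exp(2j * np.pi * np.arange(N) / N)
a_scaled = np.fft.fft(1.0 / (1.0 + z ** 2)) / N   # ≈ a_n * r^n

n = np.arange(20, 60)                              # moderately high orders
roots = (np.abs(a_scaled[n]) / r ** n) ** (1.0 / n)  # |a_n|^{1/n}
print(round(np.max(roots), 3))  # → 1.0, i.e. 1/R
```

The contour radius `r` must sit strictly inside the radius of convergence but close enough to it that high-order coefficients stay above floating-point noise.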

Experimental Validation

Neural Networks: Two Moons and MNIST

Empirical validation is performed on analytic functions and on neural-network benchmarks (Two Moons and MNIST). On Two Moons with 30% label noise, models trained with and without regularization showed a systematic difference in $\lambda$ values, revealing late-stage overfitting dynamics (Figure 1).

Figure 1: Training dynamics reveal that unregularized models continue sharpening after generalization plateaus.

In contrast, the MNIST evaluations indicate that a dataset's intrinsic curvature scale determines the optimal $\lambda$ range, with negative $\lambda$ values signaling the relative smoothness suited to MNIST's visual structure.

Practical Effects of Curvature Rate Regularization

Comparison to SAM

CRR was compared against Sharpness-Aware Minimization (SAM): both methods achieve similar test accuracy, but CRR significantly improves confidence calibration by acting directly on input-space geometry rather than on parameter space:

Table: Comparison of Sharpness-Aware Minimization (SAM) and Curvature Rate Regularization (CRR) on MNIST.

The ability to modulate curvature directly through CRR suggests pathways for enhancing not only accuracy but also the reliability of confidence estimates, which is crucial for applications requiring high degrees of trust in model predictions.
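The summary does not reproduce the exact CRR objective, so the following is only a hypothetical sketch of a derivative-based penalty in the same spirit: estimate the first few input-derivative norms of a toy scalar model with central finite differences, fit the growth rate, and penalize it when positive. All names (`model`, `curvature_penalty`) and constants are illustrative assumptions, not the paper's implementation.

```python
import math
import numpy as np

def model(x, w):
    # Toy one-hidden-unit network on scalar inputs (hypothetical stand-in).
    return w[1] * np.tanh(w[0] * x)

def curvature_penalty(xs, w, order=3, h=1e-2):
    """Estimate the growth rate of the first few input-derivative norms
    via central finite differences and penalize positive growth."""
    f = lambda t: model(t, w)
    log_norms = []
    for n in range(1, order + 1):
        # n-th central difference: sum_k (-1)^k C(n,k) f(x + (n/2 - k)h) / h^n
        coeffs = [(-1) ** k * math.comb(n, k) for k in range(n + 1)]
        vals = sum(c * f(xs + (n / 2 - k) * h)
                   for k, c in enumerate(coeffs)) / h ** n
        log_norms.append(np.log(np.max(np.abs(vals)) + 1e-12))
    slope = np.polyfit(np.arange(1, order + 1), log_norms, 1)[0]
    return max(slope, 0.0)  # only penalize sharpening (positive growth rate)

xs = np.linspace(-1.0, 1.0, 201)
sharp = curvature_penalty(xs, (5.0, 1.0))   # sharp model: derivatives grow fast
smooth = curvature_penalty(xs, (0.5, 1.0))  # smooth model: derivatives shrink
print(sharp > 0.0, smooth == 0.0)  # → True True
```

In an actual training loop such a penalty would be added to the task loss with a weight, and the finite differences would be replaced by automatic differentiation.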

Discussion and Implications

The findings indicate that the curvature rate $\lambda$ serves as an effective descriptor of input-space smoothness, distinct from traditional parameter-space flatness metrics. Regularization via CRR adjusts how models respond to input variations, improving calibration without sacrificing accuracy. The broader implication is the distillation of complex Hessian-based metrics into a single interpretable scalar, potentially enabling simpler tuning and monitoring during training.

Conclusion

By establishing $\lambda$ as a scalar measure of curvature and illustrating its practical utility through CRR, the paper effectively bridges analytic principles with neural network dynamics. This scalar measure not only offers insights into generalization and model calibration but also suggests new research directions in robustness and task-dependent tuning strategies.


Authors (1)
