Analytic expression for the NNGP kernel with tanh activation

Derive a closed-form analytic expression for the normalized covariance kernel κ(u) arising in the infinite-width Gaussian process (NNGP) limit of a fully connected neural network with hyperbolic tangent activation σ(x) = tanh(x), namely κ(u) = E[tanh(Z1) tanh(Z2)] / E[tanh(Z)^2] as a function of the correlation u ∈ [-1, 1], where (Z1, Z2) is a jointly Gaussian pair with standard normal marginals and E[Z1 Z2] = u, and Z is a standard Gaussian. Such an expression would give an explicit formula for the depth-1 limiting kernel K1(x,y) = κ(⟨x,y⟩), from which deeper kernels are generated by composition.
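Although the closed form is unknown, the target quantity is straightforward to evaluate numerically. The following sketch (function name and quadrature order are illustrative choices, not from the paper) computes κ(u) with two-dimensional Gauss-Hermite quadrature:

```python
import numpy as np

def kappa_tanh(u, n=64):
    """Numerically evaluate the normalized tanh NNGP kernel
    kappa(u) = E[tanh(Z1) tanh(Z2)] / E[tanh(Z)^2],
    with (Z1, Z2) standard Gaussian with correlation u,
    via Gauss-Hermite quadrature (a stand-in for the missing closed form)."""
    x, w = np.polynomial.hermite.hermgauss(n)  # physicists' nodes/weights
    z = np.sqrt(2.0) * x                       # rescale nodes to N(0,1)
    W = np.outer(w, w) / np.pi                 # 2-D probability weights
    Z1 = z[:, None]
    Z2 = u * z[:, None] + np.sqrt(1.0 - u**2) * z[None, :]
    num = np.sum(W * np.tanh(Z1) * np.tanh(Z2))
    den = np.sum((w / np.sqrt(np.pi)) * np.tanh(z) ** 2)
    return num / den

print(kappa_tanh(1.0))   # ≈ 1.0 (normalization at u = 1)
print(kappa_tanh(0.0))   # ≈ 0.0 (tanh is odd)
```

Because tanh is odd, κ itself is odd in u, which the quadrature reproduces to machine precision; a candidate closed form must share this symmetry.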

Background

In the infinite-width limit, fully connected neural networks with i.i.d. Gaussian weights converge to Gaussian processes whose covariance kernels take the form K(x,y) = κ(⟨x,y⟩) for unit-norm inputs. For many activations (e.g., ReLU and Gaussian), κ(u) is known in closed form, enabling detailed spectral analysis and an exact characterization of depth-dependent behavior.
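For contrast, the ReLU case admits the well-known arc-cosine kernel formula of Cho and Saul, κ(u) = (√(1-u²) + u(π - arccos u))/π after normalization, and a closed form for tanh would play the same role. A sketch verifying the ReLU formula against Monte Carlo (helper names and sample size are my choices):

```python
import numpy as np

def kappa_relu_closed(u):
    """Normalized ReLU kernel: the degree-1 arc-cosine kernel
    of Cho & Saul, scaled so that kappa(1) = 1."""
    return (np.sqrt(1.0 - u**2) + u * (np.pi - np.arccos(u))) / np.pi

def kappa_relu_mc(u, n=1_000_000, seed=0):
    """Monte Carlo estimate of E[relu(Z1) relu(Z2)] / E[relu(Z)^2]
    for standard Gaussians with correlation u."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    y = rng.standard_normal(n)
    z1, z2 = x, u * x + np.sqrt(1.0 - u**2) * y
    relu = lambda t: np.maximum(t, 0.0)
    return (relu(z1) * relu(z2)).mean() / (relu(x) ** 2).mean()

for u in (-0.5, 0.0, 0.7):
    print(kappa_relu_closed(u), kappa_relu_mc(u))
```

The same Monte Carlo scaffold applies verbatim to tanh, which is what makes the missing analytic counterpart of `kappa_relu_closed` the natural open question.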

For the hyperbolic tangent activation σ(x) = tanh(x), the paper establishes that κ′(1) > 1, which suffices to place tanh networks in the high-disorder regime, but it explicitly notes the absence of an analytic formula for κ(u). A closed-form expression would enable precise spectral computations, allow exact moment calculations for the associated spectral law, and potentially refine the complexity analysis presented in the paper.
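The Hermite expansion of tanh gives κ(u) = Σ_k c_k u^k with nonnegative coefficients c_k summing to one (only odd k contribute, since tanh is odd), so κ′(1) = Σ_k k·c_k; a closed form would pin these coefficients down exactly. A numerical sketch of this expansion (function name, truncation order K, and quadrature size are my choices, not the paper's):

```python
import math
import numpy as np

def kappa_series(f=np.tanh, K=15, n=100):
    """Power-series coefficients c_k of the normalized kernel
    kappa(u) = sum_k c_k u^k, from the Hermite expansion of f:
    c_k = b_k^2 / E[f(Z)^2] with b_k the (scaled) Hermite coefficient.
    Purely numerical: a stand-in for the missing closed form."""
    x, w = np.polynomial.hermite.hermgauss(n)
    z, p = np.sqrt(2.0) * x, w / np.sqrt(np.pi)   # N(0,1) nodes/weights
    He = [np.ones_like(z), z]                      # probabilists' Hermite
    for k in range(1, K):
        He.append(z * He[k] - k * He[k - 1])       # He_{k+1} = z He_k - k He_{k-1}
    fz = f(z)
    b2 = np.array([np.sum(p * fz * He[k]) ** 2 / math.factorial(k)
                   for k in range(K + 1)])
    return b2 / np.sum(p * fz**2)                  # normalize by E[f(Z)^2]

c = kappa_series()
print(c[:6])                             # even-index coefficients vanish
print(np.dot(np.arange(c.size), c))      # truncated kappa'(1), exceeds 1
```

The truncated sum Σ k·c_k only underestimates the true κ′(1) (the discarded terms are nonnegative), so the numerical value above already certifies κ′(1) > 1; the open problem is to express the c_k, and hence κ, in closed form.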

References

The associated kernel is not known analytically (to the best of our knowledge) but \Cref{der_sigmoide} shows that the derivative at the origin is greater than one.

Spectral complexity of deep neural networks (arXiv:2405.09541, Lillo et al., 2024), Section 4 (Numerical evidence), "High-disorder case" paragraph