Structured First-Layer Initialization Pre-Training Techniques to Accelerate Training Process Based on $\varepsilon$-Rank

Published 16 Jul 2025 in math.NA and cs.NA | (2507.11962v1)

Abstract: Training deep neural networks for scientific computing remains computationally expensive because diverse feature representations form slowly in the early stages of training. Recent studies identify a staircase phenomenon in training dynamics, in which decreases in the loss are closely correlated with increases in the $\varepsilon$-rank, a measure of the effective number of linearly independent neuron functions. Motivated by this observation, this work proposes a structured first-layer initialization (SFLI) pre-training method that enhances the diversity of neural features at initialization by constructing $\varepsilon$-linearly independent neurons in the input layer. We present systematic initialization schemes compatible with various activation functions and integrate the strategy into multiple neural architectures, including modified multi-layer perceptrons and physics-informed residual adaptive networks. Extensive numerical experiments on function approximation and PDE benchmarks demonstrate that SFLI significantly improves the initial $\varepsilon$-rank, accelerates convergence, mitigates spectral bias, and enhances prediction accuracy. With SFLI, only one line of code needs to be added to conventional algorithms.
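
To make the notion of $\varepsilon$-rank concrete, the following is a minimal sketch rather than the paper's implementation. It assumes that the $\varepsilon$-rank can be estimated by counting the eigenvalues of a sampled Gram matrix of neuron outputs that exceed $\varepsilon$; the abstract does not state the exact definition, so this reading is an assumption. The function names `epsilon_rank` and `first_layer_features`, the threshold `eps = 1e-3`, the `tanh` activation, and the "structured" weights below are all hypothetical illustrations and are not the SFLI scheme itself.

```python
import numpy as np


def epsilon_rank(features, eps):
    """Count Gram-matrix eigenvalues above eps (assumed reading of epsilon-rank).

    features: array of shape (n_samples, n_neurons); column j holds neuron j
    evaluated at the sample points.
    """
    n_samples = features.shape[0]
    # Sampled Gram matrix: entry (i, j) approximates the mean of f_i(x) * f_j(x).
    gram = features.T @ features / n_samples
    eigenvalues = np.linalg.eigvalsh(gram)  # the Gram matrix is symmetric
    return int(np.sum(eigenvalues > eps))


def first_layer_features(x, weights, biases):
    """Outputs of a one-dimensional first layer, tanh(w_j * x + b_j)."""
    return np.tanh(np.outer(x, weights) + biases)


x = np.linspace(-1.0, 1.0, 200)
n_neurons = 16
eps = 1e-3  # hypothetical threshold
rng = np.random.default_rng(0)

# Small random initialization: every neuron is close to an affine function
# of x, so the neurons are nearly linearly dependent.
w_small = rng.uniform(-0.1, 0.1, n_neurons)
b_small = rng.uniform(-0.1, 0.1, n_neurons)
rank_small = epsilon_rank(first_layer_features(x, w_small, b_small), eps)

# Hypothetical structured choice (NOT the paper's scheme): steep neurons
# whose transition points are spread evenly over the domain.
gamma = 10.0
centers = np.linspace(-1.0, 1.0, n_neurons)
w_struct = np.full(n_neurons, gamma)
b_struct = -gamma * centers
rank_struct = epsilon_rank(first_layer_features(x, w_struct, b_struct), eps)

print("small random init:", rank_small)
print("structured init:  ", rank_struct)
```

Under these assumptions, the structured choice yields a larger count than the small random initialization, which illustrates the kind of increase in initial $\varepsilon$-rank the abstract refers to.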