Identify the optimal sampling range γ for random parameters in masking-based approximation
Determine the value or range of γ, the half-edge length of the uniform sampling region used for random initialization, that best balances (i) the number of hidden units required and (ii) the probability of finding a mask that matches a target parameterization within a given tolerance ε. Such a choice would improve the hidden-layer width scaling of the masking-based construction used to prove universal approximation with learned biases.
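The trade-off can be illustrated with a toy calculation. The sketch below is not the construction or the bound from the cited paper; it assumes, purely for illustration, that each hidden unit's d parameters are drawn independently and uniformly from the cube [-γ, γ]^d, that a "match" means landing within ε of the target in every coordinate, and that units are sampled independently. The names `match_probability` and `units_needed` are hypothetical.

```python
import math


def match_probability(target, gamma, eps):
    """Chance that one draw from Uniform([-gamma, gamma]^d) lies within
    eps of `target` in every coordinate (illustrative assumption)."""
    p = 1.0
    for t in target:
        # Overlap of the tolerance window [t - eps, t + eps] with [-gamma, gamma].
        lo = max(t - eps, -gamma)
        hi = min(t + eps, gamma)
        p *= max(hi - lo, 0.0) / (2.0 * gamma)
    return p


def units_needed(p, delta):
    """Smallest n such that at least one of n independent draws matches
    with probability >= 1 - delta, i.e. 1 - (1 - p)^n >= 1 - delta."""
    if p <= 0.0:
        return math.inf  # the target cannot be hit at all
    if p >= 1.0:
        return 1
    return math.ceil(math.log(delta) / math.log1p(-p))


if __name__ == "__main__":
    target = (0.5, -0.3)  # hypothetical target parameters
    eps, delta = 0.1, 0.05
    for gamma in (0.2, 0.6, 1.0, 2.0, 4.0):
        p = match_probability(target, gamma, eps)
        print(f"gamma={gamma:4.1f}  p={p:.5f}  units={units_needed(p, delta)}")
```

Under these assumptions, a γ that is too small leaves the target outside the sampling region, so no mask can match it, while a γ that is too large dilutes the per-unit match probability roughly as (ε/γ)^d and inflates the required width. This is one simple way to see why an intermediate γ might be preferable.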
References
This suggests the existence of some sweet spot in the value γ, which we leave for future work to explore.
— Expressivity of Neural Networks with Random Weights and Learned Biases
(2407.00957 - Williams et al., 2024) in Appendix, Remark 1 on Lemma \ref{lem:supp1}