On the biological plausibility of orthogonal initialisation for solving gradient instability in deep neural networks

Published 27 Oct 2022 in cs.NE, cs.AI, and cs.LG | (2211.08408v1)

Abstract: Initialising the synaptic weights of artificial neural networks (ANNs) with orthogonal matrices is known to alleviate vanishing and exploding gradient problems. A major objection against such initialisation schemes is that they are deemed biologically implausible as they mandate factorization techniques that are difficult to attribute to a neurobiological process. This paper presents two initialisation schemes that allow a network to naturally evolve its weights to form orthogonal matrices, provides theoretical analysis that pre-training orthogonalisation always converges, and empirically confirms that the proposed schemes outperform randomly initialised recurrent and feedforward networks.