Activation Bottleneck: Sigmoidal Neural Networks Cannot Forecast a Straight Line
Abstract: A neural network has an activation bottleneck if one of its hidden layers has a bounded image. We show that networks with an activation bottleneck cannot forecast unbounded sequences such as straight lines, random walks, or any sequence with a trend: The difference between prediction and ground truth becomes arbitrary large, regardless of the training procedure. Widely-used neural network architectures such as LSTM and GRU suffer from this limitation. In our analysis, we characterize activation bottlenecks and explain why they prevent sigmoidal networks from learning unbounded sequences. We experimentally validate our findings and discuss modifications to network architectures which mitigate the effects of activation bottlenecks.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.