Do deeper nonlinear decoders outperform the temporal-attention MLP on TVSD?
Determine whether more complex nonlinear architectures, such as deeper recurrent, convolutional, or transformer-based spatiotemporal networks, can achieve higher decoding performance than the proposed temporal-attention multilayer perceptron (MLP) when mapping 200 ms windows of primate multi-unit activity from the THINGS Ventral Stream Spiking Dataset to semantic image embeddings (e.g., CLIP) for visual decoding.
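To make the baseline being compared against concrete, the sketch below illustrates one plausible form of a temporal-attention MLP decoder: per-bin attention scores collapse a spiking window to a weighted sum over time, and a small MLP head maps the pooled activity into the embedding space. All shapes and dimensions (20 time bins, 256 channels, 512-D embedding) are illustrative assumptions, not the paper's actual configuration, and the weights are random stand-ins for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative shapes (assumptions, not from the paper): a 200 ms window
# binned into T=20 steps over C=256 multi-unit channels, decoded to a
# D=512-dimensional semantic embedding (CLIP-like).
N, T, C, D = 8, 20, 256, 512
X = rng.standard_normal((N, T, C))  # stand-in for binned spike counts

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Temporal attention: a (hypothetical) learned query scores each time bin,
# and the window is collapsed to an attention-weighted sum over time.
q = rng.standard_normal(C) / np.sqrt(C)       # attention query vector
scores = X @ q                                # (N, T) per-bin scores
alpha = softmax(scores, axis=1)               # attention weights over time
pooled = (alpha[:, :, None] * X).sum(axis=1)  # (N, C) pooled activity

# Two-layer MLP head mapping pooled activity to the embedding space.
W1 = rng.standard_normal((C, 128)) / np.sqrt(C)
W2 = rng.standard_normal((128, D)) / np.sqrt(128)
h = np.maximum(pooled @ W1, 0.0)              # ReLU hidden layer
y_hat = h @ W2                                # (N, D) predicted embeddings

print(y_hat.shape)  # (8, 512)
```

A deeper recurrent, convolutional, or transformer decoder would replace the single attention-pooling step with stacked layers that model interactions across both time bins and channels; the open question is whether that added capacity improves decoding on this dataset or merely overfits.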
References
Fourth, we cannot rule out the possibility that more complex, nonlinear architectures might achieve higher decoding performance.
— Simple Models, Rich Representations: Visual Decoding from Primate Intracortical Neural Signals
(2601.11108 - Ciferri et al., 16 Jan 2026) in Limitations subsection, Discussion