Scalable Dictionary Learning for Sparse Inference Under Superposition
Develop scalable dictionary learning algorithms for sparse inference under superposition that learn overcomplete decoders/dictionaries enabling accurate recovery of sparse latent factors from neural activations within the compressed-sensing regime.
References
Our results reframe the SAE failure as a dictionary learning challenge, not an amortisation problem, and point to scalable dictionary learning as the key open problem for sparse inference under superposition.
— Stop Probing, Start Coding: Why Linear Probes and Sparse Autoencoders Fail at Compositional Generalisation
(2603.28744 - Pacela et al., 30 Mar 2026) in Abstract