Determine the optimal spatial reduction strategy for applying CODEC to Vision Transformers
Determine the optimal spatial reduction strategy for Vision Transformers (ViTs) when applying Contribution Decomposition (CODEC). In particular, determine how token-level information should be aggregated into pseudo-channels for contribution computation and sparse autoencoder decomposition, going beyond the current heuristic of treating tokens as spatial positions and summing over them.
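To make the heuristic concrete, the following is a minimal sketch of what "summing over tokens" could look like for a single layer. It is an illustration under stated assumptions, not the paper's implementation: the function name `reduce_tokens`, the `(num_tokens, num_channels)` layout, and the toy values are all hypothetical, and the `"mean"` option is included only to show where an alternative reduction would plug in.

```python
import numpy as np


def reduce_tokens(contrib, strategy="sum"):
    """Collapse token-level contributions into one value per pseudo-channel.

    contrib: array of shape (num_tokens, num_channels); this layout is an
             assumption made for illustration.
    strategy: "sum" mirrors the heuristic described above (tokens treated
              as spatial positions and summed out); "mean" is a
              hypothetical alternative, not taken from the source.
    """
    contrib = np.asarray(contrib, dtype=float)
    if contrib.ndim != 2:
        raise ValueError("expected a 2-D (tokens, channels) array")
    if strategy == "sum":
        return contrib.sum(axis=0)
    if strategy == "mean":
        return contrib.mean(axis=0)
    raise ValueError(f"unknown strategy: {strategy!r}")


# Toy contributions for 3 tokens and 3 channels (made-up numbers).
toy_contrib = np.array([
    [1.0, -2.0, 0.5],
    [0.0, 1.0, 0.5],
    [2.0, 1.0, -1.0],
])

summed = reduce_tokens(toy_contrib, "sum")
averaged = reduce_tokens(toy_contrib, "mean")
```

In this toy case the second and third channels sum to zero even though individual tokens contribute non-zero values, which shows one way a plain sum over tokens can discard token-level structure.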
References
We leave an exploration of the optimal spatial reduction strategy for ViTs to future work.
— Causal Interpretation of Neural Network Computations with Contribution Decomposition
(2603.06557 - Melander et al., 6 Mar 2026) in Supplemental Material, Section “CODEC on ViTs,” concluding sentence of the overview preceding Sparsity/Correlation/Perturbation analyses