Predictive accuracy of Potts models for overlapping-gene sequences far from training distributions
Establish the empirical predictive accuracy of Direct Coupling Analysis Potts models trained on multiple sequence alignments when applied as fitness proxies to de novo designed overlapping-gene sequences that deviate substantially from the training distribution.
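As a rough illustration of the two quantities this question relates, the sketch below scores a sequence with a Potts energy and measures how far it sits from a training alignment. It is a minimal sketch, not the method of the cited paper. It assumes the common DCA sign convention E(s) = -sum_i h_i(s_i) - sum_{i<j} J_ij(s_i, s_j), where lower energy means higher predicted fitness. The names potts_energy and min_hamming_fraction, the two-state alphabet, and all parameter values are hypothetical toy choices.

```python
from itertools import combinations


def potts_energy(seq, h, J):
    """Potts energy under the assumed convention E = -(fields + couplings)."""
    # Sum of single-site fields h[i][state at site i].
    field = sum(h[i][a] for i, a in enumerate(seq))
    # Sum of pairwise couplings over all site pairs i < j.
    coupling = sum(
        J[(i, j)][seq[i]][seq[j]]
        for i, j in combinations(range(len(seq)), 2)
    )
    return -(field + coupling)


def min_hamming_fraction(seq, alignment):
    """Fraction of mismatched sites relative to the closest aligned sequence."""
    closest = min(sum(a != b for a, b in zip(seq, ref)) for ref in alignment)
    return closest / len(seq)


# Toy model: 3 sites, 2 states per site (hypothetical values, not fitted).
h = [[0.0, 1.0], [0.5, 0.0], [0.0, 0.0]]
J = {
    (0, 1): [[0.0, 0.0], [0.0, 2.0]],
    (0, 2): [[0.0, 0.0], [0.0, 0.0]],
    (1, 2): [[1.0, 0.0], [0.0, 0.0]],
}
training_alignment = [(0, 0, 0), (1, 0, 0)]
designed = (1, 1, 0)

energy = potts_energy(designed, h, J)
distance = min_hamming_fraction(designed, training_alignment)
```

Assessing predictive accuracy would then amount to comparing such energies against measured fitness, stratified by distance from the alignment; that experimental comparison is exactly what remains open.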
References
Most notably, while Potts models have been validated experimentally for single protein families, their accuracy as fitness predictors for sequences that deviate substantially from the training distribution -- as our overlapping sequences necessarily do -- remains uncertain.
— The fitness landscape of overlapping genes
(2604.00602 - Kirsch et al., 1 Apr 2026) in Discussion