Large-scale pre-training and zero-shot forecasting for Seg-MoE
Investigate large-scale pre-training strategies for Seg-MoE, the segment-wise Mixture-of-Experts architecture, and evaluate its zero-shot forecasting performance on long-term multivariate time-series benchmarks. The goal is to determine whether pre-trained Seg-MoE models can forecast accurately without any task-specific fine-tuning.
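To make the evaluation side of this direction concrete, the sketch below shows a minimal zero-shot protocol in Python: a frozen forecaster is slid over a held-out multivariate series with no gradient updates, and MSE/MAE are computed against the ground-truth horizon. Since no public Seg-MoE pre-training API exists, a naive last-value forecaster stands in for a pre-trained checkpoint; the function names, window lengths, and synthetic benchmark are illustrative assumptions, not the paper's protocol.

    import numpy as np

    def zero_shot_eval(model_forecast, series, context_len=512, horizon=96):
        """Slide over a multivariate series (T, C), forecast each window
        without fine-tuning, and average MSE/MAE over the horizon."""
        errs_sq, errs_abs = [], []
        for start in range(0, series.shape[0] - context_len - horizon + 1, horizon):
            context = series[start : start + context_len]                  # (L, C)
            target = series[start + context_len : start + context_len + horizon]
            pred = model_forecast(context, horizon)                        # (H, C), no weight updates
            errs_sq.append(np.mean((pred - target) ** 2))
            errs_abs.append(np.mean(np.abs(pred - target)))
        return float(np.mean(errs_sq)), float(np.mean(errs_abs))

    # Stand-in for a pre-trained Seg-MoE checkpoint: repeat the last observed value.
    naive = lambda ctx, h: np.repeat(ctx[-1:], h, axis=0)

    rng = np.random.default_rng(0)
    fake_benchmark = rng.standard_normal((2000, 7)).cumsum(axis=0)         # e.g. 7 channels, ETT-style
    mse, mae = zero_shot_eval(naive, fake_benchmark)
    print(f"zero-shot MSE={mse:.3f}  MAE={mae:.3f}")

In practice, `model_forecast` would wrap a pre-trained Seg-MoE in inference mode, and `fake_benchmark` would be replaced by the standard long-term benchmarks (e.g., the ETT datasets); the key property of the protocol is that no parameters are updated on the target data.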
References
We note that all results are obtained with no pre-training. Investigating large-scale pre-training and zero-shot forecasting for Seg-MoE is a promising direction, but we leave it for future work.
— Seg-MoE: Multi-Resolution Segment-wise Mixture-of-Experts for Time Series Forecasting Transformers
(arXiv:2601.21641, Ortigossa et al., 29 Jan 2026) in Appendix "Additional Experimental Results", Subsection "Additional Baselines"