
Tsururu: A Python-based Time Series Forecasting Strategies Library

Published 19 Sep 2025 in cs.LG (arXiv:2509.15843v1)

Abstract: While current time series research focuses on developing new models, crucial questions of selecting an optimal approach for training such models are underexplored. Tsururu, a Python library introduced in this paper, bridges SoTA research and industry by enabling flexible combinations of global and multivariate approaches and multi-step-ahead forecasting strategies. It also enables seamless integration with various forecasting models. Available at https://github.com/sb-ai-lab/tsururu .

Summary

  • The paper presents a modular Python library that systematically benchmarks combinations of models, forecasting strategies, and preprocessing pipelines for time series data.
  • It implements diverse multi-step forecasting strategies, including Recursive, MIMO, and the hybrid Rec-MIMO, and reports strong results on the ILI dataset.
  • Experiments show that window-based normalization (the LastKnownNormalizer) and global modeling markedly improve forecasting accuracy.

Tsururu: A Modular Python Library for Time Series Forecasting Strategies

Motivation and Context

The Tsururu library addresses a critical gap in the time series forecasting ecosystem: the lack of flexible, modular frameworks that allow practitioners and researchers to systematically ablate and benchmark combinations of models, forecasting strategies, and preprocessing pipelines. While existing libraries such as Darts, sktime, gluonts, and neuralforecast provide robust implementations of state-of-the-art (SoTA) models, they often restrict users to fixed forecasting strategies, offer limited support for exogenous variables, or lack the ability to handle non-aligned and heterogeneous datasets. Tsururu is designed to overcome these limitations by enabling all-with-all combinations of global/multivariate approaches, multi-step-ahead forecasting strategies, and preprocessing methods, thus facilitating both fair benchmarking and practical deployment in industrial settings.

Framework Architecture and Design

Tsururu's architecture is highly modular, supporting the following key components:

  • Multi-series Prediction Approaches: Both global (single model across all series) and multivariate (modeling inter-series dependencies) approaches are supported for all models. Deep learning models further allow for Channel Independence (CI) and Channel Mixing (CM) modes, controlling the degree of inter-series interaction.
  • Forecasting Strategies: The library implements a comprehensive suite of multi-step-ahead forecasting strategies, including:
    • Recursive (Rec): Iterative one-step-ahead predictions.
    • Recursive-MIMO (Rec-MIMO): Hybrid approach generating multiple steps per iteration.
    • Direct (Dir): Separate models for each forecast horizon step.
    • MIMO: Single model predicts the entire forecast horizon in one shot.
    • FlatWideMIMO (FWM): A single model predicts one horizon step at a time, with the target step index supplied as an input feature.
  • Pipeline and Data Transformations: Tsururu's pipeline supports sequential transformations, including:
    • Series-to-Series: Preprocessing and feature generation.
    • Series-to-Features: Construction of wide matrices with lagged features.
    • Features-to-Features: Window-based processing, notably the LastKnownNormalizer (LKN), which normalizes values by the most recent observation in history.
    • Separate transformations for features and targets are supported, enhancing flexibility.
  • Model Integration: The library includes both classical ML (e.g., CatBoost, SketchBoost) and DL models (DLinear, CycleNet, TimesNet, PatchTST, GPT4TS), with a unified interface for extending to new models.
  • Training and Validation: The Trainer module manages training, cross-validation, early stopping, and robust evaluation via backtesting and rolling validation.
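To make the strategy distinction concrete, here is a minimal sketch of the Recursive and MIMO ideas using a toy model. The function names and the stand-in model are illustrative assumptions, not Tsururu's actual API:

```python
import numpy as np

# Toy "model": predicts the mean of its input window. It stands in for any
# one-step or multi-output regressor; names here are illustrative only.
def one_step_model(window):
    return window.mean()

def recursive_forecast(history, horizon):
    """Rec strategy: iterate one-step-ahead predictions, feeding each
    prediction back into the input window."""
    window = list(history)
    preds = []
    for _ in range(horizon):
        yhat = one_step_model(np.array(window))
        preds.append(yhat)
        window = window[1:] + [yhat]  # slide the window forward by one step
    return np.array(preds)

def mimo_forecast(history, horizon):
    """MIMO strategy: a single model emits the whole horizon in one shot.
    The toy multi-output model simply repeats the window mean."""
    return np.full(horizon, one_step_model(np.asarray(history)))

history = np.arange(1.0, 97.0)  # history length 96, matching the paper's setup
rec = recursive_forecast(history, horizon=24)
mimo = mimo_forecast(history, horizon=24)
print(rec.shape, mimo.shape)  # (24,) (24,)
```

Rec-MIMO interpolates between the two: the model emits a block of several steps per iteration and the window slides by that block size, trading off error accumulation against model complexity.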

Experimental Evaluation

Setup

Experiments were conducted on the ILI dataset, a challenging multivariate time series with strong periodicity and nontrivial temporal structure. The evaluation systematically ablated combinations of models (SketchBoost, DLinear, PatchTST, GPT4TS, CycleNet), approaches (global, multivariate CI/CM), and forecasting strategies (Recursive, MIMO, FlatWideMIMO). Hyperparameters were fixed to standard values from the literature, with a cosine learning rate scheduler, batch size 32, history length 96, and forecast horizon 24.
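The ablation grid described above can be summarized as follows; the dictionary keys are a hypothetical summary of the fixed setup, not Tsururu's configuration schema:

```python
# Hypothetical summary of the fixed experimental setup; illustrative names only.
experiment = {
    "dataset": "ILI",
    "models": ["SketchBoost", "DLinear", "PatchTST", "GPT4TS", "CycleNet"],
    "approaches": ["global", "multivariate-CI", "multivariate-CM"],
    "strategies": ["Recursive", "MIMO", "FlatWideMIMO"],
    "lr_scheduler": "cosine",
    "batch_size": 32,
    "history_length": 96,
    "horizon": 24,
}

# The ablation covers the cross-product of models x approaches x strategies:
n_runs = (len(experiment["models"])
          * len(experiment["approaches"])
          * len(experiment["strategies"]))
print(n_runs)  # 45 combinations per preprocessing/feature setting
```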

Key Findings

  • Preprocessing: The LastKnownNormalizer (LKN), rarely used in existing libraries, significantly outperformed standard normalization techniques, as evidenced by critical difference diagrams and ablation studies. This result highlights the importance of window-based normalization for handling local distributional shifts.
  • Feature Engineering: Inclusion of ID features improved model accuracy, while date features degraded performance for both neural networks and GBDT models.
  • Approach and Strategy: The global approach consistently outperformed the multivariate approach across all models. For neural networks, the MIMO strategy achieved the best median MAE, while for GBDT models, the Rec-MIMO (MH=6) strategy was optimal. Notably, FlatWideMIMO combined with boosting models was highly competitive, challenging the prevailing assumption that GBDT models are suboptimal for multi-step forecasting.
  • Model Performance: GPT4TS achieved the lowest test MAE with the Rec-MIMO strategy, while MIMO was best on the validation set. PatchTST and DLinear also performed strongly, but the diversity of top-ranked model-strategy combinations underscores the necessity of systematic ablation.
  • Reproducibility: Tsururu's implementations closely matched or improved upon published results for all models on the ILI dataset, demonstrating high fidelity and reliability.
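The LastKnownNormalizer idea can be sketched as follows, assuming the reading that each lag window is divided by its most recent observation so the model sees values relative to the last known point (the exact transform in Tsururu may differ; function names are illustrative):

```python
import numpy as np

# Minimal sketch of window-based "last known" normalization. We divide the
# window by its final value and keep that divisor to invert the forecast.
def lkn_transform(window):
    last = window[-1]
    return window / last, last

def lkn_inverse(scaled_preds, last):
    return scaled_preds * last

window = np.array([100.0, 110.0, 121.0, 133.1])
scaled, last = lkn_transform(window)
# scaled[-1] is exactly 1.0 by construction, so every window ends at the
# same anchor value regardless of the series' local level.

# After forecasting in the scaled space, map predictions back:
preds_scaled = np.array([1.1, 1.21])
restored = lkn_inverse(preds_scaled, last)
```

Anchoring each window to its last observation is what makes the transform robust to local distributional shifts: the model never has to learn the absolute level of the series, only its recent relative movement.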

Practical and Theoretical Implications

Tsururu's design enables practitioners to explore the full combinatorial space of models, strategies, and preprocessing methods, which is essential for both fair benchmarking and real-world deployment. The empirical results challenge several common assumptions in the field:

  • Non-default strategies and preprocessing can yield substantial gains over standard pipelines, particularly in industrial settings with heterogeneous, non-aligned, or short time series.
  • Global modeling approaches remain highly competitive even in the presence of strong inter-series dependencies, especially when combined with appropriate normalization and feature engineering.
  • Hybrid and rarely used forecasting strategies (e.g., Rec-MIMO, FlatWideMIMO) are viable alternatives to standard MIMO or recursive approaches, especially for boosting models.

Theoretically, the results suggest that the choice of forecasting strategy and preprocessing pipeline can be as important as model architecture, and that systematic ablation is necessary to uncover optimal configurations.

Future Directions

The authors propose several extensions, including:

  • Incorporation of additional forecasting strategies (Rectify, DirRec).
  • Development of a universal neural network constructor.
  • Support for mixed discretization (daily, monthly, weekly) within multivariate datasets.
  • Integration of patching techniques and further expansion of preprocessing options.

These directions will further enhance the library's utility for both research and industrial applications.

Conclusion

Tsururu provides a comprehensive, modular framework for time series forecasting, enabling systematic exploration of model, strategy, and preprocessing combinations. Its empirical results demonstrate the value of rarely used strategies and normalization techniques, and its architecture is well-suited for both benchmarking and deployment in complex, real-world scenarios. The library's extensibility and reproducibility make it a valuable tool for advancing both the science and practice of time series forecasting.
