Learning Recursive Multi-Scale Representations for Irregular Multivariate Time Series Forecasting

Published 25 Feb 2026 in cs.LG | (2602.21498v1)

Abstract: Irregular Multivariate Time Series (IMTS) are characterized by uneven intervals between consecutive timestamps, which carry sampling pattern information valuable and informative for learning temporal and variable dependencies. In addition, IMTS often exhibit diverse dependencies across multiple time scales. However, many existing multi-scale IMTS methods use resampling to obtain the coarse series, which can alter the original timestamps and disrupt the sampling pattern information. To address the challenge, we propose ReIMTS, a Recursive multi-scale modeling approach for Irregular Multivariate Time Series forecasting. Instead of resampling, ReIMTS keeps timestamps unchanged and recursively splits each sample into subsamples with progressively shorter time periods. Based on the original sampling timestamps in these long-to-short subsamples, an irregularity-aware representation fusion mechanism is proposed to capture global-to-local dependencies for accurate forecasting. Extensive experiments demonstrate an average performance improvement of 27.1% in the forecasting task across different models and real-world datasets. Our code is available at https://github.com/Ladbaby/PyOmniTS.


Summary

The paper introduces ReIMTS, a framework that recursively splits time series into domain-informed periods and fuses local and global representations using mask-aware attention.
It leverages backbone-agnostic integration to enhance existing models, with a reported average forecasting improvement of 27.1% across different backbone models and real-world datasets.
The approach preserves the original sampling patterns at moderate computational cost; adaptive timescale selection and causal discovery are identified as directions for future work.

Recursive Multi-Scale Representation Learning for Irregular Multivariate Time Series Forecasting

Introduction

Irregular Multivariate Time Series (IMTS) are common in domains such as healthcare, climate science, and biomechanics, where sampling intervals are non-uniform and observation patterns differ from one entity to another. Accurate forecasting in this setting requires modeling both the temporal irregularity and the multi-scale dependencies of the underlying phenomena. Existing multi-scale IMTS forecasting methods mostly rely on resampling or patching, which can alter the original sampling patterns and obscure the information they carry. The paper "Learning Recursive Multi-Scale Representations for Irregular Multivariate Time Series Forecasting" (2602.21498) introduces ReIMTS, a recursive multi-scale modeling framework that addresses these limitations by learning directly from the original sampling patterns through a hierarchy of time-period-based splits that leave the timestamps unchanged.

ReIMTS Architecture and Methodology

ReIMTS implements recursive decomposition of each IMTS sample into subsamples defined by progressively shorter, domain-informed time periods without altering the original timestamps. At each scale, a backbone (e.g., GNN, RNN, set-based, or transformer-based IMTS encoder) infers latent representations from subsamples. These representations are recursively fused by an irregularity-aware mechanism, ensuring global information from coarser temporal views is adaptively incorporated into local representations at finer scales, with explicit handling of missingness and observation masks.
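
The recursive decomposition described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Node` class, the `recursive_split` function, the (timestamp, variable index, value) tuple layout, and the 24-hour and 6-hour periods are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical layout: (timestamp in hours, variable index, observed value).
Observation = Tuple[float, int, float]


@dataclass
class Node:
    start: float
    end: float
    observations: List[Observation]
    children: List["Node"] = field(default_factory=list)


def recursive_split(observations, start, end, periods):
    """Build a tree of subsamples covering the half-open window [start, end).

    `periods` lists the window length for each finer scale, longest first.
    Windows are defined by elapsed time, never by observation count, and
    the timestamps themselves are left untouched.
    """
    inside = [o for o in observations if start <= o[0] < end]
    node = Node(start, end, inside)
    if not periods:
        return node
    period, finer = periods[0], periods[1:]
    left = start
    while left < end:
        right = min(left + period, end)
        node.children.append(recursive_split(inside, left, right, finer))
        left = right
    return node


obs = [(0.5, 0, 1.0), (0.7, 1, 2.0), (13.2, 0, 1.5), (40.0, 1, 0.3), (47.9, 0, 2.2)]
tree = recursive_split(obs, 0.0, 48.0, [24.0, 6.0])
print([len(c.observations) for c in tree.children])              # [3, 2]
print([len(g.observations) for g in tree.children[0].children])  # [2, 0, 1, 0]
```

Because windows follow elapsed time, sparse and empty windows survive the split (the zeros in the second printout), so the uneven sampling density stays visible at every scale.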

The architecture is illustrated in the following figure:

Figure 1: Architecture of ReIMTS with recursive splitting, backbone-based encoding at each scale, and hierarchical fusion of multi-scale representations for forecasting.

Key methodological components include:

Recursive Time Period Splitting: Original IMTS samples are hierarchically decomposed along time, producing a tree of subsample sets at increasingly fine scales. Splits are defined by real-world time periods, never by observation counts, which preserves the temporal spacing and sampling densities across all variables.
Irregularity-Aware Representation Fusion: At each scale, local encodings are fused with global representations from the preceding (coarser) scale by an attention mechanism, where the fusion score is conditioned on the mask that distinguishes observed (true) values from padded ones. The mask keeps padded positions from contributing to the fused representation, so both densely and sparsely observed windows are used without discarding the sampling pattern.
Backbone Agnostic Integration: The framework is compatible with a broad spectrum of IMTS backbones, extracting temporal, variable, or observation-centric representations as appropriate. This allows extension of existing SOTA models by simply wrapping them with the ReIMTS decomposition/fusion protocol.
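
To make the mask-conditioned fusion concrete, here is a hedged sketch using generic masked scaled dot-product attention with a residual connection. The exact fusion formula is not given in this summary, so the function names `masked_softmax` and `fuse`, the use of unprojected dot-product scores, and the residual addition are all assumptions made for illustration.

```python
import math
from typing import List

Vector = List[float]


def masked_softmax(scores: List[float], mask: List[int]) -> List[float]:
    """Softmax over positions with mask == 1; padded positions get weight 0."""
    kept = [s for s, m in zip(scores, mask) if m]
    if not kept:
        return [0.0] * len(scores)
    peak = max(kept)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(local: List[Vector], local_mask: List[int],
         coarse: List[Vector], coarse_mask: List[int]) -> List[Vector]:
    """Add an attention-weighted summary of coarse-scale vectors to each
    observed local vector. Padded local positions are returned as zeros."""
    dim = len(local[0])
    fused = []
    for query, observed in zip(local, local_mask):
        if not observed:
            fused.append([0.0] * dim)
            continue
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in coarse]
        weights = masked_softmax(scores, coarse_mask)
        context = [sum(w * key[i] for w, key in zip(weights, coarse))
                   for i in range(dim)]
        fused.append([q + c for q, c in zip(query, context)])
    return fused


local = [[1.0, 0.0], [0.0, 0.0]]       # second position is padding
coarse = [[0.0, 2.0], [100.0, 100.0]]  # second position is padding
print(fuse(local, [1, 0], coarse, [1, 0]))  # [[1.0, 2.0], [0.0, 0.0]]
```

The padded coarse vector holds large values on purpose: since its mask entry is 0, it receives zero attention weight and has no effect on the result.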

A comparative schematic against resampling and patch-based methods is shown here:

Figure 2: Comparative overview of ReIMTS and representative multi-scale methods, highlighting its preservation of sampling patterns and universal backbone compatibility.

Experimental Results

ReIMTS is evaluated on five real-world IMTS datasets (MIMIC-III, MIMIC-IV, PhysioNet'12, Human Activity, USHCN), which span diverse domains and differ in maximal sequence length and variable count. It is benchmarked against 26 baselines that cover both irregular and regular time series forecasting models, including multi-scale and single-scale methods as well as domain-specific and backbone-agnostic SOTA approaches.

Key findings:

Sustained and Significant Improvement: ReIMTS delivers an average performance improvement of 27.1% in forecasting across different backbone models and real-world datasets, boosting legacy backbones (e.g., mTAN, GRU-D) as well as recent architectures (e.g., PrimeNet, GraFITi, TimeCHEAT).
Computationally Scalable: The memory and training time overhead imposed by multi-scale processing is moderate; ReIMTS is more efficient than other multi-scale IMTS methods such as Warpformer, Hi-Patch, and HD-TTS. When equipped with lightweight backbones, the method achieves SOTA accuracy with low resource usage.
Ablation Analysis: Both the recursive splitting by time periods (not observation count) and the irregularity-aware fusion layer are indispensable. Each variant that replaces either component with a naive strategy (uniform split, simple addition instead of mask-aware attention) substantially degrades accuracy.
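
As a reading aid for the headline percentage, an average relative improvement is commonly computed as the mean percentage reduction in an error metric such as MSE over (backbone, dataset) pairs. The averaging scheme actually used in the paper is not stated in this summary, so the sketch below is an assumption, and the MSE values are made-up placeholders rather than reported results.

```python
def relative_improvement(baseline_mse: float, new_mse: float) -> float:
    """Percentage reduction in error relative to the baseline."""
    return 100.0 * (baseline_mse - new_mse) / baseline_mse


def mean_relative_improvement(pairs) -> float:
    """Average the per-pair percentage reductions (unweighted mean)."""
    return sum(relative_improvement(b, n) for b, n in pairs) / len(pairs)


# Placeholder (baseline MSE, wrapped-model MSE) pairs, NOT values from the paper.
placeholder_pairs = [(0.50, 0.40), (0.80, 0.60), (0.20, 0.15)]
print(round(mean_relative_improvement(placeholder_pairs), 1))  # 23.3
```

An unweighted mean of per-pair percentages treats every backbone and dataset equally, so a large gain on a weak backbone can raise the average noticeably.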

Efficiency Analysis

Figure 3: Efficiency analysis on MIMIC-IV dataset, showing ReIMTS (with GraFITi backbone) offers better trade-off in MSE, GPU memory usage, and training time compared to other multi-scale approaches.

Across datasets, ReIMTS demonstrates that the recursive multi-scale design does not linearly increase memory or time cost, thanks to parallelizable, mask-aware batch operations and efficient fusion. This property is essential for scalability on large clinical or environmental datasets where variable counts and observation counts per series are high.
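
The mask-aware batching mentioned above can be illustrated with a small sketch. The helper names `pad_batch` and `masked_mean` are hypothetical, and plain Python lists stand in for tensors. In a tensor library the same padded layout lets all subsamples of a scale be processed in one batched operation, though how the released code does this is not described here.

```python
def pad_batch(subsamples, pad_value=0.0):
    """Pad variable-length subsamples to a common width and return the
    padded values together with a 0/1 mask marking real observations."""
    width = max((len(s) for s in subsamples), default=0)
    values = [list(s) + [pad_value] * (width - len(s)) for s in subsamples]
    mask = [[1] * len(s) + [0] * (width - len(s)) for s in subsamples]
    return values, mask


def masked_mean(values, mask):
    """Mean over observed entries only; an empty subsample yields 0.0."""
    means = []
    for row, row_mask in zip(values, mask):
        count = sum(row_mask)
        total = sum(v for v, m in zip(row, row_mask) if m)
        means.append(total / count if count else 0.0)
    return means


# Three time windows with 2, 0, and 3 observations respectively.
values, mask = pad_batch([[1.0, 3.0], [], [2.0, 4.0, 6.0]])
print(mask)                       # [[1, 1, 0], [0, 0, 0], [1, 1, 1]]
print(masked_mean(values, mask))  # [2.0, 0.0, 4.0]
```

Without the mask, the padding zeros would pull the first window's mean down to 4/3; the mask is what keeps sparse windows from being distorted by padding.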

Qualitative Analysis

Figure 4: t-SNE visualization demonstrates improved geometric separation of class clusters in learned representations when multi-scale decomposition is applied (right) versus the single-scale backbone (left).

Qualitative visualizations of the learned representations show sharper class boundaries under the recursive multi-scale approach, suggesting that it preserves more of the information useful for downstream classification. This is consistent with the forecasting gains reported in the results tables.

Implications and Future Directions

Explicitly modeling sampling pattern information at all resolutions improves predictive accuracy and may also aid interpretability. Conceptual strengths include robustness to missingness, adaptability to domain-specific timescales, and backbone-agnostic extension. Practically, ReIMTS is well suited to data with heterogeneous observation designs, such as intensive care monitoring, large-scale climatology, and longitudinal cohort studies. Its plug-and-play nature eases integration into existing IMTS pipelines.

Several research directions are immediate:

Extension to ODE and diffusion-based backbones: While compatible with encoder-decoder backbones, further theoretical development is required for seamless integration with continuous-time generative models or latent diffusion architectures.
Autonomous Timescale Selection: Current splits are set by domain knowledge; adaptive, data-driven determination of optimal time resolutions (e.g., via differentiable splitting policies) is an open question.
Causal Representation Learning: The clear disentanglement of global-local representations at each scale positions ReIMTS as a candidate for causal discovery under temporal and sampling irregularity.

Conclusion

ReIMTS demonstrates that recursively learning multi-scale representations while preserving sampling patterns improves IMTS forecasting accuracy. Its empirical gains, moderate computational overhead, and compatibility with a wide range of backbones make it a substantive contribution to time series modeling under real-world constraints. Combining it with a broader class of backbone models and selecting the temporal hierarchy adaptively are natural next steps.
