Non-Stationary Markov Process

Updated 16 January 2026
  • Non-stationary Markov processes are stochastic models with time-dependent probabilities that accurately capture systems with changing dynamics and external influences.
  • They are formulated in both discrete and continuous frameworks, using time-indexed transition matrices or generator matrices to model abrupt shifts, periodicity, and smooth drifts.
  • Their robust modeling underpins applications in epidemiology, queueing theory, reinforcement learning, and statistical physics, driving novel computational and control schemes.

A non-stationary Markov process is a stochastic process in which the Markov property holds, but the transition probabilities or rates depend explicitly on time or on other evolving external variables. Unlike classical stationary Markov processes, where the law of evolution is time-homogeneous, non-stationary variants capture systems with dynamically changing environments, parameters, or underlying structures. This generalization is fundamental in accurately describing complex real-world phenomena that exhibit temporal heterogeneity, abrupt regime shifts, or gradual evolution, and underpins a large spectrum of modeling and algorithmic approaches in statistical physics, epidemiology, queueing, control, and reinforcement learning.

1. Mathematical Formulations

Discrete and Continuous-Time Models

A discrete-time Markov decision process (MDP) with time-dependent transitions is formally defined as the tuple $(\mathcal{S}, \mathcal{A}, \{P_t\}, \{R_t\}, \gamma)$, where for each $t \ge 0$,

$$P_t(s' \mid s, a) = \Pr\bigl[S_{t+1} = s' \mid S_t = s,\, A_t = a\bigr]$$

and $R_t(s, a)$ is the immediate reward function (Keplinger et al., 16 Jan 2025). The core characteristic is that either $P_t$ or $R_t$ (or both) depend on $t$ or on an exogenous process $\theta_t$.
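As a concrete illustration, one step of such a process can be sampled from a time-indexed kernel; the drifting kernel `kernel(t)` below is a hypothetical example, not taken from the cited papers:

```python
import numpy as np

def sample_next_state(P_t, s, a, rng):
    """Draw S_{t+1} ~ P_t(. | s, a) from a time-indexed kernel.

    P_t has shape (|S|, |A|, |S|): P_t[s, a, s'] = Pr[S_{t+1}=s' | S_t=s, A_t=a].
    """
    return rng.choice(P_t.shape[-1], p=P_t[s, a])

def kernel(t):
    """Hypothetical drifting kernel: the 'stay' probability decays with t."""
    stay = 0.9 / (1.0 + 0.1 * t)
    P = np.empty((2, 1, 2))
    P[0, 0] = [stay, 1.0 - stay]
    P[1, 0] = [1.0 - stay, stay]
    return P

rng = np.random.default_rng(0)
s = 0
for t in range(5):
    s = sample_next_state(kernel(t), s, a=0, rng=rng)
```

The only change relative to a stationary chain is that the kernel is re-evaluated at every step.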

In continuous time, a non-stationary Markov jump process is characterized by a (possibly time-dependent) generator (rate) matrix $Q(t)$ with

$$\frac{d}{dt}P(t_0, t) = P(t_0, t)\,Q(t), \qquad P(t_0, t_0) = I,$$

where each off-diagonal entry satisfies $q_{ij}(t) \ge 0$ and $q_{ii}(t) = -\sum_{j \ne i} q_{ij}(t)$ (Tiomela et al., 22 May 2025, Fischer et al., 9 Jun 2025).
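The matrix ODE above can be integrated numerically; a minimal sketch, assuming a hypothetical two-state generator with a periodically modulated rate (explicit Euler, so a proper stiff ODE solver may be preferable in practice):

```python
import numpy as np

def transition_matrix(Q, t0, t1, n_steps=2000):
    """Integrate d/dt P(t0,t) = P(t0,t) Q(t) with P(t0,t0) = I by explicit
    Euler steps; a sketch only -- stiff problems want a real ODE solver."""
    P = np.eye(Q(t0).shape[0])
    dt = (t1 - t0) / n_steps
    for k in range(n_steps):
        t = t0 + k * dt
        P = P + dt * (P @ Q(t))
    return P

def Q(t):
    """Hypothetical 2-state generator with a periodically modulated rate."""
    lam = 1.0 + 0.5 * np.sin(t)   # q_01(t); each row of Q(t) sums to zero
    mu = 2.0
    return np.array([[-lam,  lam],
                     [  mu,  -mu]])

P = transition_matrix(Q, 0.0, 1.0)   # rows remain probability distributions
```

Because each row of $Q(t)$ sums to zero, the Euler updates preserve the row sums of $P(t_0, t)$ exactly.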

Examples of Explicit Non-Stationarity

  • Time-dependent transition parameters: $P_{ij}(t) = f_{ij}(t)$ with $\sum_j f_{ij}(t) = 1$ for each $t$ (Tiomela et al., 22 May 2025).
  • Exogenous parameter-driven transitions: $P_t(s' \mid s, a) = P(s' \mid s, a, \theta_t)$ with $\theta_t$ a stochastic process, e.g., a Markov chain or random walk (Keplinger et al., 16 Jan 2025).

2. Modeling Frameworks and Classes

Markov Chains and Processes

Various classes arise from the specific structure of non-stationarity:

| Class | Defining Feature | Key Reference |
|---|---|---|
| Piecewise-stationary | Blocks of constant $P_t$, switching at change points | (Keplinger et al., 16 Jan 2025) |
| Smoothly time-varying | $P_t$ drifts continuously with $t$ | (Keplinger et al., 16 Jan 2025) |
| Periodic (cyclostationary) | $P_t = P_{t+T}$ for some period $T$ | (Fischer et al., 9 Jun 2025) |
| Exogenous parameter-driven | $\theta_t$ stochastic, $P_t = P(\cdot, \theta_t)$ | (Keplinger et al., 16 Jan 2025) |
| Path-dependent Markovian | Transition rates depend on both $n$ and $t$ | (Barraza et al., 5 Mar 2025) |
| Copula-based non-stationary | Markov property encoded via time-varying copulas | (Gobbi et al., 2017) |
| Switching MDP (SNS-MDP) | Underlying unobserved mode $\theta_t$ follows a Markov chain | (Amiri et al., 24 Mar 2025) |

Non-stationarity may be abrupt (stepwise), continuous (drift), or periodic, with modeling choices depending on the dynamics under study (Keplinger et al., 16 Jan 2025).

3. Analytical Results and Computational Schemes

Chapman–Kolmogorov and Balance Systems

Non-stationary Markov processes obey time-dependent forward equations. For discrete time, balanced systems relate compartment counts via increments (e.g., $\Delta_{|S|} = \Delta_{10} - \Delta_1$, etc.) (Tiomela et al., 22 May 2025). In continuous time, the Kolmogorov forward equation generalizes as

$$\frac{d\pi_i(t)}{dt} = \sum_{j \ne i} \pi_j(t)\,q_{ji}(t) - \pi_i(t)\sum_{j \ne i} q_{ij}(t),$$

or, in controlled settings, with explicit policy dependence (Tiomela et al., 22 May 2025, Fischer et al., 9 Jun 2025).

Limit Theorems and Long-run Behavior

Laws of Large Numbers (LLN) and Central Limit Theorems (CLT) have been established for non-stationary Markov jump processes:

  • Under mild regularity, the cumulative reward $R(t)$ satisfies $R(t)/\mathbb{E}[R(t)] \to 1$ almost surely as $t \to \infty$.
  • If transitions and rewards are periodic in $t$, then the time-averaged reward converges to the periodic mean, and normalized fluctuations are asymptotically normal (Fischer et al., 9 Jun 2025).
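The periodic-mean convergence in the second bullet can be checked numerically on a toy period-2 chain; the matrices `P0`, `P1` and the reward below are illustrative, not from the cited work:

```python
import numpy as np

# Period-2 chain: even steps use P0, odd steps use P1 (illustrative matrices).
P0 = np.array([[0.9, 0.1],
               [0.2, 0.8]])
P1 = np.array([[0.7, 0.3],
               [0.4, 0.6]])
r = np.array([0.0, 1.0])            # reward = indicator of state 1

# Analytical periodic mean: the period skeleton C = P0 @ P1 governs the law
# at even times; one extra P0 step gives the odd-time law.
C = P0 @ P1
w, v = np.linalg.eig(C.T)
pi_even = np.real(v[:, np.argmin(np.abs(w - 1.0))])
pi_even = pi_even / pi_even.sum()
pi_odd = pi_even @ P0
periodic_mean = 0.5 * (pi_even @ r + pi_odd @ r)

# Monte Carlo time average along one long trajectory.
rng = np.random.default_rng(1)
s, total, N = 0, 0.0, 100_000
for t in range(N):
    total += r[s]
    s = rng.choice(2, p=(P0 if t % 2 == 0 else P1)[s])
time_avg = total / N
```

The simulated time average agrees with the analytically computed periodic mean to within Monte Carlo error.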

For certain classes, explicit limit cycles or absorbing structures can arise, as in time-inhomogeneous chains with feedback or reinforcement (Awoniyi, 2023).

Performance Approximations

For slowly varying $P_t$, rigorous first-order corrections to stationary performance measures (e.g., discounted rewards, hitting times, expected occupation times) are derived via linear systems with perturbed matrices, providing $O(\epsilon)$-accurate approximations at a computational cost identical to the stationary case (Zheng et al., 2018).

4. Stochastic Diffusion, Anomalous Dynamics, and Memory

Non-Stationary Anomalous Diffusion

Non-stationary Markovian replication processes (NMRP) on lattices, with time-dependent replication probability $p(t)$, yield generalized telegrapher equations

$$\frac{\partial \rho}{\partial t} + \mathcal{R}(t)\frac{\partial^2 \rho}{\partial t^2} = \mathcal{D}(t)\frac{\partial^2 \rho}{\partial x^2},$$

with $\mathcal{D}(t)$ and $\mathcal{R}(t)$ determined by $p(t)$ (Choi et al., 2017). Classification is governed by the functional form of $p(t)$ (alternating, power-law, or marginal), producing a spectrum of diffusion behaviors: sub-, super-, or ultra-slow diffusion.

A further generalization introduces both state and time dependence in the transition rates:

$$\lambda_n(t) = \frac{\beta + \gamma n}{1 + \rho t}.$$

The dynamics balance a contagion term against time-damping, with a phase diagram (sub-, superdiffusive, ballistic, hyperballistic) indexed by $H = \gamma/\rho$ (Barraza et al., 5 Mar 2025). Non-stationarity is necessary for all regimes except the ballistic case.
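Because the integrated rate $\int \lambda_n(s)\,ds$ has a closed form between jumps, such a counting process can be sampled exactly by inversion; a sketch with illustrative parameter values:

```python
import numpy as np

def simulate_counts(beta, gamma, rho, t_max, rng):
    """Pure-birth process with rate lambda_n(t) = (beta + gamma*n)/(1 + rho*t),
    sampled exactly by inverting the integrated rate between jumps."""
    t, n, times = 0.0, 0, [0.0]
    while True:
        lam = beta + gamma * n
        E = rng.exponential(1.0)
        # Solve int_t^T lam / (1 + rho*s) ds = E for the next jump time T:
        # (lam/rho) * log((1 + rho*T) / (1 + rho*t)) = E.
        t = ((1.0 + rho * t) * np.exp(rho * E / lam) - 1.0) / rho
        if t > t_max:
            return n, times
        n += 1
        times.append(t)

rng = np.random.default_rng(2)
n, times = simulate_counts(beta=1.0, gamma=0.5, rho=0.2, t_max=50.0, rng=rng)
```

No discretization is involved, so the sampled trajectory is exact in distribution under the stated rate.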

Deviations from Gaussianity and violations of the classical CLT arise generically due to non-stationarity and autocorrelation (Barraza et al., 5 Mar 2025, Choi et al., 2017).

5. Algorithmic and Control Implications

Reinforcement Learning and Decision Processes

Non-stationarity in MDPs fundamentally impacts both policy structure and algorithm design:

  • Time-indexed value functions and Bellman recursions: $V_t^*(s)$ and $Q_t^*(s,a)$ are recomputed for each $t$, requiring time-aware dynamic programming or Q-learning (Chen et al., 17 Nov 2025, Tiomela et al., 22 May 2025, Keplinger et al., 16 Jan 2025).
  • Switching environments: SNS-MDPs with latent Markovian mode switches retain TD-learning and Q-learning convergence due to ergodicity of the joint $(\theta_t, s_t)$ process (Amiri et al., 24 Mar 2025).
  • Delayed reinforcement: in delayed MDPs, optimal policies must be non-stationary Markov (i.e., $a_t = d_t(s_t)$, not time-invariant), as stationary Markov policies can be strictly sub-optimal when the delay $m > 0$ (Derman et al., 2021).
  • Algorithmic approaches: ASP(RL), hybridization with logical solvers, and smooth forgetting via exponential weights in value estimation support adaptation to evolving dynamics (Keplinger et al., 16 Jan 2025, Touati et al., 2020, Ferreira et al., 2017).
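The time-indexed Bellman recursion in the first bullet amounts to backward induction over a finite horizon; a minimal sketch with hypothetical drifting kernels:

```python
import numpy as np

def backward_induction(P, R, gamma):
    """Finite-horizon dynamic programming with time-indexed P_t, R_t.

    P[t] has shape (S, A, S'), R[t] has shape (S, A); returns V_0 and one
    greedy decision rule per time step (the policy is non-stationary)."""
    n_states = P[0].shape[0]
    V = np.zeros(n_states)               # terminal value V_T = 0
    policy = []
    for t in reversed(range(len(P))):
        Q = R[t] + gamma * (P[t] @ V)    # Bellman backup with P_t, R_t
        policy.append(Q.argmax(axis=1))  # greedy rule d_t(s)
        V = Q.max(axis=1)                # V_t(s)
    policy.reverse()
    return V, policy

# Illustrative instance: random kernels that drift with t.
T, S, A = 3, 2, 2
rng = np.random.default_rng(0)
P = []
for t in range(T):
    M = rng.random((S, A, S)) + 0.1 * t
    P.append(M / M.sum(axis=-1, keepdims=True))
R = [rng.random((S, A)) for _ in range(T)]
V0, policy = backward_induction(P, R, gamma=0.95)
```

The returned policy is a list of per-step decision rules, matching the observation that optimal policies in non-stationary settings are generally time-dependent.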

Practical Benchmarks

Simulation toolkits such as NS-Gym enable systematic benchmarking of algorithms on non-stationary environments, offering a modular framework for emulating parametric (e.g., periodic, abrupt, or drifting) evolution of underlying MDP parameters (Keplinger et al., 16 Jan 2025).
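NS-Gym's actual API is not reproduced here; the hypothetical `ParameterSchedule` class below only sketches the general pattern such toolkits expose: a single MDP parameter evolving on a drifting, periodic, or abrupt (change-point) schedule.

```python
import math

class ParameterSchedule:
    """Hypothetical schedule for one MDP parameter (not NS-Gym's real API)."""

    def __init__(self, base, mode="drift", rate=0.01, period=100.0,
                 jump_at=None, jump_to=None):
        self.base, self.mode = base, mode
        self.rate, self.period = rate, period
        self.jump_at, self.jump_to = jump_at, jump_to

    def value(self, t):
        if self.mode == "drift":     # smooth linear drift
            return self.base + self.rate * t
        if self.mode == "periodic":  # sinusoidal modulation with given period
            return self.base * (1.0 + 0.5 * math.sin(2 * math.pi * t / self.period))
        if self.mode == "abrupt":    # piecewise-stationary change point
            return self.jump_to if t >= self.jump_at else self.base
        raise ValueError(f"unknown mode: {self.mode}")

# e.g., a service rate modulated over a 24-step "day"
arrival_rate = ParameterSchedule(base=1.0, mode="periodic", period=24.0)
```

An environment wrapper would query `value(t)` each step and rebuild its transition kernel accordingly.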

6. Statistical, Dynamical, and Nonparametric Models

Bayesian nonparametric models construct non-stationary Markovian dynamics on real-valued data without pre-imposed functional forms or stationarity assumptions. For example, transition densities can be specified via Dirichlet process mixtures of bivariate normals, yielding time-homogeneous but marginally non-stationary Markov models suitable for capturing evolving or heteroscedastic time series (DeYoreo et al., 2016).

Similarly, copula-based constructions facilitate both the representation and verification of $\beta$-mixing (absolute regularity) under time-varying dependence parameters, with explicit bounds on mixing rates related to the maximal-correlation coefficients of the evolving copulas (Gobbi et al., 2017).

7. Applications and Empirical Insights

Non-stationary Markov process modeling is central to a range of empirical domains:

  • Epidemiological modeling: Time-varying compartment transition rates enable accurate simulation of disease waves, policy response, and resource allocation, outperforming stationary models which fail to capture non-equilibrium dynamics (Tiomela et al., 22 May 2025, Barraza et al., 5 Mar 2025).
  • Healthcare and system maintenance: Feedback-driven non-stationary Markov chains predict treatment or repair cycles, optimizing resource management in complex service systems (Awoniyi, 2023).
  • Queueing and service operations: Time-of-day or week-dependent rates require LLN/CLT development for performance analysis under realistic, fluctuating workloads (Fischer et al., 9 Jun 2025).
  • Communications and adaptive protocols: Switching MDPs capture network channels with Markovian mode-switching (e.g., due to fading), guiding robust protocol adaptation (Amiri et al., 24 Mar 2025).
  • Algorithmic robustness: Benchmark environments synthesized via NS-Gym, as well as theoretical regret bounds for non-stationary linear MDPs, illustrate the necessity of temporal adaptation and model update mechanisms (Keplinger et al., 16 Jan 2025, Touati et al., 2020).

Empirical evidence across these domains consistently demonstrates superior fidelity and policy efficacy when explicitly modeling or learning with non-stationary Markovian dynamics.


In summary, non-stationary Markov processes provide a canonical framework for representing, analyzing, and controlling complex systems in which time or exogenous factors drive structural shifts. Their mathematical characterization demands explicit temporal indexing or dynamic parameter evolution, and their effective deployment encompasses new algorithms, limit theorems, and empirical methodologies, all underpinned by a diverse and technically rigorous research literature (Tiomela et al., 22 May 2025, Amiri et al., 24 Mar 2025, Keplinger et al., 16 Jan 2025, Choi et al., 2017, Awoniyi, 2023, Barraza et al., 5 Mar 2025, Fischer et al., 9 Jun 2025, Ferreira et al., 2017, Zheng et al., 2018, Chen et al., 17 Nov 2025, Derman et al., 2021, DeYoreo et al., 2016, Touati et al., 2020, Gobbi et al., 2017).
