Throughput-Optimal Scheduling Algorithms

Updated 18 February 2026

Throughput-optimal scheduling algorithms stabilize network queues for every arrival rate strictly within the capacity region, with Max-Weight scheduling as the canonical policy and Lyapunov drift analysis as the standard proof technique.
Variants use techniques such as hierarchical modulation and rate learning to enlarge the achievable rate region or relax the dependence on queue state, with applications across wireless, switch, and AI workload systems.
Recent extensions incorporate dynamic channel feedback, index policies, and distributed scheduling to address interference, delay challenges, and evolving network demands.

Throughput-optimal scheduling algorithms are a foundational class of policies in queueing and network control, defined by the property that they stabilize all queues for every arrival rate vector lying in the interior of the system’s capacity (or rate) region. These algorithms have been instrumental across several domains, including wireless networks, switch architectures, multi-hop transportation systems, and contemporary AI inference workloads. Key methodological frameworks—such as Max-Weight scheduling, Lyapunov-drift analysis, index policies, and optimization-based techniques—have been generalized and refined to address topology, interference, information delay, physical-layer constraints, and emerging application-specific requirements.

1. Foundational Principles and Definitions

A scheduling policy is throughput-optimal if, for any arrival-rate vector strictly interior to the network's capacity region, the corresponding queueing network is stochastically stable (the time-averaged expected queue lengths are uniformly bounded). This concept was formalized in Tassiulas and Ephremides’ Max-Weight Scheduling paradigm, wherein at each slot the policy selects from the set of feasible schedules so as to maximize the weighted sum of current queue lengths times service rates (Promponas et al., 2024). Through Lyapunov drift analysis, such policies are shown to ensure negative expected drift outside bounded subsets, yielding positive recurrence of the Markov chain describing network queues.

In mathematical terms, for queues $Q_i(t)$ with arrivals $A_i(t)$ and control $\sigma(t)$, a throughput-optimal policy ensures

$\limsup_{T\rightarrow\infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[Q_i(t)] < \infty \quad \forall i, \ \text{whenever} \ \lambda \ \text{is strictly inside} \ \Lambda,$

where $\Lambda$ is the network capacity region (Karaca et al., 2012, Promponas et al., 2024).

2. Canonical Algorithms, Optimality, and Rate Region Expansion

Max-Weight Scheduling (MWS):

The classic Max-Weight algorithm computes in each slot

$\max_{s\in \mathcal{S}} \sum_i Q_i(t)s_i,$

with $\mathcal{S}$ the set of feasible schedules, ensuring stabilization for all interior rates of $\text{conv}(\mathcal{S})$ (Promponas et al., 2024). Numerous architectures use variants of the Max-Weight principle. For example, in wireless systems with fading, Max-Weight with channel-aware weights is throughput-optimal (Karaca et al., 2012).
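The selection rule above can be illustrated with a minimal Python sketch. It assumes a small, explicitly enumerated schedule set, Bernoulli arrivals, and unit service per scheduled queue; the function names and the two-link example are illustrative and do not come from the cited papers.

```python
import random

def max_weight(queues, schedules):
    """Return the feasible schedule s that maximizes sum_i Q_i * s_i."""
    return max(schedules, key=lambda s: sum(q * x for q, x in zip(queues, s)))

def simulate(arrival_probs, schedules, slots=10_000, seed=0):
    """Run Max-Weight under Bernoulli arrivals and return the final queue lengths."""
    rng = random.Random(seed)
    queues = [0] * len(arrival_probs)
    for _ in range(slots):
        s = max_weight(queues, schedules)
        # Serve first (a queue cannot go negative), then admit new arrivals.
        queues = [max(q - x, 0) for q, x in zip(queues, s)]
        queues = [q + (rng.random() < p) for q, p in zip(queues, arrival_probs)]
    return queues

# Two interfering links: at most one of them can be served in a slot.
SCHEDULES = [(1, 0), (0, 1)]
```

With arrival probabilities such as `(0.3, 0.3)`, the total load is below the one-packet-per-slot capacity, so `simulate` should return short queues. Explicit enumeration is only practical for very small schedule sets.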

Hierarchical Modulation Schedulers:

By embedding hierarchical modulation and simultaneously serving users with divergent channel conditions, scheduling policies can provably enlarge the achievable rate region. For instance, Max-Weight with Hierarchical Modulation (MWHM) maximizes a composite queue-weighted rate subject to power constraints across modulation layers, yielding a strictly larger achievable region $\Lambda_{hm} \supset \Lambda_{um}$ compared to single-user Max-Weight Uniform Modulation (MWUM) (Karaca et al., 2012). The Lyapunov-analysis-based proof demonstrates that, in every slot, the queue-weighted service of the HM-based policy is at least that of the UM-based policy.

Beyond Queue-Weight Dependence: Rate-Learning and SYL:

Recent advances such as the "Schedule as You Learn" (SYL) framework replace direct queue-based Max-Weight control with a dual-averaging-based learning of the optimal service-rate vector, and implement randomized scheduling to realize this average rate. The result is a throughput-optimal scheduler with decoupling from instantaneous backlog, which allows embedding further flexible criteria (e.g., priorities, fairness) without compromising stability (Promponas et al., 2024).
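SYL's learning step is not reproduced here. The sketch below covers only the second ingredient, randomized scheduling that realizes a target average service rate. It assumes the target rate vector has already been written as a probability distribution over feasible schedules; that decomposition and all names are illustrative assumptions.

```python
import random

def sample_schedule(schedules, probs, rng):
    """Draw one feasible schedule according to the time-sharing distribution."""
    return rng.choices(schedules, weights=probs, k=1)[0]

def empirical_rate(schedules, probs, slots=20_000, seed=0):
    """Average service per queue obtained by sampling a schedule in every slot."""
    rng = random.Random(seed)
    totals = [0] * len(schedules[0])
    for _ in range(slots):
        s = sample_schedule(schedules, probs, rng)
        totals = [t + x for t, x in zip(totals, s)]
    return [t / slots for t in totals]
```

Because the draw ignores the current backlog, the long-run service rate depends only on the distribution `probs`. This is the sense in which such a scheduler is decoupled from instantaneous queue lengths.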

Algorithm Class	State Used	Throughput-Optimal	Reference
Max-Weight	Queue-lengths	Yes	(Promponas et al., 2024)
MWHM	Queues, CSI	Yes	(Karaca et al., 2012)
SYL (Rate-Learning)	Arrivals only	Yes	(Promponas et al., 2024)

3. Extensions for Channel, Feedback, and Interference Models

Dynamic Channel Feedback and Overhead:

Practical wireless systems incur channel state acquisition overhead. Throughput-optimal algorithms like SDF account for the time cost of channel probing directly, dynamically restricting probing to those users whose rates may affect the optimal schedule. This strictly expands the stability region beyond the "probe all" regime, with an explicit $\epsilon$-gain derived analytically (Karaca et al., 2012).

Correlated Channels and Index Policies:

When channel states are Markovian and only observed via ARQ feedback, optimality can be achieved by combining Whittle’s index for channel exploration with queue-length weights. The policy selects users maximizing $q_i W_i(\pi_i)$, where $W_i(\pi)$ is the Whittle index for belief state $\pi$, yielding low complexity with robust optimality under transmission constraints (Ouyang et al., 2012).
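A minimal sketch of this selection rule and the accompanying belief update follows. It assumes a two-state (ON/OFF) Markov channel per user with transition probabilities `p11` = P(ON | ON) and `p01` = P(ON | OFF), and an ACK that reveals an ON channel. The closed-form Whittle index is not reproduced: `index_fn` is a placeholder the caller must supply, and all names are hypothetical.

```python
def propagate_belief(belief, p11, p01):
    """One-step belief update for a channel that was not observed this slot."""
    return belief * p11 + (1.0 - belief) * p01

def update_beliefs(beliefs, scheduled, ack, p11, p01):
    """Reset the scheduled user's belief from ARQ feedback; propagate the rest."""
    updated = []
    for i, b in enumerate(beliefs):
        if i == scheduled:
            # ACK means the channel was ON, NACK means it was OFF (assumption).
            updated.append(p11 if ack else p01)
        else:
            updated.append(propagate_belief(b, p11, p01))
    return updated

def select_user(queues, beliefs, index_fn):
    """Pick the user maximizing q_i * W_i(pi_i), with W supplied as index_fn."""
    return max(range(len(queues)), key=lambda i: queues[i] * index_fn(beliefs[i]))
```

For experimentation, `index_fn=lambda b: b` gives a myopic stand-in that weights each queue by the belief itself. It is not the Whittle index and carries none of the cited guarantees.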

Distributed and Low-Information Schedulers:

Distributed, throughput-optimal algorithms have been developed under strong physical interference models (e.g., SINR), typically forgoing network topology knowledge. For instance, the "Reflect" algorithm achieves constant-factor efficiency solely based on local rate estimates, with the stability margin independent of network size (Asgeirsson et al., 2012). In ultra-constrained settings (e.g., IoT), single-bit-feedback policies can achieve throughput-optimality for certain conflict graphs, with Lyapunov drift arguments customized to queue-nonemptiness–only information (Mohan et al., 2020).

4. Architectural and Application-Specific Advances

Crossbar and Switch Scheduling:

Node-weighted scheduling algorithms such as Maximum Vertex-weighted Matching (MVM) and Lazy Heaviest Port First (LHPF) achieve throughput-optimality and, in certain cases, clearance-time optimality—minimizing the time to drain all packets after arrivals end (0902.1169). Node-based Service-Balanced (NSB) policies further balance throughput and evacuation time, with explicit approximation factors for general topologies and full optimality in bipartite graphs (Sang et al., 2015).
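To make the clearance-time notion concrete, the sketch below computes the port-load lower bound for an input-queued crossbar. It assumes each input and each output port transfers at most one packet per slot, so the most heavily loaded port cannot drain in fewer slots than its backlog. The matrix layout and function name are illustrative assumptions.

```python
def clearance_lower_bound(backlog):
    """Lower bound on the slots needed to drain a non-empty backlog matrix.

    backlog[i][j] is the number of packets queued from input i to output j.
    """
    input_loads = [sum(row) for row in backlog]
    output_loads = [sum(col) for col in zip(*backlog)]
    return max(input_loads + output_loads)
```

The per-port sums are the node weights that node-weighted policies act on. A schedule that leaves the heaviest port unserved in a slot cannot meet this bound.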

LLM and AI Inference Systems:

Recent scheduling theory has been applied directly to LLM inference. In single-server settings with batched service (tokens per batch), any work-conserving batching policy achieves maximal throughput, with the throughput region determined by the token budget and batch processing rate (Li et al., 10 Apr 2025). In multi-phase systems, optimality requires allocation between prefill and decode workloads and dynamic tiling for GPUs; Resource-Aware Dynamic (RAD) schedulers implementing these achieve system capacity under mild conditions (Bari et al., 1 Aug 2025).
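The single-server case can be illustrated with a simplified sketch of work-conserving batching. It assumes workload is measured in tokens, each batch processes at most `token_budget` tokens (a positive integer), and a request may be split across consecutive batches. These modeling choices and all names are illustrative assumptions, not the cited system model.

```python
def form_batch(remaining, token_budget):
    """Fill a batch in FIFO order with up to token_budget tokens.

    remaining[i] is the number of unprocessed tokens of request i.
    Returns a list of (request_index, tokens_taken) pairs.
    """
    batch, left = [], token_budget
    for i, tokens in enumerate(remaining):
        if left == 0:
            break
        take = min(tokens, left)
        if take > 0:
            batch.append((i, take))
            left -= take
    return batch

def batches_to_drain(remaining, token_budget):
    """Count the batches needed to finish all requests when no new ones arrive."""
    remaining = list(remaining)
    count = 0
    while any(remaining):
        for i, take in form_batch(remaining, token_budget):
            remaining[i] -= take
        count += 1
    return count
```

The batch always uses the full budget whenever enough tokens are waiting, which is what makes the policy work-conserving in this sketch. Under these assumptions the drain time is the total token count divided by the budget, rounded up.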

5. Variants: Regularity, Delay, and Multi-Objective Policies

Standard throughput-optimal policies can exhibit poor delay or high variance in inter-service intervals. Regular Service Guarantee (RSG) algorithms incorporate time-since-last-service (TSLS) into the weight structure, balancing queue stability with regular service periods. The RSG algorithm can approach the fundamental lower bound of inter-service time regularity within a constant factor, while remaining throughput-optimal for all arrival vectors strictly inside the capacity region (Li et al., 2014).
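One way to picture a TSLS-augmented weight is the sketch below, which adds a scaled time-since-last-service counter to each queue length before the Max-Weight selection. The additive form `Q_i + alpha * T_i`, the parameter name `alpha`, and the reset-on-service counter are simplifying assumptions. The weight in the cited work may differ, for example by including channel rates.

```python
def rsg_schedule(queues, tsls, schedules, alpha):
    """Pick the schedule maximizing sum_i (Q_i + alpha * T_i) * s_i."""
    def weight(s):
        return sum((q + alpha * t) * x for q, t, x in zip(queues, tsls, s))
    return max(schedules, key=weight)

def update_tsls(tsls, schedule):
    """Reset the counter of every served queue and increment the others."""
    return [0 if x else t + 1 for t, x in zip(tsls, schedule)]
```

With `alpha = 0` the rule reduces to plain Max-Weight. Larger values of `alpha` favor queues that have waited longest, trading backlog for more regular service.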

Policies with heterogeneously delayed state information use the freshest common information to minimize per-packet delay without sacrificing throughput-optimality; efficient approximate algorithms (e.g., LC-ELDR, LC-ERDMC) for these settings achieve near-optimal stability with orders-of-magnitude lower computation (Narasimha et al., 2015).

6. Analytical and Algorithmic Foundations

The dominant tool for proving throughput-optimality is Lyapunov drift analysis, often in quadratic or other convex forms, tailored to the algorithm’s state and action space (Karaca et al., 2012, Promponas et al., 2024, Karaca et al., 2012, Li et al., 2014). Variants include fluid-model techniques for multi-hop systems (Sang et al., 2015), stochastic approximation for distributed scheduling (Jiang et al., 2010), and dual-averaging for learning-based policies (Promponas et al., 2024).
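For the quadratic case, the argument can be sketched in a few lines. This is a standard outline rather than the proof from any single cited paper. It assumes the dynamics $Q_i(t+1) = \max(Q_i(t) - s_i(t), 0) + A_i(t)$, arrivals that are i.i.d. across slots with mean $\lambda_i$ and bounded second moment, a finite schedule set $\mathcal{S}$ with $\Lambda = \text{conv}(\mathcal{S})$, and a schedule $s(t) \in \mathcal{S}$ chosen by Max-Weight. With $L(Q) = \frac{1}{2}\sum_i Q_i^2$, squaring the dynamics gives

$Q_i(t+1)^2 \le Q_i(t)^2 + A_i(t)^2 + s_i(t)^2 + 2\,Q_i(t)\,(A_i(t) - s_i(t)),$

so that, for a constant $B$ bounding $\frac{1}{2}\sum_i \mathbb{E}[A_i(t)^2 + s_i(t)^2]$,

$\mathbb{E}[L(Q(t+1)) - L(Q(t)) \mid Q(t)] \le B + \sum_i Q_i(t)\,\lambda_i - \sum_i Q_i(t)\,s_i(t).$

If $\lambda$ is strictly inside $\Lambda$, there exist $\epsilon > 0$ and $\mu \in \text{conv}(\mathcal{S})$ with $\lambda_i + \epsilon \le \mu_i$ for all $i$. Max-Weight maximizes a linear function over $\mathcal{S}$, and hence over $\text{conv}(\mathcal{S})$, so $\sum_i Q_i(t)\,s_i(t) \ge \sum_i Q_i(t)\,\mu_i$ and therefore

$\mathbb{E}[L(Q(t+1)) - L(Q(t)) \mid Q(t)] \le B - \epsilon \sum_i Q_i(t).$

Summing over $t = 0, \dots, T-1$, taking expectations, and using $L \ge 0$ yields

$\frac{1}{T} \sum_{t=0}^{T-1} \sum_i \mathbb{E}[Q_i(t)] \le \frac{B}{\epsilon} + \frac{\mathbb{E}[L(Q(0))]}{\epsilon T},$

which implies the bounded time-averaged queue lengths required by the definition in Section 1.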

Solving Max-Weight exactly is combinatorially intractable in general networks. Frameworks built on interactive optimization oracles (random search, Glauber dynamics, belief propagation (BP), primal-dual methods) have shown that “one iteration per slot” is sufficient to guarantee throughput-optimality, provided the weights vary suitably slowly. This provides a unified lens for known algorithms and guides new designs (Shin et al., 2014).
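The simplest random-search oracle of this kind is a pick-and-compare step: sample one candidate schedule per slot and keep it only if it outweighs the schedule already in use. The sketch below uses raw queue lengths as weights for brevity, which is an assumption. The slowly varying weight functions required by the cited framework are not modeled, and the names are illustrative.

```python
import random

def weight(queues, schedule):
    """Queue-weighted service of a schedule: sum_i Q_i * s_i."""
    return sum(q * x for q, x in zip(queues, schedule))

def pick_and_compare(queues, previous, schedules, rng):
    """One oracle iteration: keep the better of the previous and a random schedule."""
    candidate = rng.choice(schedules)
    if weight(queues, candidate) > weight(queues, previous):
        return candidate
    return previous
```

Each call costs two weight evaluations instead of a search over all of `schedules`. The schedule in use never gets worse with respect to the current weights, and it reaches the maximizer once that schedule is sampled.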

7. Performance, Limitations, and Emerging Directions

Throughput-optimality does not guarantee optimal delay, fairness, or regularity. Several advanced policies, such as RSG, SYL, and index-based approaches, explicitly address these objectives while maintaining provable stability. Structured policies for AI inference combine queueing optimality with real-world Service Level Objective (SLO) constraints, such as time to first token (TTFT) and time between tokens (TBT) (Bari et al., 1 Aug 2025). Distributed and low-information algorithms make these guarantees practical in large, resource-constrained, or dynamically evolving networks (Asgeirsson et al., 2012, Mohan et al., 2020).

Practical advances in networked and heterogeneous systems continue to motivate theoretical work on throughput-optimal algorithms that remain robust under adversarial traffic and time-varying topology and that meet more stringent delay or regularity requirements. This work builds on the foundational principles outlined here.