Regression-Based Engine Scheduling Heuristic

Updated 14 December 2025
  • The paper introduces a hybrid regression-based scheduling approach that leverages deep neural network regressors as fast surrogates to approximate total tardiness.
  • It integrates Lawler’s decomposition with a single-layer LSTM regressor, enabling a recursive, single-pass scheduling scheme that sidesteps the exponential search space of this NP-hard problem.
  • Empirical results demonstrate near-optimal performance (optimality gap ~0.5% for up to 325 jobs) with practical O(n^3) runtime compared to traditional heuristics.

A regression-based engine-scheduling heuristic is a hybrid algorithmic framework designed to solve the classical NP-hard single-machine total-tardiness problem by leveraging deep neural network regressors as fast, polynomial-time surrogates for exact combinatorial evaluation. The approach integrates domain knowledge from classical decomposition (specifically, Lawler’s theorem) with a learned estimator to guide recursive, single-pass scheduling. This paradigm achieves near-optimal scheduling performance on challenging instances (up to approximately 350 jobs), outperforming traditional heuristics while maintaining practical computational efficiency (Bouška et al., 2020).

1. Problem Formulation and Decomposition

The single-machine total-tardiness problem consists of scheduling $n$ independent, non-preemptive jobs $J = \{1, \ldots, n\}$, each characterized by an integer processing time $p_j > 0$ and a due date $d_j \geq 0$. A schedule $\pi$ is a permutation of $J$, with the completion time of job $j$ in $\pi$ given by

$$C_j(\pi) = \sum_{i=1}^{k(j)} p_{\pi(i)},$$

where $k(j)$ is the position of $j$ in $\pi$. The tardiness $T_j(\pi)$ is

$$T_j(\pi) = \max\{0,\, C_j(\pi) - d_j\}$$

and the objective is to minimize total tardiness:

$$T(\pi) = \sum_{j=1}^{n} T_j(\pi).$$
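The objective can be evaluated directly from a permutation by accumulating completion times in schedule order. The following minimal Python sketch (the function name and job data are illustrative, not from the paper) makes this concrete:

```python
def total_tardiness(p, d, order):
    """Total tardiness T(pi) of the schedule `order` (job indices into p, d)."""
    completion, total = 0, 0
    for j in order:
        completion += p[j]                   # C_j(pi): running sum of processing times
        total += max(0, completion - d[j])   # T_j(pi) = max(0, C_j(pi) - d_j)
    return total
```

For example, with `p = [3, 1, 2]`, `d = [2, 4, 3]` and `order = [0, 1, 2]`, the completion times are 3, 4, 6, giving total tardiness $1 + 0 + 3 = 4$.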

Lawler’s decomposition theorem structurally constrains optimal schedules. Specifically, the job $j^p$ with maximal $p_j$ in $J$ can only be placed at positions $k \geq \mathrm{pos}^0$ in EDD (Earliest Due Date) order, where $\mathrm{pos}^0$ is the rank of $j^p$ in EDD order. Symmetrically, in SPT (Shortest Processing Time) order, only certain positions are considered for the job $j^d$ with the smallest $d_j$. Recursively, for each valid $k$, the problem splits into two subproblems (left and right of position $k$), and the overall optimal value is computed as

$$Z(J) = \min_{k \in K_\text{EDD}} \Bigl[\, Z(J_\text{left}) + \max\Bigl(0,\ \sum_{j \in J_\text{left}} p_j + p_{j^p} - d_{j^p}\Bigr) + Z(J_\text{right}) \Bigr].$$

This recursion naturally induces an exponential tree if all branches are explored optimally.
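As a concrete illustration, the recursion above can be written as a short exact (and therefore exponential-time) Python sketch. The offset argument `t` carries the start time of the current subsequence, and ties for the longest job are broken arbitrarily; the names and structure are our own, not the paper’s implementation:

```python
def lawler(jobs, p, d, t=0):
    """Exact Lawler recursion: minimal total tardiness of `jobs` starting at time t.

    jobs: job indices sorted in EDD order; p, d: processing times and due dates.
    Exponential in the worst case -- for illustration only.
    """
    if not jobs:
        return 0
    jp = max(jobs, key=lambda j: p[j])   # longest job j^p (ties broken arbitrarily)
    pos0 = jobs.index(jp)                # its EDD rank pos^0
    best = float("inf")
    for k in range(pos0, len(jobs)):     # eligible final positions k >= pos^0
        left = [j for j in jobs[:k + 1] if j != jp]
        right = jobs[k + 1:]
        c_jp = t + sum(p[j] for j in left) + p[jp]   # completion time of j^p
        best = min(best,
                   lawler(left, p, d, t)             # Z(J_left)
                   + max(0, c_jp - d[jp])            # tardiness of j^p
                   + lawler(right, p, d, c_jp))      # Z(J_right), shifted start
    return best
```

On the toy instance `p = [3, 1, 2]`, `d = [2, 4, 3]` (EDD order `[0, 2, 1]`), this matches the brute-force optimum of 4.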

2. Regression Model Architecture and Training

To circumvent the exponential cost of recursive evaluation, a deep neural network regressor $\widehat{Y}(\cdot)$ is introduced as a surrogate for the exact evaluation $Z(\cdot)$.

  • Input Features and Data Preprocessing: Jobs are first sorted in EDD order. The features for each job $j$ are the normalized processing time $p_j/S$, the normalized due date $d_j/S$ (with $S = \sum_{j \in J} p_j$), and a positional feature $\alpha_j$ (the EDD rank of $j$ divided by $|J|$). Each input is thus the sequence $[p_j/S,\, d_j/S,\, \alpha_j]_{j \in J}$.
  • Network Design: A single-layer LSTM with 512 hidden units processes the sequence of job vectors. The final hidden state passes through a fully connected layer with linear activation to yield $y_\text{norm} \in \mathbb{R}$. De-normalization multiplies $y_\text{norm}$ by $S$ to produce the estimated total tardiness $\widehat{Y}(J)$.
  • Training: Instances are generated with the Potts–Van Wassenhove scheme; the parameters are the job count $n$, the due-date range $R$, the tardiness factor $T$, and $p_\text{max}$. Training labels are exact solutions $Z(J)$ computed by a state-of-the-art DP/branch-and-reduce solver (TTBR) and normalized by $S$. Optimization minimizes mean squared error with Adam (learning rate $10^{-4}$) and early stopping (patience 5).
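Assuming plain Python lists for the job data, the preprocessing step can be sketched as follows; `job_features` is an illustrative name, and the LSTM itself is omitted:

```python
def job_features(p, d):
    """Build the normalized per-job feature sequence [p_j/S, d_j/S, alpha_j] in EDD order."""
    S = sum(p)                                    # total processing time, the normalizer
    n = len(p)
    edd = sorted(range(n), key=lambda j: d[j])    # EDD order (ascending due dates)
    return [[p[j] / S, d[j] / S, (rank + 1) / n]  # alpha_j = EDD rank / |J|
            for rank, j in enumerate(edd)]
```

Each row of the result is one timestep of the LSTM input sequence; the network’s scalar output would then be multiplied back by $S$ to undo the normalization.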

3. Integration with Single-Pass Scheduling

The learned regressor is embedded within a single-pass greedy recursive scheduler based on Lawler’s decomposition. At each recursion:

  • The EDD- and SPT-eligible position sets ($K_\text{EDD}$, $K_\text{SPT}$) are constructed and filtered by dominance rules.
  • The smaller of the two sets is selected for branching.
  • For each position $k$ in the chosen set, the schedule is split into the jobs to the left ($J_L$), the selected job ($j_\text{sel}$), and the jobs to the right ($J_R$).
  • The cost is estimated as

$$\text{cost} = \widehat{Y}(J_L) + \text{penalty} + \widehat{Y}(J_R),$$

where $\text{penalty} = \max(0,\, C_L + p_{j_\text{sel}} - d_{j_\text{sel}})$ and $C_L$ is the completion time of the last job in $J_L$.

For subproblems with $|J| \leq \kappa$ (with $\kappa = 5$), the exact solver is called instead. Otherwise, the procedure selects the $k$ minimizing the estimated cost and recursively schedules the left and right subproblems. Neural-network predictions are cached to avoid redundant inference.
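A minimal sketch of this single-pass recursion, with the surrogate passed in as a plain callable `estimate(job_subset, start_time)` (our own interface, not the paper’s), might look as follows; for brevity the exact-solver fallback for $|J| \leq \kappa$ is reduced to a trivial base case:

```python
def greedy_pass(jobs, p, d, estimate, t=0):
    """Single-pass greedy scheduler: commit to the split k with minimal surrogate cost.

    jobs: EDD-sorted job indices. A real implementation would call the exact
    solver once len(jobs) <= kappa (kappa = 5) instead of this base case.
    """
    if len(jobs) <= 1:
        return list(jobs)
    jp = max(jobs, key=lambda j: p[j])        # longest job (EDD-side branching only)
    pos0 = jobs.index(jp)
    best_k, best_cost = pos0, float("inf")
    for k in range(pos0, len(jobs)):          # eligible placements of jp
        left = [j for j in jobs[:k + 1] if j != jp]
        c_jp = t + sum(p[j] for j in left) + p[jp]
        cost = (estimate(left, t)
                + max(0, c_jp - d[jp])        # penalty term for the selected job
                + estimate(jobs[k + 1:], c_jp))
        if cost < best_cost:
            best_cost, best_k = cost, k
    left = [j for j in jobs[:best_k + 1] if j != jp]
    c_jp = t + sum(p[j] for j in left) + p[jp]
    return (greedy_pass(left, p, d, estimate, t)      # recurse once per side
            + [jp]
            + greedy_pass(jobs[best_k + 1:], p, d, estimate, c_jp))
```

Unlike the exact recursion, each level commits to a single $k$, which is what brings the cost down from exponential to polynomial.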

4. Computational Complexity

The time complexity of the heuristic (labeled “dhs” in the source) arises from the following structural properties:

  • Each call to $\widehat{Y}(J')$ executes a single LSTM pass over the $|J'|$ jobs, which takes $\mathcal{O}(n)$ time.
  • Each recursion level evaluates all $|K| = \mathcal{O}(n)$ legal placements, each requiring two regressor calls, so one level costs $\mathcal{O}(n^2)$.
  • Each recursive call fixes at least one job, so the recursion depth is $\mathcal{O}(n)$.

Thus, the overall worst-case runtime is $\mathcal{O}(n^3)$. Empirically, with a practical implementation on modern hardware, the method runs in seconds even for $n \approx 350$.

5. Empirical Results and Benchmarking

Extensive experiments validate the effectiveness and scalability of the regression-based heuristic. The evaluation spans varying $n$ (up to 500), $p_\text{max} \in \{100, 5000\}$, and $R, T \in \{0.2, 0.4, 0.6, 0.8, 1.0\}$, with 200 random instances per setting. Competing methods include:

  • NBR: a classical greedy/local-exchange heuristic (Holsenback & Russell, 1992).
  • DHS$_\text{nbr}$: the same decomposition scheme, with NBR substituted for the neural regressor.
  • TTBR$^{10}$: the exact DP/branch-and-reduce solver with a 10 s time limit.

Key metrics are the optimality gap (percentage over optimal) and CPU runtime.

| $n$ | NBR gap [%] | DHS$_\text{nbr}$ gap [%] | DHS$_\text{NN}$ gap [%] | NBR time [s] | DHS$_\text{nbr}$ time [s] | DHS$_\text{NN}$ time [s] |
|-----|-------------|--------------------------|-------------------------|--------------|---------------------------|--------------------------|
| 225 | $1.98\pm0.58$ | $1.17\pm0.47$ | $0.58\pm0.30$ | $0.06\pm0.01$ | $1.19\pm0.42$ | $5.03\pm8.16$ |
| 275 | $2.12\pm0.54$ | $1.31\pm0.44$ | $0.57\pm0.28$ | $0.09\pm0.02$ | $1.91\pm0.62$ | $6.89\pm9.62$ |
| 325 | $2.20\pm0.50$ | $1.39\pm0.43$ | $0.57\pm0.37$ | $0.12\pm0.02$ | $2.87\pm0.90$ | $9.25\pm11.29$ |
| 375 | $2.27\pm0.49$ | $1.46\pm0.44$ | $1.23\pm0.63$ | $0.17\pm0.03$ | $4.15\pm1.31$ | $14.61\pm13.52$ |
| 425 | $2.34\pm0.46$ | $1.55\pm0.41$ | $1.71\pm0.65$ | $0.21\pm0.04$ | $5.52\pm1.71$ | $20.60\pm17.00$ |

For $n \leq 325$, the regression-based heuristic (DHS$_\text{NN}$) achieves an optimality gap of approximately $0.5\%$, about four times better than NBR. Runtime grows as $\mathcal{O}(n^3)$ but remains practical (roughly 20 s or less for $n \leq 425$). When $p_\text{max}$ is increased to 5000, the heuristic’s gap remains stable, while TTBR$^{10}$ and NBR degrade.

6. Implementation Considerations

Efficient implementation requires:

  • Use of modern DL frameworks (TensorFlow, PyTorch) for the LSTM regressor.
  • On-the-fly computation of $S$ and the EDD-sorted job lists within the recursion.
  • Aggressive caching of regressor predictions $\widehat{Y}(\cdot)$ for subproblem reuse.
  • Tuning of the exact-solver threshold $\kappa$ ($\kappa = 5$ offers a robust tradeoff).
  • Parallelization is feasible, especially for GPU-based inference.
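The caching point above can be sketched as a thin memoization wrapper keyed on the job subset and its start time (names are illustrative, not from the paper):

```python
def memoized(estimate):
    """Wrap a surrogate so repeated (job-subset, start-time) queries reuse one inference."""
    cache = {}
    def wrapped(jobs, t):
        key = (tuple(jobs), t)           # identical subproblems recur across branches
        if key not in cache:
            cache[key] = estimate(jobs, t)
        return cache[key]
    wrapped.cache = cache                # exposed for inspection/clearing
    return wrapped
```

In the greedy recursion, adjacent values of $k$ share many left/right subsets, so this simple keying already eliminates a large fraction of regressor calls.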

7. Significance and Outlook

This regression-based engine-scheduling heuristic demonstrates that neural regressors, when judiciously embedded within classical decomposition frameworks, yield scalable, high-quality solutions for combinatorial scheduling problems. The approach occupies a hybrid space between exact DP/branch-and-reduce methods and purely handcrafted heuristics, offering polynomial runtime, strong empirical optimality gaps on moderate-size instances, and generalization to instance regimes beyond those seen in training (Bouška et al., 2020). A plausible implication is that similar regression-guided heuristics may extend to other sequencing and vehicle-routing variants in which classical decompositions and learned cost proxies can be integrated systematically.
