
Online Traveling Repairperson Problem

Updated 22 January 2026
  • Online TRP is a dynamic routing and scheduling problem extending the classical TSP to handle requests arriving over time, aiming to minimize average completion times.
  • Deterministic and randomized algorithms based on the MIMIC framework use prize-collecting schedules to achieve competitive ratios as low as 2.821 (randomized) on general metric spaces.
  • Learning-augmented variants integrate predictive insights to improve consistency, smoothness, and robustness, making the approach applicable to real-world network restoration.

The Online Traveling Repairperson Problem (OTRP), or Online TRP, is a central scheduling and routing problem that generalizes the classical Traveling Salesperson Problem to dynamic settings where requests arrive over time. The central objective is to control a server that moves in a metric space serving requests as they are revealed, always aiming to minimize the sum (or average) of completion times. The OTRP is the canonical "average response time" analog to the classical "makespan" objective of online vehicle routing and has spawned a substantial literature covering deterministic online bounds, learning-augmented variants, and real-world network restoration under uncertainty.

1. Formal Problem Definition and Models

Let $(M,d)$ be a metric space with a designated origin $\mathsf{O} \in M$. An initially unknown sequence $Q = \{q_i\}_{i=1}^n$ of requests is released over time. Each request $q_i = (x_i, t_i)$ consists of a location $x_i \in M$ and a release time $t_i \ge 0$ (unknown until time $t_i$). A server starts at $\mathsf{O}$ at time $0$, moves at unit speed, and may also wait in place.

The cost is the weighted sum of completion times: for each $q_i$, let $C_i$ be the first time at which the server visits $x_i$ after $t_i$. Then,

$$\text{Total Cost} = \sum_{i=1}^n w_i C_i$$

with weights $w_i \geq 0$ (usually $w_i = 1$). The goal is to minimize this sum.
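The objective can be made concrete with a short simulation. The sketch below is an illustrative helper (not code from any cited paper) that computes the total completion time of a unit-speed server on the line visiting requests in a fixed order, waiting in place whenever it arrives before a release:

```python
def total_completion_time(requests, order):
    """requests: list of (location, release_time) pairs on the real line.
    order: indices giving the visiting sequence.
    The server starts at the origin 0 at time 0, moves at unit speed,
    and a request is completed at the first visit after its release."""
    pos, clock = 0.0, 0.0
    total = 0.0
    for i in order:
        x, t = requests[i]
        clock += abs(x - pos)   # travel time to the request's location
        clock = max(clock, t)   # wait in place if not yet released
        pos = x
        total += clock          # C_i, with unit weights w_i = 1
    return total
```

For requests `[(2, 0), (3, 5)]` served in order `[0, 1]`, the server completes the first request at time 2, reaches $x=3$ at time 3, waits until the release at $t=5$, and pays total cost $2 + 5 = 7$.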

Variants:

  • Closed TRP: The server must return to the origin after serving all requests.
  • Open TRP: No return to the origin is required.

The standard performance metric is the competitive ratio: for an online algorithm ALG,

$$\mathrm{CR}(\text{ALG}) = \sup_I \frac{\text{ALG}(I)}{\text{OPT}(I)}$$

where $\text{ALG}(I)$ is the algorithm's cost on instance $I$, and $\text{OPT}(I)$ is the cost of the offline optimum, which knows all requests in advance.
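For intuition on what $\text{OPT}(I)$ computes, the sketch below finds the exact offline optimum in the special case where every request is released at time $0$ with unit weights (the min-latency objective). This is a standard Held-Karp-style bitmask DP, not code from the cited papers; the key observation is that traveling a distance $d$ delays every still-unserved request by $d$:

```python
from functools import lru_cache

def offline_opt(dist):
    """dist: symmetric distance matrix; node 0 is the origin O.
    Returns the minimum sum of completion times over nodes 1..n-1,
    assuming every request is released at time 0."""
    n = len(dist)

    @lru_cache(maxsize=None)
    def f(pos, remaining):
        if not remaining:
            return 0.0
        k = bin(remaining).count("1")   # requests still unserved
        best = float("inf")
        for j in range(n):
            if remaining >> j & 1:
                # moving dist[pos][j] delays all k open requests by it
                best = min(best, k * dist[pos][j]
                           + f(j, remaining & ~(1 << j)))
        return best

    return f(0, ((1 << n) - 1) & ~1)    # serve every node except 0
```

For three collinear points at $0, 1, 2$ the optimum serves them left to right, giving completion times $1$ and $2$ and a total of $3$.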

Specialized versions have been studied on the line, in arbitrary (finite or infinite) metric spaces, and on structured topologies such as trees, rings, and flowers. Learning-augmented settings incorporate predictions about locations (but not release times) via machine learning (Bampis et al., 2023, Guragain et al., 20 Jan 2026).

2. Deterministic Online Algorithms and Lower Bounds

In the purely online setting (no predictions), the competitive ratio for deterministic algorithms is tightly bounded depending on the underlying metric:

  • On arbitrary metrics: the state-of-the-art deterministic upper bound is $4$, established by the MIMIC algorithm, which iteratively executes prize-collecting auxiliary schedules and resets, as analyzed via a phase-based routine and a factor-revealing LP (Bienkowski et al., 2021).
  • On the line: a lower bound of $1+\sqrt{2} \approx 2.414$ holds [Feuerstein-Stougie 2001], and the best deterministic upper bound is again $4$ (Bienkowski et al., 2021). Recent work improves the lower bound to $3$, even in learning-augmented models: in worst-case adversarial scenarios, any deterministic algorithm must incur at least a factor of $3$ relative to the optimum (Guragain et al., 20 Jan 2026).
  • On some graph topologies (expanded trees, rings): algorithms benefit from exploiting topology, reducing the number of candidate routes that must be considered (Bampis et al., 2023).

Randomization can further improve the bounds: phase-randomized extensions of MIMIC achieve a competitive ratio of $1 + 2/\ln 3 < 2.821$ for arbitrary metrics (Bienkowski et al., 2021).
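The randomized bound is a closed-form constant; evaluating it confirms the rounding used throughout the literature:

```python
import math

# Competitive ratio of phase-randomized MIMIC on arbitrary metrics
randomized_cr = 1 + 2 / math.log(3)
print(round(randomized_cr, 4))  # ~2.8205, strictly below 2.821
```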

3. Learning-Augmented Online TRP

The learning-augmented framework equips the algorithm with request-location predictions $P = \{p_i\}_{i=1}^n$, provided by an oracle or ML predictor, while the true release times $t_i$ remain adversarial and unknown. The key challenge is to design algorithms whose performance degrades gracefully with prediction error, measured either as a normalized sum of location errors ($\eta$) or as a maximum absolute position error ($\delta$).

Error-aware measures:

  • Consistency ($\alpha$-consistent): If predictions are perfect ($\eta = 0$), the algorithm achieves competitive ratio $\alpha$.
  • Smoothness ($\gamma$-smooth): As error increases, the competitive ratio grows smoothly as a function $\gamma(\eta)$, typically linear.
  • Robustness ($\beta$-robust): For arbitrary, possibly adversarially bad predictions, performance remains within $\beta$ times the offline optimum.

Results for the line (Guragain et al., 20 Jan 2026):

  • Perfect prediction yields a deterministic competitive ratio of $2+\sqrt{3} \approx 3.732$.
  • Under prediction error $\delta$, the competitive ratio is at most $\min\{3.732 + 4\delta,\, 4\}$.
  • These are the first such learning-augmented online TRP bounds for the line.
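The line bound above has a direct closed form. The helper below (illustrative, with $\delta$ the maximum absolute position error) shows how the guarantee interpolates between the consistency bound $2+\sqrt{3}$ and the robustness cap of $4$:

```python
import math

def la_line_bound(delta):
    """Claimed competitive-ratio upper bound for the learning-augmented
    line algorithm: min{2 + sqrt(3) + 4*delta, 4}."""
    return min(2 + math.sqrt(3) + 4 * delta, 4.0)
```

With perfect predictions the bound is $2+\sqrt{3} \approx 3.732$; once $\delta \ge (2-\sqrt{3})/4 \approx 0.067$, the bound saturates at the purely online guarantee of $4$.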

Results for general metrics (Bampis et al., 2023):

  • With perfect prediction, deterministic algorithms achieve $\alpha = 3/2$-consistency.
  • Linear smoothness: ratio at most $3/2 + 5\eta$.
  • Robustness: $2.75$ (closed case) and $2.833\ldots$ (open case) for general metrics; $2.5$ and $2.667\ldots$ for trees/Euclidean spaces. All bounds are tight.
  • Efficient algorithms require only single-exponential time in $n$ in general, and FPT or polynomial time on rings, trees, or flowers by exploiting combinatorial structure.

4. Algorithmic Frameworks and Methodologies

Prize-Collecting Schedules and MIMIC

The MIMIC algorithm (Bienkowski et al., 2021) is a phase-based routine that partitions time into exponentially growing intervals. In each phase, it computes an auxiliary prize-collecting schedule that balances early completions against penalties for deferred requests. The analysis leverages factor-revealing linear programs to tightly bound the competitive ratio.
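The phase structure can be illustrated with a deliberately simplified sketch on the line. The code below doubles a phase budget and greedily serves released requests nearest-first within it; this is a toy stand-in for MIMIC's prize-collecting subroutine (the actual algorithm computes a prize-collecting schedule and is analyzed via a factor-revealing LP), so its behavior and constants are illustrative only:

```python
def toy_phase_algorithm(requests):
    """requests: list of (location, release_time) on the real line.
    At each replanning instant T = 1, 2, 4, ..., serve pending released
    requests nearest-first as long as the clock stays within 2*T;
    everything else is deferred to the next (doubled) phase."""
    pending = set(range(len(requests)))
    completion = {}
    pos, T = 0.0, 1.0
    while pending:
        clock = T
        while True:
            avail = [i for i in pending if requests[i][1] <= clock]
            if not avail:
                break
            i = min(avail, key=lambda j: abs(requests[j][0] - pos))
            d = abs(requests[i][0] - pos)
            if clock + d > 2 * T:
                break               # does not fit: defer to next phase
            clock += d
            pos = requests[i][0]
            completion[i] = clock   # completion time in this toy run
            pending.remove(i)
        T *= 2
    return completion
```

A request at distance $2$ released at time $0$ does not fit in the first phase (budget $2$), so the toy routine defers it and serves it at time $4$ in the second phase; exponentially growing phases guarantee every request is eventually served.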

Oracle-Based and Learning-Augmented Routing

The $\mathtt{SWAG}$ and $\mathtt{LA\text{-}SWAG}$ frameworks (Bampis et al., 2023) generalize MIMIC by introducing an oracle that, at each decision time, provides a compact set of dominating permutations (routes) for candidate request sets. These oracles are tractable on special metric spaces, e.g., rings, trees, and flowers (for bounded parameters). Learning-augmented variants route the server through the predicted locations, waiting at each predicted point for the actual request to be revealed, and fall back flexibly to optimal cleanup tours if circumstances change.

Robustification under Imperfect Predictions

To ensure robustness, especially when location predictions are error-prone, algorithms adapt by appropriately inflating the service regions (e.g., enlarged round-trip radii on the line) or shifting predicted locations pessimistically (Guragain et al., 20 Jan 2026). For sufficiently high error, the algorithm degrades gracefully to the performance of the best non-augmented method.
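On the line, the pessimistic inflation step has a simple closed form. If each true location is guaranteed to lie within $\delta$ of its prediction, a closed sweep covering every inflated interval is certain to pass through every true request location (an illustrative helper, not the full algorithm of the cited paper):

```python
def robust_sweep_length(predictions, delta):
    """Length of a shortest closed tour from the origin 0 that covers
    every inflated interval [p - delta, p + delta] on the real line.
    Visiting the two extreme endpoints in either order and returning
    costs 2 * (hi - lo)."""
    lo = min(min(p - delta for p in predictions), 0.0)
    hi = max(max(p + delta for p in predictions), 0.0)
    return 2 * (hi - lo)
```

For predictions at $1$ and $-2$ with $\delta = 0.5$, the sweep must cover $[-2.5, 1.5]$, giving a round-trip length of $8$; with $\delta = 0$ it degenerates to the usual closed tour through the predicted points.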

5. Network Restoration and Partially Observed Variants

The TRP framework has been adapted to account for real-world uncertainty, as in infrastructure network restoration contexts (Biswas et al., 8 May 2025). Here, the problem incorporates incomplete information about node fault status and dynamic information revelation:

  • The underlying infrastructure is modeled as a tree-structured network with dependent failure propagation.
  • The crew operates in a complete road network, and at each decision epoch, uncertainty about node status is resolved only upon visitation.
  • The sequential routing problem is modeled as a finite-horizon Markov decision process with a high-dimensional, partially observed state space.

To address computational intractability, a combination of approximate dynamic programming (value function approximation with reinforcement learning), structural pruning based on dominance and k-optimality, and multi-level state aggregation is employed. Computational experiments demonstrate near-optimal performance and substantial speedup relative to myopic heuristics.

6. Summary of Theoretical and Algorithmic Results

| Setting | Deterministic CR | Randomized CR | Learning-Augmented Consistency | Robustness (max CR) |
|---|---|---|---|---|
| Arbitrary metric (Bienkowski et al., 2021) | $4$ | $2.821$ | $3/2$ (Bampis et al., 2023) | $2.75$–$2.833$ (Bampis et al., 2023) |
| Line (Guragain et al., 20 Jan 2026) | $\geq 3$, $\leq 4$ | $2.821$ (general) | $3.732$ (perfect pred.) | $\min\{3.732+4\delta,\,4\}$ |
| Trees/rings/flowers (Bampis et al., 2023) | — | — | $3/2$ | $2.5$–$2.667$ |

A plausible implication is that, with high-quality predictions, learning-augmented OTRP algorithms strictly surpass all known deterministic online lower bounds for the response time objective, while maintaining robustness guarantees even under adversarial prediction failures. The structural properties of specific metrics (trees, rings, "flowers") facilitate tractable or even FPT algorithms exploiting permutation dominance.

7. Open Problems and Future Directions

Key challenges remain in the OTRP domain:

  • Pushing learning-augmented competitive ratios below the classical bound of $4$ for general metrics under limited prediction error (Guragain et al., 20 Jan 2026).
  • Constructing robust algorithms that do not require advance knowledge of the prediction error parameter δ\delta.
  • Evaluating the effectiveness of randomized or adversarially resilient learning-augmented algorithms.
  • Establishing tight lower bounds for deterministic and randomized learning-augmented OTRP beyond the line.
  • Applying these frameworks to large-scale, real-world, and stochastic infrastructure networks, accounting for uncertainty in both travel and system structure (Biswas et al., 8 May 2025).

These directions highlight connections between online routing, scheduling, reinforcement learning, and network optimization, suggesting a wide applicability for OTRP methods across theoretical and applied domains.
