Interpretation of Regret Tiebreaker’s Subtle Throughput Gains in Lifelong MAPF

Determine whether one-step optimization of the regret-based tiebreaker in Priority Inheritance with Backtracking (PIBT) correlates with throughput improvement in lifelong multi-agent pathfinding (MAPF), and explain why the regret-based tiebreaker yields subtler throughput gains than the hindrance-based tiebreaker in the reported experiments.

Background

The paper proposes two lightweight tiebreaking terms for PIBT—hindrance and regret—to improve solution quality and scalability in large-scale MAPF. Hindrance measures whether an action would hinder a neighboring agent's progress at the next timestep, while regret is estimated over multiple PIBT runs, learning how much regret (distance suboptimality) an action induces in other agents.
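The role such terms play in PIBT's preference construction can be sketched as follows. This is an illustrative simplification, not the paper's implementation: the function names, the candidate-sorting scheme, and the simplified hindrance count (number of neighbors whose preferred next vertex an action would occupy) are assumptions for exposition.

```python
from typing import Dict, List, Tuple

Vertex = Tuple[int, int]  # grid cell (row, col)

def hindrance(v: Vertex, neighbors_next: List[Vertex]) -> int:
    # Simplified stand-in for the hindrance term: count how many neighboring
    # agents would be blocked from their preferred next vertex if this agent
    # moved to v.
    return sum(1 for w in neighbors_next if w == v)

def order_actions(candidates: List[Vertex],
                  dist: Dict[Vertex, int],
                  neighbors_next: List[Vertex]) -> List[Vertex]:
    # PIBT-style preference order: the primary key is distance-to-goal, and a
    # lightweight tiebreaking term (here, hindrance) orders actions whose
    # distances are equal. A learned regret estimate would slot into the same
    # secondary-key position.
    return sorted(candidates, key=lambda v: (dist[v], hindrance(v, neighbors_next)))
```

For example, two candidate moves at equal distance from the goal are ordered so that the one hindering fewer neighbors is tried first; the tiebreaker never overrides the primary distance ordering.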

In lifelong MAPF experiments, the authors report that hindrance consistently improves throughput significantly, whereas regret shows steadier but subtler improvements. They explicitly state they lack a solid interpretation of this discrepancy and speculate that one-step regret optimization may not strongly correlate with throughput, motivating a focused investigation of the relationship between regret learning and throughput outcomes.
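For reference, throughput in lifelong MAPF is conventionally the number of goals (tasks) completed per timestep over the simulation horizon. A minimal sketch of this metric, with illustrative function and parameter names not taken from the paper:

```python
from typing import List

def throughput(completion_timesteps: List[int], horizon: int) -> float:
    # Lifelong MAPF throughput: goals completed per timestep over the horizon.
    # completion_timesteps lists the timestep at which each goal was reached;
    # completions after the horizon do not count.
    completed = sum(1 for t in completion_timesteps if t <= horizon)
    return completed / horizon
```

Under this metric, a tiebreaker improves throughput only if its per-step preferences translate into more goal completions over a long horizon, which is precisely the link the authors question for one-step regret optimization.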

References

We do not have solid interpretations for this; perhaps a one-step optimisation of the regret values would not correlate strongly with the throughput improvement.

Lightweight and Effective Preference Construction in PIBT for Large-Scale Multi-Agent Pathfinding  (2505.12623 - Okumura et al., 19 May 2025) in Section “Lifelong MAPF” (Experiments)