Judgelight: Optimized Trajectories for MAPF
- Judgelight is a trajectory-level post-optimization framework for MAPF that removes redundant and oscillatory agent movements while upholding collision constraints.
- It leverages an ILP-based collapse operation to systematically prune closed subwalks, ensuring lower-cost and feasible schedules.
- Empirical evaluations reveal cost reductions of up to 40% with near real-time performance, making it ideal for warehouse automation and multi-robot coordination.
Judgelight is a trajectory-level post-optimization framework designed to improve the quality of Multi-Agent Path Finding (MAPF) schedules by removing redundant or oscillatory agent movements while maintaining all collision and feasibility constraints. It operates as a solver-agnostic layer: it takes any feasible MAPF solution as input and systematically collapses closed subwalks within agent trajectories, yielding lower-cost, higher-quality schedules that are well suited to deployment in domains such as warehouse automation and multi-robot coordination (Tang et al., 27 Jan 2026).
1. Multi-Agent Path Finding and Motivation
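MAPF seeks collision-free paths for $n$ agents on an undirected graph $G = (V, E)$, with each agent $i$ moving from its start $s_i$ to its goal $g_i$ over a discrete time horizon $T$. A feasible MAPF schedule is a matrix $\pi \in V^{n \times (T+1)}$, where $\pi_i(t)$ is the vertex occupied by agent $i$ at time $t$, in which the following conditions are satisfied:
- $\pi_i(0) = s_i$ and $\pi_i(T) = g_i$, with each agent remaining at its goal after arrival.
- Agent step transitions are either via edges or self-loops: $(\pi_i(t), \pi_i(t+1)) \in E$ or $\pi_i(t+1) = \pi_i(t)$.
- Absence of vertex collisions ($\pi_i(t) \neq \pi_j(t)$ for $i \neq j$) and edge swaps ($(\pi_i(t), \pi_i(t+1)) \neq (\pi_j(t+1), \pi_j(t))$ for $i \neq j$).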
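The three feasibility conditions can be checked mechanically. The following is a minimal sketch, assuming a schedule is stored as a list of per-agent vertex lists of equal length; the function name `is_feasible` and its argument layout are illustrative choices, not taken from the paper.

```python
from itertools import combinations


def is_feasible(schedule, edges, starts, goals):
    """Return True if schedule[i][t] (vertex of agent i at time t) is feasible.

    Hypothetical helper: the data layout is an assumption made for illustration.
    """
    if not schedule or not schedule[0]:
        return True
    n, steps = len(schedule), len(schedule[0])
    # Undirected graph: a frozenset accepts either orientation of an edge.
    adjacent = {frozenset(e) for e in edges}

    for i, path in enumerate(schedule):
        if len(path) != steps:
            return False
        # Condition 1: the path starts at the start and ends at the goal.
        if path[0] != starts[i] or path[-1] != goals[i]:
            return False
        # Condition 2: every step is a wait (self-loop) or follows an edge.
        for u, v in zip(path, path[1:]):
            if u != v and frozenset((u, v)) not in adjacent:
                return False

    for t in range(steps):
        # Condition 3a: no two agents occupy the same vertex at time t.
        if len({path[t] for path in schedule}) < n:
            return False
        # Condition 3b: no two agents swap positions between t and t + 1.
        if t + 1 < steps:
            for a, b in combinations(schedule, 2):
                if a[t] == b[t + 1] and b[t] == a[t + 1]:
                    return False
    return True


# Four-vertex ring 0-1-2-3-0 with two agents moving in the same direction.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
ok = is_feasible([[0, 1, 2], [2, 3, 0]], ring, starts=[0, 2], goals=[2, 0])
swap = is_feasible([[0, 1], [1, 0]], ring, starts=[0, 1], goals=[1, 0])
```

Here `ok` is feasible, while `swap` is rejected because the two agents exchange vertices across a single edge.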
Learning-based MAPF solvers provide scalable approximate solutions but often introduce unnecessary or oscillatory agent movements, particularly under high congestion, leading to increased energy, execution time, and mechanical wear. Judgelight targets these inefficiencies by post-processing solver output, collapsing sections of trajectories that represent redundant motion.
2. Closed Subwalks and the Collapse Operation
A subwalk $\pi_i(t_1), \dots, \pi_i(t_2)$ with $t_1 < t_2$ within agent $i$'s trajectory $\pi_i$ is closed if $\pi_i(t_1) = \pi_i(t_2)$. The collapse operation replaces all $\pi_i(t)$ for $t_1 \le t \le t_2$ with $\pi_i(t_1)$, effectively pruning excursions that leave and then return to the same vertex, provided this does not induce collisions. Given the MAPF move cost structure (unit cost per move, zero cost for waits/self-loops), a collapse lowers the trajectory cost by the number of moves inside the subwalk, and the schedule stays feasible as long as the chosen collapses are mutually compatible.
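As an illustration of the collapse operation and the move-cost model, here is a small sketch. The names `collapse` and `move_cost` are hypothetical, and the function deliberately ignores other agents, so collision checks would have to be done separately.

```python
def move_cost(schedule):
    """Total cost of a schedule: one unit per move, zero per wait."""
    return sum(
        1
        for path in schedule
        for u, v in zip(path, path[1:])
        if u != v
    )


def collapse(path, t1, t2):
    """Replace the closed subwalk path[t1..t2] by waits at path[t1].

    Single-agent view only: this does not check collisions with other agents.
    """
    if not 0 <= t1 < t2 < len(path):
        raise ValueError("need 0 <= t1 < t2 < len(path)")
    if path[t1] != path[t2]:
        raise ValueError("subwalk is not closed")
    return list(path[:t1]) + [path[t1]] * (t2 - t1 + 1) + list(path[t2 + 1:])


path = ["A", "B", "C", "B", "A", "D"]
shorter = collapse(path, 0, 4)   # ["A", "A", "A", "A", "A", "D"]
saved = move_cost([path]) - move_cost([shorter])  # 5 - 1 = 4 moves saved
```

The excursion A→B→C→B→A contains four moves, so collapsing it turns a cost-5 trajectory into a cost-1 trajectory of the same length.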
3. Problem Formalization and Computational Complexity
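The MAPF-Collapse problem is defined as follows: Given a feasible schedule $\pi$, select a set of disjoint collapse actions such that the modified schedule $\pi'$ remains feasible and its cost, the total number of moves over all agents and time steps,

$$\mathrm{cost}(\pi') = \sum_{i} \sum_{t} \mathbb{1}\left[\pi'_i(t) \neq \pi'_i(t+1)\right],$$

is minimized.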
MAPF-Collapse is proven NP-hard. The decision version, which asks whether a set of collapses exists that reaches cost at most a given bound $k$, is in NP and is established as NP-complete by reduction from the Independent Set problem. The reduction encodes vertices as agents whose possible long-range collapses encode vertex selection, and edges as agents structured to enforce mutual exclusion, precisely reflecting independent set constraints (Tang et al., 27 Jan 2026).
4. ILP Formulation and Algorithmic Pipeline
Judgelight solves the MAPF-Collapse problem exactly by constructing an integer linear program (ILP) over the candidate set of closed subwalks

$$\mathcal{C} = \{(i, t_1, t_2) : t_1 < t_2,\ \pi_i(t_1) = \pi_i(t_2)\}.$$
Each candidate collapse $c$ has weight $w_c$ equal to the number of moves converted to waits, and a binary decision variable $x_c \in \{0, 1\}$ indicating application. The objective is to maximize $\sum_c w_c x_c$, equivalently minimizing the final schedule cost. The ILP encodes:
- Mutual exclusion for overlapping collapses on the same agent: $x_c + x_{c'} \le 1$ for overlapping intervals.
- Cross-agent exclusion for collapses at the same node/time: $x_c + x_{c'} \le 1$ for intervals overlapping in time at the same vertex.
- Collision-avoidance dependencies ensuring that, if $c$ brings agent $i$ onto $v$ at time $t$ overlapping with agent $j$ at $v$, then $j$ must also leave $v$ via some collapse covering $t$.
- Feasibility constraints: Infeasible collapses (for which necessary dependencies cannot be enforced) are inactivated.
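To make the optimization problem concrete, the sketch below enumerates candidate collapses with their weights and then picks the best compatible subset by exhaustive search. This is a brute-force stand-in for the ILP, usable only on tiny instances, and not the paper's formulation: the overlap test, the helper names, and the choice to verify collisions by re-checking the collapsed schedule are all assumptions made for illustration.

```python
from itertools import combinations


def candidates(schedule):
    """Closed subwalks (i, t1, t2) with at least one move, paired with weight."""
    out = []
    for i, path in enumerate(schedule):
        for t1 in range(len(path)):
            for t2 in range(t1 + 1, len(path)):
                if path[t1] == path[t2]:
                    weight = sum(path[t] != path[t + 1] for t in range(t1, t2))
                    if weight > 0:
                        out.append(((i, t1, t2), weight))
    return out


def overlaps(c1, c2):
    """Same agent and the two time intervals share at least one step."""
    (i, a1, b1), (j, a2, b2) = c1, c2
    return i == j and a1 < b2 and a2 < b1


def apply_all(schedule, chosen):
    """Apply a set of non-overlapping collapses to a copy of the schedule."""
    new = [list(path) for path in schedule]
    for i, t1, t2 in chosen:
        for t in range(t1, t2 + 1):
            new[i][t] = schedule[i][t1]
    return new


def collision_free(schedule):
    """No vertex collisions and no edge swaps."""
    n, steps = len(schedule), len(schedule[0])
    for t in range(steps):
        if len({path[t] for path in schedule}) < n:
            return False
        if t + 1 < steps:
            for a, b in combinations(schedule, 2):
                if a[t] == b[t + 1] and b[t] == a[t + 1]:
                    return False
    return True


def best_collapse_set(schedule):
    """Exhaustive search over candidate subsets (exponential time)."""
    cands = candidates(schedule)
    best, best_gain = (), 0
    for k in range(1, len(cands) + 1):
        for subset in combinations(cands, k):
            keys = [c for c, _ in subset]
            if any(overlaps(a, b) for a, b in combinations(keys, 2)):
                continue
            if not collision_free(apply_all(schedule, keys)):
                continue
            gain = sum(w for _, w in subset)
            if gain > best_gain:
                best, best_gain = tuple(keys), gain
    return best, best_gain


# Agent 0 oscillates A-B-A while agent 1 passes through A at time 1.
schedule = [["A", "B", "A"], ["C", "A", "C"]]
chosen, gain = best_collapse_set(schedule)
```

In this example, collapsing agent 0 alone would place it on A at time 1, where agent 1 currently stands, so that collapse is only admissible together with agent 1's own collapse. The search therefore selects both, which mirrors the collision-avoidance dependency described in the list above.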
Preprocessing aggressively prunes collapse candidates: persistent constant segments are collapsed wholesale, and “ABA→AAA” filtering targets local oscillations, curbing the combinatorial growth of the candidate set. Index structures (per-agent time lists, segment trees) are built so that the constraint graph can be constructed efficiently (Tang et al., 27 Jan 2026).
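One plausible reading of the “ABA→AAA” filter is a greedy pass that replaces a one-step oscillation by waits whenever the vertex is free at the middle time step. The sketch below follows that reading; it is an assumption about the preprocessing, not the paper's exact procedure, and `aba_filter` is a hypothetical name.

```python
def aba_filter(schedule):
    """Greedily rewrite A,B,A as A,A,A when no other agent is at A in between.

    Only vertex collisions need checking: the rewritten agent waits on both
    steps, so it cannot take part in an edge swap.
    """
    new = [list(path) for path in schedule]
    changed = True
    while changed:  # each rewrite removes two moves, so the loop terminates
        changed = False
        for i, path in enumerate(new):
            for t in range(len(path) - 2):
                a, b = path[t], path[t + 1]
                if b != a and path[t + 2] == a:
                    blocked = any(
                        other[t + 1] == a
                        for j, other in enumerate(new)
                        if j != i
                    )
                    if not blocked:
                        path[t + 1] = a
                        changed = True
    return new


filtered = aba_filter([["A", "B", "A", "B", "A"], ["C", "C", "C", "C", "C"]])
# filtered[0] is now ["A", "A", "A", "A", "A"]
```

Repeating the pass until nothing changes lets one rewrite unblock another, for instance when a second agent's oscillation was the only thing occupying the vertex.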
5. Empirical Evaluation and Performance Characteristics
Judgelight was tested across 3,296 cases on the POGEMA MAPF benchmark with both search-based and learning-based solvers, including LaCAM, SCRIMP, DCC, RAILGUN, MAMBA, and Follower. The key observations are:
- Average sum-of-costs (SoC) is reduced by 20%–40% on learning-based solvers, with the largest savings on methods prone to oscillations (e.g., MAMBA, Follower).
- Even for search-based solvers (LaCAM), which approach optimality, Judgelight achieves 10%–15% reductions by eliminating residual path inefficiencies.
- ILP solve times are near real-time in typical scenarios: among test cases where every agent reaches its goal (individual success rate ISR = 1), Judgelight solves the ILP in under 1 s in more than 90% of instances. The vast majority of LaCAM, SCRIMP, RAILGUN, and Follower instances are below this threshold, with some degradation at extreme agent densities.
| Solver | Solved in under 1 s, all instances (ISR ≥ 0) | Solved in under 1 s, ISR = 1 only |
|---|---|---|
| LaCAM | 99.70% | 99.88% |
| SCRIMP | 84.44% | 98.44% |
| DCC | 52.31% | 88.94% |
| MAMBA | 26.70% | 98.26% |
| RAILGUN | 81.89% | 99.52% |
| Follower | 88.29% | 100.00% |
6. Deployment Implications and Limitations
By pruning redundant motion, Judgelight offers several concrete benefits for MAPF deployments:
- Reduced execution time and flowtime.
- Lower energy and battery consumption.
- Decreased mechanical wear due to the elimination of unnecessary reversals and dead-end excursions.
- Reduced unpredictability, thereby improving overall system safety.
The primary limitation is that the ILP grows with agent density and with the number of candidate collapses; preprocessing mitigates this in practice but does not remove the worst-case blow-up. The approach also assumes zero cost for agent waits; in real deployments, nonzero wait penalties or more complex cost models may be warranted.
7. Extensions and Future Directions
Judgelight’s general methodology—trajectory-level post-optimization via closed subwalk collapsing—is directly extensible to richer MAPF variants:
- Incorporation of TAPF (target assignment and path finding) and LMAPF (lifelong MAPF), where agent goals may be reassigned or vary over time.
- Support for alternative dynamics such as rotations or more expressive motion primitives.
- Approximate or online collapse heuristics targeting scalability with very large-scale fleets.
- Cost models accounting for both movement and waiting, potentially via joint optimization or integration within end-to-end learning-based pipelines.
A plausible implication is that solver-agnostic post-optimization of this form can become a standard step in MAPF pipelines, particularly for real-time learning-based methods whose raw outputs exhibit structured inefficiency (Tang et al., 27 Jan 2026).