Unit-Load Pre-Marshalling Problem
- UPMP is a complex optimization problem that involves reordering unit loads in grid-based warehouses to prevent high-priority items from being blocked.
- Solution methods such as A* search with structured heuristics and min-cost flow assignments are employed to address its NP-hard complexity.
- Empirical findings show that using multiple access directions and LLM-generated heuristics can significantly reduce relocation moves and improve operational efficiency.
The Unit-Load Pre-Marshalling Problem (UPMP) is a central optimization problem in block-stacking warehouse logistics, particularly in automated and high-density storage systems. It generalizes the container pre-marshalling problem (CPMP) from maritime terminals to warehouses containing unit loads such as pallets, cartons, or totes, often managed by autonomous mobile robots (AMRs). The UPMP focuses on efficiently reordering unit loads within a storage grid during off-peak periods so that, upon retrieval, no unit with a higher priority is blocked by a lower-priority unit, thereby minimizing retrieval delays and total reshuffling effort. The problem is combinatorially challenging, strongly NP-hard in its core forms, and encompasses variants involving multiple access directions, multi-tier and multi-bay layouts, as well as practical constraints such as move costs, weight limits, and multi-agent synchronization.
1. Formal Definition and Mathematical Structure
The UPMP is defined over one or more bays, each a 2D or 3D grid of storage slots. Let $S$ denote the number of columns (stacks), $T$ the fixed stack height (number of tiers), and $\mathcal{S} = \{1, \dots, S\}$ the set of stacks. Each unit load $u$ occupies a position $(s, t)$, i.e. stack $s \in \mathcal{S}$ and tier $t \in \{1, \dots, T\}$, with a priority $p(u)$; a smaller $p(u)$ indicates higher retrieval priority. The state of the system is represented as the slot-content matrix $B \in \mathbb{N}_0^{S \times T}$, where $B_{s,t}$ holds the priority of the unit at stack $s$ and tier $t$ ($B_{s,t} = 0$ for an empty slot).
A feasible move consists of relocating the topmost unit from a stack $s$ to the (non-full) top of another stack $s'$. Each transition has unit cost. The objective is to find a sequence of moves $\pi = (m_1, \dots, m_n)$ reaching a goal state—in which, for every stack $s$ and every pair of occupied tiers $t < t'$ (tier $1$ at the bottom), the priorities satisfy $B_{s,t} \ge B_{s,t'}$ (i.e., priorities non-decreasing from top to bottom)—while minimizing the total number of relocations $n = |\pi|$.

Multi-tier, multi-access, and multi-bay extensions index positions in three dimensions and introduce access-direction assignments for each stack. Variants further incorporate time/distance-dependent move costs and operational side constraints (Bömer et al., 27 Jan 2026, Pfrommer et al., 2024, Pfrommer et al., 2022, Bömer et al., 5 Mar 2025).
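The state model, move rule, and goal condition above can be sketched in a few lines of Python. This is an illustrative encoding (a bay as a list of stacks, each listing priorities bottom to top), not code from the cited works:

```python
# Minimal sketch of the single-bay, single-access UPMP state model.
# Each stack lists unit priorities from bottom to top; a smaller number
# means higher retrieval priority. All names are illustrative.

T = 3  # fixed stack height (number of tiers)

def is_sorted_stack(stack):
    """Sorted = priorities non-decreasing from top to bottom,
    i.e. non-increasing from bottom to top."""
    return all(stack[i] >= stack[i + 1] for i in range(len(stack) - 1))

def is_goal(bay):
    return all(is_sorted_stack(s) for s in bay)

def legal_moves(bay):
    """All feasible relocations: topmost unit of a non-empty stack to the
    top of any other non-full stack. Each move has unit cost."""
    return [(i, j)
            for i, src in enumerate(bay) if src
            for j, dst in enumerate(bay) if j != i and len(dst) < T]

def apply_move(bay, move):
    i, j = move
    new = [list(s) for s in bay]
    new[j].append(new[i].pop())
    return new

bay = [[1, 3], [2], []]        # stack 0 is blocked: priority-1 unit under a 3
print(is_goal(bay))            # → False
bay = apply_move(bay, (0, 2))  # relocate the blocking unit (3) aside
print(is_goal(bay))            # → True
```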
2. Structural Properties, Variants, and Complexity
The UPMP is strongly NP-hard, inheriting the combinatorial hardness from the CPMP even for restricted cases with fixed stack height, as established via reductions from scheduling on permutation graphs (Brink et al., 2014). Major variants include:
- Single-tier, single-access: The classical form, with each bay accessible from one direction.
- Multi-access: Each stack may be served from any of several aisles (N/S/E/W), increasing flexibility but greatly enlarging the space of possible configurations (Pfrommer et al., 2022, Pfrommer et al., 2024).
- Multi-tier and multi-bay: Stacking into multiple vertical tiers and handling interconnected bays with joint move scheduling (Pfrommer et al., 2024).
- Operational constraints: Weight limits, concurrent moves by multiple agents, travel cost minimization, and time windows.
Table: Key UPMP variants and their distinguishing features
| Variant | Dimensions & Access | Additional Constraints |
|---|---|---|
| Single-bay, 1-tier | 2D grid, one access side | Classic; minimal configuration |
| Multi-tier | 3D grid, one access | Vertical stacking |
| Multi-access | 2D/3D grid, per-stack directions | Access assignment, flow-integrated |
| Multi-bay | Several bays, any access | Inter-bay moves, routing |
| Operational ext. | Any | Cost/time, weight, multi-agents |
The state space grows exponentially in the number of units and stacks; even for fixed stack height $T$, the minimization problem remains NP-hard (Brink et al., 2014).
3. Exact Solution Methods and Bounds
A* Search with Structured Heuristics
Tree-search (A*, IDA*) constitutes the canonical exact approach, modeling each configuration as a node, with transitions corresponding to single-unit relocations. The evaluation of each node $n$ combines the path cost $g(n)$ and an admissible heuristic $h(n)$:

$$f(n) = g(n) + h(n).$$

Lower bounds exploit blocking counts, "bad→good" moves, and supply-demand arguments as in Bortfeldt & Forster (2012); these are refined for the UPMP by extending demand-surplus and minimum lane-clearing computations, yielding highly accurate root-node gaps—often within 1–8% of the optimum for single- and multi-access cases (Pfrommer et al., 2022, Bömer et al., 27 Jan 2026).
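To make the node evaluation concrete, the following is a minimal A* sketch for the single-access case using only the simple blocking-count lower bound (each unit stacked above a higher-priority unit must move at least once); the refined bounds cited above are considerably stronger. All names are illustrative:

```python
import heapq

T = 3  # fixed stack height

def blocking_lb(bay):
    """Admissible heuristic: every unit with a higher-priority (smaller
    number) unit somewhere below it must be relocated at least once."""
    return sum(1
               for stack in bay
               for i, p in enumerate(stack)
               if any(q < p for q in stack[:i]))

def astar(bay):
    start = tuple(tuple(s) for s in bay)
    open_heap = [(blocking_lb(start), 0, start)]  # entries: (f, g, state)
    best_g = {start: 0}
    while open_heap:
        f, g, state = heapq.heappop(open_heap)
        if g > best_g.get(state, g):
            continue  # stale heap entry
        if blocking_lb(state) == 0:
            return g  # no blockages left: every stack is sorted
        for i, src in enumerate(state):
            for j, dst in enumerate(state):
                if not src or j == i or len(dst) >= T:
                    continue
                succ = [list(s) for s in state]
                succ[j].append(succ[i].pop())
                succ = tuple(tuple(s) for s in succ)
                if g + 1 < best_g.get(succ, float("inf")):
                    best_g[succ] = g + 1
                    heapq.heappush(
                        open_heap, (g + 1 + blocking_lb(succ), g + 1, succ))
    return None

print(astar([[1, 3], [2], []]))  # → 1 (a single relocation suffices)
```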
Two-Stage Decomposition: Access Assignment + Sorting
For multi-access and multi-bay settings, the following sequence is standard (Pfrommer et al., 2022, Pfrommer et al., 2024):
- Access Direction Assignment: Stackwise assignment of access directions via a min-cost flow or assignment problem. The cost $c_{s,d}$ of each stack–direction pair encodes the minimum relocations needed for stack $s$ if accessed from direction $d$, plus penalties for "holes" (unused slots). This stage predicts the true number of moves to within 5% accuracy (Pfrommer et al., 2024).
- Sorting Sequence Generation: Given fixed directions, solve the induced "virtual lanes" by A* or, for small cases, by constraint programming (CP). State evaluation leverages a composite lower bound: stackwise and network-flow bounds.
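The first stage can be illustrated with a toy brute-force version; production implementations solve a min-cost flow, and the direction set, cost table, and per-aisle capacity below are hypothetical placeholders:

```python
from itertools import product

# Toy sketch of the access-direction assignment stage. cost[s][d] is a
# hypothetical estimate of the relocations needed to sort stack s when
# accessed from direction d; each aisle may serve at most `cap` stacks.
# Real implementations solve this as a min-cost flow, not by enumeration.

DIRS = ["N", "S"]  # illustrative two-aisle setting

def assign_directions(cost, cap):
    best, best_cost = None, float("inf")
    for choice in product(range(len(DIRS)), repeat=len(cost)):
        if any(choice.count(d) > cap for d in range(len(DIRS))):
            continue  # violates per-aisle capacity
        total = sum(cost[s][d] for s, d in enumerate(choice))
        if total < best_cost:
            best, best_cost = choice, total
    return [DIRS[d] for d in best], best_cost

cost = [[2, 5],   # stack 0: cheaper from the north aisle
        [4, 1],   # stack 1: cheaper from the south aisle
        [3, 3]]   # stack 2: indifferent
print(assign_directions(cost, cap=2))
```

With the capacity set to 2, the cheapest feasible assignment serves stacks 0 and 2 from the north aisle and stack 1 from the south, at total estimated cost 6.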
The table below summarizes algorithmic approaches:
| Method | Core Algorithm | Applicability (Size, Variant) |
|---|---|---|
| A* + refined heuristic | Tree search | Single-tier, moderate size |
| Min-cost flow + A* | Assignment + A* | Multi-access, multi-bay, stackwise |
| Branch-and-price | Column-gen + B&B | Small racks, CPMP adaptation |
| Constraint programming | CP/MIP | Tiny problem instances, rich constraints |
Integer and Set Partitioning Models
ILP/set-partitioning formulations enumerate feasible move patterns per stack, with binary variables $x_{s,p}$ selecting one pattern $p$ for each stack $s$. The master problem enforces global constraints, and column generation prices the minimum-cost move pattern for each stack via dynamic programming (Brink et al., 2014). Direct application to the UPMP is feasible for small bay sizes.
4. Heuristic and Metaheuristic Approaches
Classical Heuristics
Greedy, look-ahead, and multi-stage heuristics are widespread, particularly when instance size precludes exact methods. The four-stage deterministic multi-heuristic of Jovanović et al. (2014)—selecting the next container, choosing its destination stack, sequencing blockers, and optionally filling stacks—systematically combines a suite of candidate heuristics, generating a small pool of competitive solutions and achieving up to 10% fewer moves than randomized methods from the container literature (Jovanovic et al., 2014).
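A loose sketch of a greedy relocation rule in the spirit of such multi-stage heuristics is shown below; it illustrates the idea (pick a blocked stack, relocate its top to a destination that avoids creating a new blockage) and is not the exact four-stage procedure of Jovanović et al. (2014):

```python
# Stacks list priorities bottom to top; smaller numbers retrieve first.
T = 3  # fixed stack height

def first_unsorted(bay):
    """Index of a stack whose priorities are not non-increasing bottom-to-top."""
    for i, s in enumerate(bay):
        if any(s[k] < s[k + 1] for k in range(len(s) - 1)):
            return i
    return None

def greedy_sort(bay, max_moves=100):
    bay = [list(s) for s in bay]
    moves = []
    for _ in range(max_moves):
        i = first_unsorted(bay)
        if i is None:
            break  # every stack is sorted: goal reached
        p = bay[i][-1]
        # prefer destinations whose top is >= p, so the relocated unit
        # does not create a new blockage
        cands = [j for j, s in enumerate(bay)
                 if j != i and len(s) < T and (not s or s[-1] >= p)]
        if not cands:  # fall back to any non-full stack
            cands = [j for j, s in enumerate(bay) if j != i and len(s) < T]
        j = max(cands, key=lambda k: bay[k][-1] if bay[k] else float("inf"))
        bay[j].append(bay[i].pop())
        moves.append((i, j))
    return bay, moves

print(greedy_sort([[1, 3], [2], []])[1])  # → [(0, 2)]
```

Unlike exact search, such a rule gives no optimality guarantee (hence the `max_moves` cap), but it scales to instances where A* is infeasible.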
Automated Heuristic Synthesis via LLMs
Recent advances employ LLMs for automated heuristic discovery within evolutionary frameworks. The Contextual Evolution of Heuristics (CEoH) (Bömer et al., 5 Mar 2025) and Algorithmic-Contextual EoH (A-CEoH) (Bömer et al., 27 Jan 2026) frameworks prompt LLMs with detailed UPMP context (data formats, problem constraints, sample code, and the actual driver algorithm) to generate new guiding heuristics, refining them over multiple generations via evolutionary operators:
- Exploration (E1/E2): Synthesize novel or conceptually varied heuristics.
- Modification (M1/M2): Revise or tune numeric parameters of existing heuristics.
- Initialization: Seed with diverse baseline strategies.
Combining explicit algorithmic and problem context in prompts (PA-CEoH) produces heuristics that match or surpass manually crafted methods on single-tier, single-access UPMP, achieving fitness gaps (average move penalty over lower bound) as low as 8%, with rapid convergence (Bömer et al., 27 Jan 2026). Mid-sized LLMs (Qwen2.5-Coder:32B) especially benefit from explicit A* code context, while top-tier models (GPT-4o) deliver robust results even with minimal context (Bömer et al., 5 Mar 2025).
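The evolutionary loop can be sketched as follows, with one deliberate simplification: the LLM is replaced by random numeric mutation of a parameterized scoring heuristic (an M1/M2-style operator), whereas the real frameworks prompt an LLM with problem and algorithm context to write new heuristic code. The driver, weights, and training instances are all hypothetical:

```python
import random

random.seed(0)
T = 3  # fixed stack height

def greedy_moves(bay, weights, cap=50):
    """Hypothetical driver: relocate tops of unsorted stacks to the
    destination maximizing a weighted score; returns moves used (at most cap)."""
    bay = [list(s) for s in bay]
    for n in range(cap):
        bad = [i for i, s in enumerate(bay)
               if any(s[k] < s[k + 1] for k in range(len(s) - 1))]
        if not bad:
            return n
        i = bad[0]
        p = bay[i][-1]
        def score(j):
            s = bay[j]
            fits = 1.0 if (not s or s[-1] >= p) else 0.0  # no new blockage
            return weights[0] * fits - weights[1] * len(s)
        j = max((j for j in range(len(bay)) if j != i and len(bay[j]) < T),
                key=score)
        bay[j].append(bay[i].pop())
    return cap

train = [[[1, 3], [2], []], [[2, 3, 1], [], []]]

def fitness(w):  # total moves over training instances; lower is better
    return sum(greedy_moves(b, w) for b in train)

pop = [[random.uniform(0, 2), random.uniform(0, 1)] for _ in range(6)]
for gen in range(10):
    pop.sort(key=fitness)
    parent = pop[0]                                       # select elite
    child = [max(0.0, g + random.gauss(0, 0.3)) for g in parent]  # mutate
    pop[-1] = child                                       # replace worst
best = min(pop, key=fitness)
print(fitness(best))  # → 3 (one move + two moves on the two instances)
```

In CEoH/A-CEoH the mutation step is instead an LLM prompt (E1/E2 for exploration, M1/M2 for modification) that returns new heuristic source code, and fitness is measured against a lower bound rather than raw move counts.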
5. Empirical Insights, Computational Results, and Scalability
Experimental studies across multiple works demonstrate:
- Impact of Multiple Access Directions: Adding access sides reduces average moves by more than half as bays go from one to two to four access directions; mean runtimes fall from 71 s (single-access, T=1) to 7.7 s (four-access) on 5×5 bays. The root-node heuristic gap contracts from 7.9% to nearly 0%, making A* efficient for larger instances (Pfrommer et al., 2022, Pfrommer et al., 2024).
- Multi-bay Systems: For multi-bay layouts (up to three interconnected bays), the two-stage approach solves nearly all benchmark cases to optimality; constraint programming is only practical for problems under 30 stacks (Pfrommer et al., 2024).
- LLM-generated heuristics: On 5×5 instances at 60% fill, the best CEoH heuristic (Qwen2.5-Coder:32B) achieved a fitness gap of 12.5%, compared to 8.15% for optimal A* search; scalability studies confirm robust performance on larger layouts, even as A* becomes computationally infeasible (Bömer et al., 5 Mar 2025).
- Evolutionary convergence: In heuristic evolution, PA-CEoH achieves rapid fitness improvement, converging within 20 generations and outperforming prompts carrying only problem context or only algorithmic context (Bömer et al., 27 Jan 2026).
6. Practical Recommendations and Industrial Implications
Empirical findings support several operational directives:
- For high-density, block-stacking warehouses: Design for multiple access directions wherever possible to drastically cut required relocation effort and retrieval latency (Pfrommer et al., 2022, Pfrommer et al., 2024).
- For heuristic design: Embed both algorithmic context (e.g., driver code) and explicit problem state representations; utilize evolutionary prompt frameworks with clear fitness metrics and systematic validation on both in- and out-of-sample scenarios (Bömer et al., 5 Mar 2025, Bömer et al., 27 Jan 2026).
- For large-scale or multi-bay settings: Apply min-cost flow assignment for access direction, followed by A* search with tight, network-flow-based lower bounds. Reserve constraint programming for highly constrained or very small instances (Pfrommer et al., 2024).
- For automated heuristic generation: Supply well-structured, explicit descriptors in prompts, use a mixture of exploratory and exploitative evolutionary strategies, and benchmark against optimal or lower-bound methods to ensure robustness (Bömer et al., 5 Mar 2025, Bömer et al., 27 Jan 2026).
A plausible implication is that advanced heuristic-synthesis frameworks, especially those leveraging LLMs with algorithmic prompt-augmentation, can reduce the reliance on hand-engineered methods and offer scalable, near-optimal solutions to new UPMP variants and related logistics optimization challenges.
7. Research Directions and Ongoing Developments
Key open avenues and ongoing trends include:
- Extension to dynamic, online, or rolling-horizon pre-marshalling in settings with streaming arrivals and departures, dynamic arrivals, or multiple interacting agents.
- Integration of side constraints such as weight, stability, AMR collocation, and time/distance costs in objective functions and heuristic scoring.
- Automated heuristic transfer and meta-learning: Investigating how evolved heuristics for UPMP generalize or can be transferred to related problems such as CPMP, large-neighborhood search, or multi-stage retrieval (Bömer et al., 27 Jan 2026).
- Higher-tier, irregular or heterogeneous stacking systems, where state spaces grow even more rapidly and tree-search becomes infeasible for most practical instances.
- Theory of LLM-guided heuristic quality: Understanding theoretical limitations and formal performance guarantees for LLM-generated heuristics as a function of prompt context and evolutionary framework (Bömer et al., 5 Mar 2025).
These developments highlight the centrality of the UPMP in research at the interface of combinatorial optimization, intralogistics, and automated algorithm design.