Dominator-Based MILP Simplification Framework
- The paper presents a dominator-tree strategy that fixes variables and reduces constraints in MILP formulations for flow decomposition, achieving significant computational speed-ups.
- The approach identifies safe sequences via dominator trees, collapsing maximal univocal paths along the way, and generalizes earlier DAG-only techniques to graphs with cycles.
- The method integrates linear-time preprocessing with MILP formulation, streamlining constraints and variables and yielding dramatic runtime improvements on biological datasets.
The dominator-based MILP simplification framework addresses flow decomposition problems on general (possibly cyclic) directed graphs via a graph-theoretic technique built on dominator trees. It enables fast and flexible Mixed Integer Linear Programming (MILP) formulations for decomposing flows into walks or paths. Central to the approach is the identification of "safe sequences" of edges, that is, sequences that must appear as a subsequence of some walk in any walk cover. Exploiting them allows substantial MILP simplification through variable fixing and constraint reduction. The methodology generalizes previous work limited to directed acyclic graphs (DAGs) and is validated by computational speed-ups on biological datasets (Sena et al., 24 Nov 2025).
1. Graph-Theoretic Foundations and Dominator Trees
Let $G = (V, E)$ denote a directed graph (possibly with cycles) with distinguished source $s$ and sink $t$, referred to as an s–t graph. The notion of domination is defined as follows:
- $u$ s-dominates $v$ if every $s$–$v$ walk passes through $u$.
- $u$ t-dominates $v$ if every $v$–$t$ walk passes through $u$.
The immediate s-dominator of $v$, written $\mathrm{idom}_s(v)$, is the unique strict s-dominator of $v$ that is minimal under the domination ordering (the strict dominator closest to $v$). The s-dominator tree $T_s$ organizes $V$ such that the parent of each $v \neq s$ is $\mathrm{idom}_s(v)$, and it is rooted at $s$. Analogously, the t-dominator tree $T_t$ is defined with the roles reversed and is rooted at $t$.
These dominator trees capture the structural constraints on walks from $s$ to $t$ and serve as the basis for identifying safe sequences and exploiting them in flow decomposition MILPs.
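The definitions above can be made concrete with a small sketch. It uses the textbook iterative set-intersection method rather than a near-linear algorithm, so it is meant only to illustrate what the two dominator trees contain. The function names and the example graph are hypothetical and not taken from the source.

```python
from collections import defaultdict

def dominator_sets(edges, root):
    """Return {v: set of vertices dominating v} for vertices reachable from root."""
    succ, pred = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
    # Restrict attention to vertices reachable from the root.
    reach, stack = {root}, [root]
    while stack:
        u = stack.pop()
        for v in succ[u]:
            if v not in reach:
                reach.add(v)
                stack.append(v)
    # Start from "everything dominates everything" and shrink to a fixpoint.
    dom = {v: set(reach) for v in reach}
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for v in reach - {root}:
            preds = [dom[p] for p in pred[v] if p in reach]
            new = set.intersection(*preds) | {v}
            if new != dom[v]:
                dom[v] = new
                changed = True
    return dom

def immediate_dominators(dom):
    """The strict dominators of v form a chain; the closest one has the largest set."""
    return {v: max(d - {v}, key=lambda u: len(dom[u]))
            for v, d in dom.items() if len(d) > 1}

# Hypothetical cyclic s-t graph: the edge d -> a closes a cycle.
EDGES = [("s", "a"), ("a", "b"), ("a", "c"), ("b", "d"),
         ("c", "d"), ("d", "a"), ("d", "t")]
idom_s = immediate_dominators(dominator_sets(EDGES, "s"))
# t-domination is s-domination on the edge-reversed graph, rooted at t.
REVERSED = [(v, u) for u, v in EDGES]
idom_t = immediate_dominators(dominator_sets(REVERSED, "t"))
print(idom_s)
print(idom_t)
```

Each dictionary maps a vertex to its parent in the corresponding dominator tree. On this graph, every walk from `s` to `t` must pass through `a` and later through `d`, which is reflected in both trees despite the cycle.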
2. Safe Sequences: Characterization and Structural Theorems
Given a collection $C \subseteq E$ of edges to be covered, an s–t walk cover is a set of s–t walks such that every edge of $C$ is present in at least one walk.
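A sequence of edges $X$ is $C$-safe if, in every s–t walk cover of $C$, at least one walk contains $X$ as a subsequence. The connection to dominator trees is formalized by defining, for each $e \in C$, its extension $\mathrm{ext}(e)$: the sequence obtained by concatenating the s-dominators of $e$ (in order from $s$), $e$ itself, and the t-dominators of $e$ (in order towards $t$).
Theorem: A sequence of edges $X$ is $C$-safe if and only if there exists $e \in C$ with $X$ a subsequence of $\mathrm{ext}(e)$. Every maximal safe sequence is exactly $\mathrm{ext}(e)$ for some $e \in C$, where $e$ is a common leaf of the (possibly collapsed) s-dominator and t-dominator trees.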
To address long chains of non-branching dominators, maximal "univocal" paths (paths shared by both dominator trees without branching) are collapsed into single super-vertices. In the collapsed trees, the set of maximal safe sequences corresponds exactly to the set of extensions of common leaves.
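The collapsing step can be illustrated with a deliberately simplified sketch. It compresses non-branching chains in a single rooted tree, whereas the univocal paths described above are defined with respect to both dominator trees, so this shows only the path-compression idea. The function name and the example tree are hypothetical.

```python
from collections import defaultdict

def collapse_chains(parent, root):
    """Merge every maximal chain of single-child nodes into one super-vertex.

    parent maps each non-root node to its parent. Returns
    (super_of, super_parent): the super-vertex (a tuple holding the stored
    path) of each node, and the parent relation between super-vertices.
    """
    children = defaultdict(list)
    for v, p in parent.items():
        children[p].append(v)
    super_of, super_parent = {}, {}
    stack = [(root, None)]
    while stack:
        head, above = stack.pop()
        chain = [head]
        # Extend the chain while its last node has exactly one child.
        while len(children[chain[-1]]) == 1:
            chain.append(children[chain[-1]][0])
        node = tuple(chain)  # the stored path sequence
        for v in chain:
            super_of[v] = node
        if above is not None:
            super_parent[node] = above
        # The chain ends at a leaf or at a branching node.
        for c in children[chain[-1]]:
            stack.append((c, node))
    return super_of, super_parent

# Hypothetical tree: s - a - b, then b branches into c and d, and d - e.
PARENT = {"a": "s", "b": "a", "c": "b", "d": "b", "e": "d"}
super_of, super_parent = collapse_chains(PARENT, "s")
print(super_parent)
```

Because each super-vertex keeps its path as a tuple, the original sequence can be recovered later by concatenating the tuples along a root-to-leaf path of the collapsed tree.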
The number of maximal safe sequences is and the total output size is .
3. Enumeration of Maximal Safe Sequences in Linear Time
All maximal safe sequences can be enumerated, each exactly once, in time linear in the size of the graph plus the total length of the output, given an s–t graph $G$ and a subset $C \subseteq E$.
Procedure:
- Compute the s-dominator tree (rooted at $s$) using the Lengauer–Tarjan algorithm.
- Compute the t-dominator tree (rooted at $t$) by running the same algorithm on the edge-reversed graph and then reversing the result back.
- For both and , mark all vertices present on for some covering an edge in .
- Collapse maximal univocal paths in each tree to single nodes, maintaining the path sequences.
- For each vertex $v$ that is simultaneously a leaf in both collapsed trees, reconstruct its extension by concatenating the stored sequences and add it to the output.
The process is dominated by the sum total length of all safe sequences, in addition to graph traversal and tree operations in . The overall complexity is (Sena et al., 24 Nov 2025).
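To show what the procedure produces for a single edge, here is a minimal sketch that is not the linear-time algorithm described above. It assumes edge domination can be modeled by subdividing every edge into its own node, uses a simple quadratic fixpoint computation in place of Lengauer–Tarjan, and skips the marking and collapsing steps. All function names and the example graph are hypothetical.

```python
from collections import defaultdict

def dominator_sets(edges, root):
    """Iterative set-intersection dominators (simple, not near-linear)."""
    succ, pred = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
    reach, stack = {root}, [root]
    while stack:
        u = stack.pop()
        for v in succ[u]:
            if v not in reach:
                reach.add(v)
                stack.append(v)
    dom = {v: set(reach) for v in reach}
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for v in reach - {root}:
            new = set.intersection(*[dom[p] for p in pred[v] if p in reach]) | {v}
            if new != dom[v]:
                dom[v], changed = new, True
    return dom

def subdivide(edges):
    """Replace each edge (u, v) by u -> (u, v) -> v, so edges become nodes."""
    out = []
    for u, v in edges:
        out += [(u, (u, v)), ((u, v), v)]
    return out

def extension(edges, s, t, e):
    """Edges that s-dominate e, then e, then edges that t-dominate e."""
    sub = subdivide(edges)
    dom_s = dominator_sets(sub, s)
    dom_t = dominator_sets([(v, u) for u, v in sub], t)
    is_edge = lambda x: isinstance(x, tuple) and x != e
    # Dominators form a chain; set sizes grow with distance from the root.
    before = sorted(filter(is_edge, dom_s[e]), key=lambda x: len(dom_s[x]))
    after = sorted(filter(is_edge, dom_t[e]), key=lambda x: len(dom_t[x]),
                   reverse=True)
    return before + [e] + after

# Hypothetical cyclic s-t graph with vertices named by strings.
EDGES = [("s", "a"), ("a", "b"), ("a", "c"), ("b", "d"),
         ("c", "d"), ("d", "a"), ("d", "t")]
ext_ab = extension(EDGES, "s", "t", ("a", "b"))
ext_da = extension(EDGES, "s", "t", ("d", "a"))
print(ext_ab)
print(ext_da)
```

For the edge `("a", "b")` the sketch yields the sequence `("s", "a")`, `("a", "b")`, `("b", "d")`, `("d", "t")`: any s–t walk that uses `("a", "b")` must enter through `("s", "a")`, continue along `("b", "d")`, and eventually leave through `("d", "t")`.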
4. Integration with MILP Flow Decomposition Models
The dominator-based approach simplifies and accelerates MILP models for flow decomposition into walks.
4.1 Standard (Unsimplified) MILP
For minimum-flow-decomposition (k-FD), the unsimplified MILP consists of:
- Variables: traversal counts (the number of times walk $i$ traverses edge $e$), reachability helper variables, depth labels, walk weights, and slack/error variables as appropriate.
- Constraints: flow conservation, tree selection, vertex in-degree (via the helper variables), acyclicity/depth, and bilinear flow matching.
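As a rough illustration of the first and last constraint families in the list above, the following display writes flow conservation and bilinear flow matching in one common form. The symbols are assumptions chosen for this sketch and are not taken from the source: $x_{e,i}$ counts how often walk $i$ traverses edge $e$, $w_i$ is the weight of walk $i$, $f_e$ is the input flow on $e$, and $\delta^+(v)$ and $\delta^-(v)$ are the sets of out-edges and in-edges of $v$.

$$
\begin{aligned}
\sum_{e \in \delta^+(v)} x_{e,i} - \sum_{e \in \delta^-(v)} x_{e,i} &=
\begin{cases} 1 & \text{if } v = s, \\ -1 & \text{if } v = t, \\ 0 & \text{otherwise,} \end{cases}
&& \text{for all } v \in V,\ i = 1, \dots, k, \\
\sum_{i=1}^{k} w_i \, x_{e,i} &= f_e
&& \text{for all } e \in E.
\end{aligned}
$$

The product $w_i \, x_{e,i}$ multiplies two decision variables, so it has to be linearized before a MILP solver can handle it. Whenever a traversal count is fixed to a constant, the corresponding product becomes linear in $w_i$ and the linearization is no longer needed for that term.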
4.2 Safety-Based Preprocessing
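Given maximal safe sequences $S_1, \dots, S_p$ forming a maximum-weight antichain, and writing $x_{e,j}$ for the number of times walk $j$ traverses edge $e$:
- For $j = 1, \dots, p$, assign $S_j$ to walk $j$ without loss of generality.
- For every edge $e \in S_j$:
  - If $e$ lies between distinct strongly connected components (SCCs), set $x_{e,j} = 1$ (no repetitions).
  - Otherwise, bound $x_{e,j}$ from below by the multiplicity of $e$ in $S_j$.
- For every edge $e$ incompatible with $S_j$, enforce $x_{e,j} = 0$ except when reachability conditions (specified in three cases) are met; these checks require only two BFS/DFS traversals.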
The antichain is computed on the condensation DAG by reduction to a max-flow problem. As a result, many traversal-count variables are fixed and others are set to zero, which shrinks the MILP feasible region and simplifies the constraint structure.
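The fixing rules for edges on an assigned safe sequence can be sketched as follows. This is a minimal illustration under stated assumptions: the safe sequences and the SCC labelling are given as inputs, an edge inside an SCC receives a lower bound equal to its multiplicity in the sequence, and the zero-fixing of incompatible edges is omitted because its reachability conditions are not spelled out here. The function name, data layout, and example inputs are hypothetical.

```python
from collections import Counter

def fix_variables(sequences, scc_of):
    """Derive fixings for traversal counts from assigned safe sequences.

    sequences[j] is the safe sequence (a list of edges) assigned to walk j.
    scc_of maps each vertex to the id of its strongly connected component.
    Returns (fixed, lower): exact values and lower bounds keyed by (edge, j).
    """
    fixed, lower = {}, {}
    for j, seq in enumerate(sequences):
        for (u, v), count in Counter(seq).items():
            if scc_of[u] != scc_of[v]:
                # An edge between distinct SCCs cannot be repeated by a walk.
                fixed[(u, v), j] = 1
            else:
                # Inside an SCC the walk may loop, so only a lower bound holds.
                lower[(u, v), j] = count
    return fixed, lower

# Hypothetical inputs: a and c form a cycle; s, b, t are singleton SCCs.
SCC_OF = {"s": 0, "a": 1, "c": 1, "b": 2, "t": 3}
SEQUENCES = [
    [("s", "a"), ("a", "c"), ("c", "a"), ("a", "t")],  # assigned to walk 0
    [("s", "b"), ("b", "t")],                          # assigned to walk 1
]
fixed, lower = fix_variables(SEQUENCES, SCC_OF)
print(fixed)
print(lower)
```

In a solver model, the entries of `fixed` would become equality constraints (or constants substituted into the model) and the entries of `lower` would become variable lower bounds.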
4.3 Reduced MILP and Complexity
After safety-based preprocessing:
- Many variables and constraints (including most bilinear products and reachability/depth constraints) are eliminated for the walks that were assigned a safe sequence.
- The simplified MILP for k-FD includes only flow-conservation, forced traversals, and edge elimination based on safe sequences for corresponding walks, plus residual variables for the remaining unfixed walks.
- Preprocessing, including safe sequences and assignment, is , typically negligible compared to MILP solution times.
5. Empirical Speed-Ups and Practical Performance
Using four bacterial assembly datasets and three flow-decomposition objectives (Minimum Flow Decomposition, Least Absolute Errors, Minimum Path Error), dominator-based preprocessing yields dramatic computational improvements:
- Up to faster on Minimum Flow Decomposition,
- Up to faster on Least Absolute Errors,
- Up to faster on Minimum Path Error.
Instances previously timing out at $300$ seconds are solved in under $30$ seconds.
This demonstrates that dominator-tree safe-sequence fixing is an effective linear-time preprocessing technique. It drastically reduces MILP search spaces and avoids bilinear encodings, all while maintaining solution exactness on general (cyclic) graphs (Sena et al., 24 Nov 2025).
6. Applications and Broader Implications
The dominator-based MILP simplification framework provides algorithmic tools for multi-assembly problems and general flow decomposition tasks in graph analysis. Its support for cycles, provable model-size reductions, and empirical acceleration make it a candidate building block for future multi-assembly applications. This suggests that dominator-driven simplifications could extend to other combinatorial optimization problems involving path or walk covers in complex networks, and possibly to other MILP-based graph inference problems.