Bidirectional Search & Pullback-Based Verification
- The paper introduces SQ-BCP, integrating bidirectional search with pullback-based verification to ensure global goal compatibility and significantly reduce resource-violation rates.
- It presents a precise category-theoretic formalism that resolves unknown preconditions via deterministic refinement and self-querying, ensuring all hard constraints are met.
- Empirical results show SQ-BCP cuts resource-violation rates by roughly half relative to the strongest prior baseline while maintaining competitive linguistic metrics on the WikiHow and RecipeNLG datasets.
Bidirectional search and pullback-based verification constitute the core of Self-Querying Bidirectional Categorical Planning (SQ-BCP), designed for high-fidelity inference-time planning under partial observability with LLMs. SQ-BCP explicitly models and resolves underspecified preconditions via self-querying and bridging hypotheses, executes bidirectional (forward–backward) search through a task category, and globally verifies goal compatibility using categorical pullbacks. This method achieves significant reductions in resource-violation rates compared to prior baselines, ensuring generated plans are both executable and goal-compatible when hard constraints and global compatibility are satisfied (Qu, 27 Jan 2026).
1. Bidirectional Search Architecture
SQ-BCP maintains two simultaneously expanding frontiers:
- a forward frontier, searching from the known initial state
- a backward frontier, searching from the goal specification
Each frontier node is a “state object” $w$ with four components:
- a resource vector,
- a structural/categorical description,
- logical predicates,
- a time budget.
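Edges correspond to fully refined hypothesis-morphisms, that is, morphisms whose unknown preconditions have all been resolved. Candidate morphisms $h$ are scored using a weighted-sum distance metric $D$ between the state reached by applying $h$ and the target state $w^*$, and candidates scoring below a threshold are pruned using a softmax-style measure with temperature $\tau$: $\mathit{score}(h) = \exp\left(-D(\mathrm{Apply}(h, w), w^*)/\tau\right)$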
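The scoring rule above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `State` fields, the per-component gap measures, the weights, and the threshold value are all assumptions introduced here. Only the form $\exp(-D/\tau)$ comes from the text.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Hypothetical state object with the four components named in the text."""
    resources: tuple        # resource vector
    structure: frozenset    # structural/categorical description
    predicates: frozenset   # logical predicates
    time_budget: float      # time budget


def distance(w, goal, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of per-component gaps (the gap measures are assumptions)."""
    a_r, a_s, a_l, a_t = weights
    d_r = sum(abs(x - y) for x, y in zip(w.resources, goal.resources))
    d_s = len(w.structure ^ goal.structure)      # symmetric difference size
    d_l = len(w.predicates ^ goal.predicates)
    d_t = abs(w.time_budget - goal.time_budget)
    return a_r * d_r + a_s * d_s + a_l * d_l + a_t * d_t


def score(next_state, goal, tau=1.0):
    """score = exp(-D(next_state, goal) / tau); equals 1.0 when D is zero."""
    return math.exp(-distance(next_state, goal) / tau)


def prune(candidates, goal, threshold=0.1, tau=1.0):
    """Keep candidate names whose resulting state scores >= threshold, best first.

    `candidates` is a list of (name, resulting_state) pairs.
    """
    scored = [(score(state, goal, tau), name) for name, state in candidates]
    return [name for sc, name in sorted(scored, reverse=True) if sc >= threshold]
```

A larger $\tau$ flattens the scores and lets more distant candidates survive pruning; a smaller $\tau$ concentrates the search on candidates already close to the target.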
Meeting-in-the-middle (MIM) occurs when a forward state and a backward state satisfy the meeting condition, which triggers pullback-based verification.
Algorithmic steps are formally specified in Algorithm 1 in (Qu, 27 Jan 2026), emphasizing strict insertion of only feasible, fully-checked morphisms and prioritized expansion, with fairness and termination established under bounded branching and refinement depth.
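To make the two-frontier idea concrete, here is a generic bidirectional best-first search. It is a sketch under stated assumptions rather than a rendering of Algorithm 1: the functions `successors`, `predecessors`, and `heuristic` are hypothetical callbacks, the meeting test is plain state equality, and the feasibility checks and pullback verification described in the text are omitted.

```python
import heapq


def _join(meet, parents_f, parents_b):
    """Stitch the forward path (start..meet) to the backward path (meet..goal)."""
    path, s = [], meet
    while s is not None:
        path.append(s)
        s = parents_f[s]
    path.reverse()
    s = parents_b[meet]
    while s is not None:
        path.append(s)
        s = parents_b[s]
    return path


def bidirectional_search(start, goal, successors, predecessors, heuristic,
                         max_expansions=10_000):
    """Alternate forward and backward expansions until the frontiers meet.

    Returns a start-to-goal path, or None if no meeting point is found
    within `max_expansions` (a bound assumed here to force termination).
    """
    if start == goal:
        return [start]
    parents = ({start: None}, {goal: None})       # (forward, backward)
    targets = (goal, start)
    expand = (successors, predecessors)
    frontiers = ([(heuristic(start, goal), 0, start)],
                 [(heuristic(goal, start), 0, goal)])
    tie = 1                                       # tie-breaker for the heap
    for step in range(max_expansions):
        side = step % 2                           # 0 = forward, 1 = backward
        if not frontiers[side]:
            if not frontiers[1 - side]:
                return None                       # both frontiers exhausted
            continue
        _, _, state = heapq.heappop(frontiers[side])
        for nxt in expand[side](state):
            if nxt in parents[side]:
                continue
            parents[side][nxt] = state
            if nxt in parents[1 - side]:          # frontiers meet at nxt
                return _join(nxt, parents[0], parents[1])
            heapq.heappush(frontiers[side],
                           (heuristic(nxt, targets[side]), tie, nxt))
            tie += 1
    return None
```

In a full system the meeting test would hand the candidate pair to a verification step instead of returning immediately, and only morphisms that pass their precondition and hard-constraint checks would be pushed onto a frontier.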
2. Category-Theoretic Planning Formalism
Planning is formulated in a category whose objects are compound state representations (resources, structure, logical predicates, and time budget). Morphisms implement typed effects, subject to deterministic hard constraints. Each morphism carries a set of preconditions, each labeled Sat, Viol, or Unk, and its unresolved set consists of the preconditions labeled Unk.
Morphisms are considered admissible for expansion only when all preconditions are labeled Sat and all hard constraints are met.
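The admissibility rule can be illustrated with a small data model. The class and function names below (`Status`, `Hypothesis`, `hard_constraints_ok`, `admissible`) are hypothetical, and the only hard constraint modeled is resource availability, which is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    SAT = "Sat"
    VIOL = "Viol"
    UNK = "Unk"


@dataclass
class Hypothesis:
    """Hypothetical hypothesis-morphism: a named step with labeled preconditions."""
    name: str
    preconditions: dict                               # precondition -> Status
    resource_cost: dict = field(default_factory=dict)  # resource -> amount needed

    def unresolved(self):
        """Preconditions still labeled Unk."""
        return {p for p, s in self.preconditions.items() if s is Status.UNK}


def hard_constraints_ok(h, available):
    """Assumed hard constraint: every required resource is available."""
    return all(available.get(r, 0) >= need for r, need in h.resource_cost.items())


def admissible(h, available):
    """Expandable only if every precondition is Sat and hard constraints hold."""
    all_sat = all(s is Status.SAT for s in h.preconditions.values())
    return all_sat and hard_constraints_ok(h, available)
```

A hypothesis with a non-empty `unresolved()` set is neither accepted nor rejected at this point; it has to go through refinement first.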
This categorical formalization enables categorical pullback constructions for global verification and introduces rigorous semantics for task composition and compatibility.
3. Pullback-Based Global Verification
Upon bidirectional frontier intersection, compatibility is established not by local constraints alone but by a categorical pullback in the task category. The constraint datum object $C$ captures goal requirements, with projection morphisms $C_f : w_f \to C$ and $C^* : w^* \to C$.
The pullback square $\begin{tikzcd} P \arrow[r,"\pi_1"] \arrow[d,"\pi_2"'] & w_f \arrow[d,"C_f"] \\ w^* \arrow[r,"C^*"'] & C \end{tikzcd}$ guarantees, by the universal property, a unique factorization through $P$ for any pair of maps into $w_f$ and $w^*$ that agree after composition with $C_f$ and $C^*$, certifying that the plan chain from initial to goal state is compatible with all hard constraints and goal data.
PullbackVerify returns success if a pullback exists with all induced maps satisfying the relevant constraints. The verification theorem establishes that, provided all preconditions along the chain are resolved to Sat, all hard intermediate checks pass, and PullbackVerify succeeds, the chain is categorically compatible with the goal state (Qu, 27 Jan 2026). Completeness and progress (termination with valid output) are guaranteed under bounded branching and finite refinement depth.
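For intuition, the pullback construction can be worked out in the category of finite sets, where it is simply the set of pairs that agree on the shared constraint data. This is a toy stand-in for the task category, not the paper's construction. In particular, treating a non-empty pullback as "success" in `pullback_verify` is an assumption made for this example.

```python
def pullback(A, B, f, g):
    """Pullback of f: A -> C and g: B -> C in finite sets.

    Returns the object P (pairs agreeing in C) and its two projections.
    """
    P = [(a, b) for a in A for b in B if f(a) == g(b)]
    pi1 = lambda p: p[0]
    pi2 = lambda p: p[1]
    return P, pi1, pi2


def mediating_map(Q, q1, q2, f, g):
    """The unique map Q -> P induced by a compatible pair (q1, q2).

    Compatibility means f(q1(x)) == g(q2(x)) for every x in Q.
    """
    if any(f(q1(x)) != g(q2(x)) for x in Q):
        raise ValueError("not a compatible pair: the square does not commute")
    return lambda x: (q1(x), q2(x))


def pullback_verify(A, B, f, g):
    """Toy check (assumption): succeed when some forward/goal pair agrees in C."""
    P, _, _ = pullback(A, B, f, g)
    return len(P) > 0
```

Here `A` would hold forward-reached states, `B` goal-side states, and `f`, `g` the projections of each onto the constraint data. The mediating map is forced to be `x -> (q1(x), q2(x))`, which is the uniqueness half of the universal property.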
4. Deterministic Unknown Precondition Resolution
For candidate morphisms with unresolved (Unk) preconditions, SQ-BCP employs Deterministic Refinement (Algorithm 2):
- For each unknown precondition, attempt up to a bounded number of “bridging” actions to establish it.
- If bridging fails, issue a self-query to an oracle/user.
- Unknowns so queried are labeled Sat or Viol; if any are Viol, the candidate morphism is discarded.
The process is guaranteed to terminate within a bounded number of steps. Cycle prevention is implemented by hashing signatures so that repeated configurations are detected.
This approach enforces explicit precondition status propagation, providing a strong guarantee that only fully specified, feasible hypothesis-morphisms are expanded.
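The refinement steps above can be sketched as a single loop. This is an illustration under assumptions, not Algorithm 2 itself: the function name `refine`, the shape of `bridges` (a mapping from precondition to named zero-argument checks), the `oracle` callback, and the `(precondition, bridge)` signature used for cycle prevention are all hypothetical.

```python
def refine(unknowns, bridges, oracle, max_bridge_attempts=2):
    """Resolve each unknown precondition to "Sat" or "Viol".

    Returns (keep, labels): `keep` is False if any precondition ends up "Viol",
    meaning the candidate morphism should be discarded.
    """
    labels = {}
    tried = set()  # signatures already attempted, so no bridge is retried
    for p in unknowns:
        resolved = False
        # Step 1: try a bounded number of bridging actions.
        for bridge_name, bridge in bridges.get(p, [])[:max_bridge_attempts]:
            signature = (p, bridge_name)
            if signature in tried:
                continue
            tried.add(signature)
            if bridge():
                labels[p] = "Sat"
                resolved = True
                break
        # Step 2: fall back to a self-query when bridging fails.
        if not resolved:
            labels[p] = "Sat" if oracle(p) else "Viol"
    keep = all(v == "Sat" for v in labels.values())
    return keep, labels
```

In this sketch each unknown costs at most `max_bridge_attempts` bridge calls plus one query, so the loop runs at most `len(unknowns) * (max_bridge_attempts + 1)` steps. That bound is a property of the sketch and is not taken from the source.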
5. Empirical Results and Comparative Performance
Empirical evaluation was conducted on the WikiHow and RecipeNLG task datasets with withheld preconditions, with Resource-Violation Rate (ResViol) as the principal metric. Reference similarity was measured using ROUGE-1 and ROUGE-2 for WikiHow and BLEU for RecipeNLG.
Key comparative outcomes are summarized as follows:
| Method | ROUGE-1 (WikiHow) | ROUGE-2 (WikiHow) | ResViol (WikiHow) | BLEU (RecipeNLG) | ResViol (RecipeNLG) |
|---|---|---|---|---|---|
| Direct Prompt | 46.3 | 42.1 | 78.3 % | 0.897 | 65.7 % |
| CoT | 48.5 | 44.7 | 83.2 % | 0.900 | 64.1 % |
| ToT | 52.9 | 45.2 | 94.7 % | 0.892 | 66.5 % |
| ReAct | 55.8 | 47.4 | 76.9 % | 0.912 | 59.9 % |
| Self-Ask | 56.1 | 47.4 | 26.0 % | 0.913 | 15.7 % |
| SQ-BCP | 52.7 | 45.9 | 14.9 % | 0.907 | 5.8 % |
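Structured precondition tracking and pullback-based verification in SQ-BCP cut resource violations relative to the best prior baseline, Self-Ask, from 26.0 % to 14.9 % on WikiHow and from 15.7 % to 5.8 % on RecipeNLG, at a modest cost to reference-similarity metrics (ROUGE-1 of 52.7 versus 56.1; BLEU of 0.907 versus 0.913). Under finite refinement depth and bounded branching, completeness and soundness are theoretically guaranteed.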
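As a quick arithmetic check on the table, the relative reduction in ResViol against Self-Ask can be computed directly from the reported rates. The helper name `relative_reduction` is introduced here for illustration.

```python
def relative_reduction(baseline, ours):
    """Percentage by which `ours` is lower than `baseline`."""
    return 100.0 * (baseline - ours) / baseline


# ResViol values taken from the table: Self-Ask versus SQ-BCP.
wikihow = relative_reduction(26.0, 14.9)
recipenlg = relative_reduction(15.7, 5.8)
print(f"WikiHow: {wikihow:.1f}% lower, RecipeNLG: {recipenlg:.1f}% lower")
```

The reduction is a little under half on WikiHow and closer to two thirds on RecipeNLG, so "halving" is best read as an approximate summary across the two datasets.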
6. Significance and Broader Context
SQ-BCP demonstrates that explicit Sat/Viol/Unk semantics, deterministic precondition refinement (bridging and self-querying), and pullback-based categorical verification collectively establish a powerful framework for inference-time planning where observability is partial or incomplete. The categorical abstraction enables modularity, provides interpretable compatibility certificates, and aligns symbolic reasoning quality with deep learning-based planning in partially observable domains.
A plausible implication is that categorical methods, global symbolic verification, and explicit precondition modeling will be increasingly important for robust, executable reasoning in LLMs and compositional AI systems facing incomplete specification scenarios (Qu, 27 Jan 2026).