GEAKG: Generative Executable Algorithm Knowledge Graphs

Published 30 Mar 2026 in cs.AI and cs.IR | (2603.27922v1)

Abstract: In the context of algorithms for problem solving, procedural knowledge -- the know-how of algorithm design and operator composition -- remains implicit in code, lost between runs, and must be re-engineered for each new domain. Knowledge graphs (KGs) have proven effective for organizing declarative knowledge, yet current KG paradigms provide limited support for representing procedural knowledge as executable, learnable graph structures. We introduce Generative Executable Algorithm Knowledge Graphs (GEAKG), a class of KGs whose nodes store executable operators, whose edges encode learned composition patterns, and whose traversal generates solutions. A GEAKG is generative (topology and operators are synthesized by an LLM), executable (every node is runnable code), and transferable (learned patterns generalize zero-shot across domains). The framework is domain-agnostic at the engine level: the same three-layer architecture and Ant Colony Optimization (ACO)-based learning engine can be instantiated across domains, parameterized by a pluggable ontology (RoleSchema). Two case studies -- sharing no domain-specific framework code -- provide concrete evidence for this framework hypothesis: (1) Neural Architecture Search across 70 cross-dataset transfer pairs on two tabular benchmarks, and (2) Combinatorial Optimization, where knowledge learned on the Traveling Salesman Problem transfers zero-shot to scheduling and assignment domains. Taken together, the results support that algorithmic expertise can be explicitly represented, learned, and transferred as executable knowledge graphs.

Authors (4)

Summary

The paper introduces GEAKG, a domain-agnostic procedural knowledge graph that learns executable operator sequences via LLM synthesis and ACO-based refinement.
The methodology achieves a 100% win rate over random operator composition across 70 NAS cross-dataset transfer pairs, and knowledge learned on the Traveling Salesman Problem transfers zero-shot to scheduling and assignment domains.
The framework produces a compact, interpretable artifact that is deployed online with no LLM calls, separating offline learning from online execution.

Generative Executable Algorithm Knowledge Graphs (GEAKG): An Expert Analysis

Introduction and Framework Overview

The "GEAKG: Generative Executable Algorithm Knowledge Graphs" (2603.27922) paper introduces a procedural knowledge representation paradigm in which executable operator nodes are organized within typed, schema-constrained knowledge graphs, and compositional policies (edge weights and rules) are learned through an Ant Colony Optimization (ACO)-inspired reinforcement regime. Unlike standard KGs, which capture only declarative semantics, GEAKG formalizes procedural knowledge (algorithmic know-how, operator composition, strategic sequencing) and enables its generative, executable, and transferable deployment across domains. The framework's architecture is strictly layered (L0 topology, L1 operator pool, L2 learned knowledge), with full separation between domain semantics (imposed by the RoleSchema ontology) and algorithmic reasoning.

Procedural Knowledge Representation and Learning

At the formal core, a GEAKG is instantiated as a six-tuple $(\mathcal{S}, V, E, \Lambda, \Phi, \Sigma)$, where $\mathcal{S}$ is a domain-specific RoleSchema (abstract roles, categories, allowed transitions), $V$ and $E$ are the graph structure, $\Lambda$ binds each role to a pool of executable operators (as Python code, LLM- or human-generated), $\Phi$ is a pheromone-based transition matrix encoding empirical composition quality, and $\Sigma$ is a set of symbolic rules learned from and enforced during graph traversal.
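
To make the six-tuple concrete, the following is a minimal sketch of how it might be held in memory. It is not the paper's implementation: the class layout, the `permits` and `validate` helpers, the operator signature, and the role names `greedy_init` and `two_opt` are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

Role = str
Operator = Callable[[list], list]  # hypothetical operator signature


@dataclass
class RoleSchema:
    """S: abstract roles, their categories, and allowed category transitions."""
    category_of: Dict[Role, str]
    allowed: Set[Tuple[str, str]]  # (from_category, to_category)

    def permits(self, src: Role, dst: Role) -> bool:
        return (self.category_of[src], self.category_of[dst]) in self.allowed


@dataclass
class GEAKG:
    schema: RoleSchema                          # S
    nodes: List[Role]                           # V
    edges: Set[Tuple[Role, Role]]               # E
    operators: Dict[Role, List[Operator]]       # Lambda: role -> operator pool
    pheromone: Dict[Tuple[Role, Role], float]   # Phi
    rules: List[str] = field(default_factory=list)  # Sigma (placeholder form)

    def validate(self) -> None:
        """Check that edges respect the schema and every role is executable."""
        for src, dst in self.edges:
            if not self.schema.permits(src, dst):
                raise ValueError(f"edge {src}->{dst} violates the schema")
        for role in self.nodes:
            if not self.operators.get(role):
                raise ValueError(f"role {role} has no executable operator")


# Illustrative instance with made-up roles and trivial operators.
schema = RoleSchema(
    category_of={"greedy_init": "Construction", "two_opt": "LocalSearch"},
    allowed={("Construction", "LocalSearch"), ("LocalSearch", "LocalSearch")},
)
kg = GEAKG(
    schema=schema,
    nodes=["greedy_init", "two_opt"],
    edges={("greedy_init", "two_opt"), ("two_opt", "two_opt")},
    operators={"greedy_init": [sorted], "two_opt": [lambda s: s]},
    pheromone={("greedy_init", "two_opt"): 1.0, ("two_opt", "two_opt"): 0.5},
)
kg.validate()
```

Keeping the schema check separate from the graph data mirrors the separation the text describes between domain semantics and the rest of the structure.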

All knowledge acquisition is performed in an offline phase, leveraging LLMs to generate both graph structure and operator content, and then using a Min-Max Ant System (MMAS) variant of ACO for meta-level learning. The resulting snapshot—a compact graph artifact including topology, operators, and symbolic knowledge—is then deployed in an online phase through a domain-agnostic Symbolic Executor, with strictly zero LLM tokens required for runtime inference.
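
The offline learning step can be illustrated with a generic Max-Min Ant System update: evaporate every edge, reinforce the edges of the best path, and clamp to fixed bounds. This is a sketch of the textbook MMAS rule, not the paper's exact variant; the function name `mmas_update`, the evaporation rate `rho`, the deposit amount, and the bounds `tau_min` and `tau_max` are assumed values.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def mmas_update(
    pheromone: Dict[Edge, float],
    best_path: List[str],
    best_quality: float,
    rho: float = 0.1,       # assumed evaporation rate
    tau_min: float = 0.05,  # assumed lower bound
    tau_max: float = 5.0,   # assumed upper bound
) -> Dict[Edge, float]:
    """One MMAS-style step: evaporate all edges, reinforce the best path, clamp."""
    best_edges = set(zip(best_path, best_path[1:]))
    updated = {}
    for edge, tau in pheromone.items():
        tau = (1.0 - rho) * tau          # evaporation on every edge
        if edge in best_edges:
            tau += best_quality          # deposit only on the best path
        updated[edge] = min(tau_max, max(tau_min, tau))  # MMAS bounds
    return updated


# Repeatedly reinforcing one transition drives it to tau_max,
# while the unused transition decays to tau_min and stays there.
phi = {("Topology", "Activation"): 1.0, ("Topology", "Training"): 1.0}
for _ in range(50):
    phi = mmas_update(phi, ["Topology", "Activation"], best_quality=1.0)
```

Because of the clamp, the unused edge never reaches zero, so it can still be sampled later.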

Figure 1: Symbolic Executor architecture (Online Phase): the GEAKG snapshot is interpreted by a domain-agnostic runtime, decoupling procedural inference from target-domain binding.

The procedural knowledge captured thus becomes an explicit, actively employed and zero-shot-transferable artifact, rather than dissipating as code-level implementation hidden in a single domain.
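
A runtime of this kind can be sketched as a walk over the stored graph in which the next role is sampled in proportion to its pheromone weight and the operator bound to that role is executed. The function `traverse`, the role names, and the trivial operators below are hypothetical, and the sketch omits the symbolic rules; it only shows that inference needs the snapshot and no LLM call.

```python
import random
from typing import Callable, Dict, List, Tuple

Edge = Tuple[str, str]


def traverse(
    start: str,
    pheromone: Dict[Edge, float],
    operators: Dict[str, Callable[[list], list]],
    solution: list,
    max_steps: int,
    rng: random.Random,
) -> Tuple[list, List[str]]:
    """Walk the graph, sampling each next role in proportion to its pheromone."""
    role, path = start, [start]
    solution = operators[role](solution)
    for _ in range(max_steps):
        successors = [(dst, w) for (src, dst), w in pheromone.items() if src == role]
        if not successors:
            break  # terminal role: no outgoing edges
        roles, weights = zip(*successors)
        role = rng.choices(roles, weights=weights, k=1)[0]
        solution = operators[role](solution)
        path.append(role)
    return solution, path


# Made-up snapshot: three roles, operators that only sort or pass through.
phi = {
    ("construct", "improve"): 4.0,
    ("improve", "improve"): 1.0,
    ("improve", "stop"): 1.0,
}
ops = {
    "construct": lambda s: sorted(s),
    "improve": lambda s: s,
    "stop": lambda s: s,
}
result, path = traverse("construct", phi, ops, [3, 1, 2], max_steps=10,
                        rng=random.Random(0))
```

Passing the random generator explicitly keeps a run reproducible for a given seed.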

Cross-Domain Generality and Case Studies

A central hypothesis is that algorithmic expertise captured as a GEAKG is domain-agnostic at the framework level: by instantiating only a new RoleSchema and a minimal domain binding, a single engine can represent and reuse procedural knowledge across unrelated task classes. Two in-depth case studies provide supporting evidence: Neural Architecture Search (NAS) and combinatorial optimization.

Neural Architecture Search (NAS): GEAKG is deployed as a controller for architecture synthesis, with the graph supporting 18 roles across five semantic categories (Topology, Activation, Training, Regularization, Evaluation). The engine learns procedural strategies on a source NAS benchmark and achieves robust zero-shot transfer of composition policies to other datasets. As evidence, across 70 NAS cross-dataset transfer pairs (spanning NAS-Bench-Graph and NAS-Bench-201), the learned procedural knowledge yielded a 100% win rate over random composition (sequence ablation), with 89% of those wins statistically significant ($p<0.05$).

Figure 2: NAS-Bench-Graph cross-dataset transfer heatmap (accuracy delta Symbolic - Random) shows all cells positive, indicating 100% win rate and robust transfer.

Figure 3: NAS-Bench-201 cross-dataset transfer: all transfer pairs yield positive accuracy deltas (Symbolic - Random).

An important technical observation is the low search variance of the symbolic execution regime relative to evolutionary baselines. By enforcing structured, graph-based operator sequencing guided by learned pheromones, the Symbolic Executor achieves a standard deviation 1.3× to 4.8× lower than RegEvo in representative scenarios, which indicates practical value where stable, repeatable search behavior matters.

Figure 4: Variance comparison—Symbolic Executor search variance is 1.3–4.8× lower than RegEvo, providing greater stability in neural architecture search.

Combinatorial Optimization: The second case study uses a GEAKG with 11 abstract roles and three primary operator categories (Construction, Local Search, Perturbation), targeting permutation-based domains (e.g., TSP, JSSP, QAP). Here, procedural knowledge (both search strategies and the operator pool) acquired on TSP is transferred zero-shot to JSSP and QAP, with observed performance competitive with or superior to classical domain-specific heuristics on a range of large-scale instances, supporting the claim of cross-domain method transfer. Crucially, domain adaptation is achieved exclusively via the binding interface (i.e., by mapping the context's evaluate function), with no further retraining or LLM use.
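
The idea of adapting through the evaluate function alone can be sketched with one permutation operator that is reused unchanged while only the cost function is swapped. Everything here is an illustrative assumption: the operator `swap_local_search`, the helper names, and the tiny TSP and QAP instances are not taken from the paper.

```python
from itertools import combinations
from typing import Callable, List, Sequence

Evaluate = Callable[[Sequence[int]], float]


def swap_local_search(perm: List[int], evaluate: Evaluate) -> List[int]:
    """Domain-agnostic operator: accept any pairwise swap that lowers the cost."""
    best, best_cost = list(perm), evaluate(perm)
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(len(best)), 2):
            cand = list(best)
            cand[i], cand[j] = cand[j], cand[i]
            cost = evaluate(cand)
            if cost < best_cost:
                best, best_cost, improved = cand, cost, True
    return best


def tsp_evaluate(dist: List[List[float]]) -> Evaluate:
    """Binding for TSP: length of the closed tour visiting cities in order."""
    def cost(p):
        return sum(dist[p[k]][p[(k + 1) % len(p)]] for k in range(len(p)))
    return cost


def qap_evaluate(flow: List[List[float]], dist: List[List[float]]) -> Evaluate:
    """Binding for QAP: facility i is placed at location p[i]."""
    def cost(p):
        n = len(p)
        return sum(flow[i][j] * dist[p[i]][p[j]] for i in range(n) for j in range(n))
    return cost


# Toy data (made up): four points on a ring, two pairs of interacting facilities.
dist = [[0, 1, 5, 1], [1, 0, 1, 5], [5, 1, 0, 1], [1, 5, 1, 0]]
flow = [[0, 3, 0, 0], [3, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
start = [0, 2, 1, 3]
tsp_tour = swap_local_search(start, tsp_evaluate(dist))
qap_assign = swap_local_search(start, qap_evaluate(flow, dist))
```

The operator never inspects the problem data directly, which is what lets the same code serve both bindings.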

Structural and Policy Analysis of Learned GEAKGs

Detailed examination of the learned graphs clarifies multiple aspects of procedural knowledge storage and transfer.

Block-structured Pheromone Matrices: The learned $\Phi$ pheromone matrices display block structures corresponding to RoleSchema categories, encoding robust procedural pipelines inferred by ACO. High-confidence transitions (e.g., Topology $\to$ Activation $\to$ Training $\to$ Regularization $\to$ Evaluation) are consistently pushed towards the upper pheromone bound $\tau_{\max}$, while empirically poor transitions are penalized towards the lower bound $\tau_{\min}$.
Figure 5: Block-structured pheromone matrix $\Phi$ for a NAS GEAKG; high-weight edges encode dominant composition pipelines.
Entropy Analysis: The entropy of the learned pheromone distributions settles below that of a uniform distribution, reflecting ACO's selective reinforcement, while the MMAS pheromone bounds keep it from collapsing toward zero and thus prevent premature convergence, supporting search that is reproducible yet still varied.
Figure 6: (a) Pheromone entropy by L0 topology; (b) ACO learning reduces entropy relative to random, while MMAS bounds maintain structural diversity.
Dominant Paths and Symbolic Rule Extraction: Top-traversed paths capture the procedural core of domain expertise—e.g., in NAS, all high-frequency traversals instantiate the canonical category pipeline, and symbolic rules (Horn-style logical predicates associating path prefixes to preferred transitions) are extracted automatically from execution logs.
Figure 7: Top-5 dominant GEAKG traversal paths in NAS mirror hand-engineered category pipelines, encoding procedural NAS expertise.
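
The entropy observation above can be checked on any pheromone table by normalizing the outgoing weights of a role and computing their Shannon entropy. The sketch below assumes a dictionary-of-edges representation and uses made-up weights; the function name `row_entropy` is hypothetical.

```python
import math
from typing import Dict, Tuple

Edge = Tuple[str, str]


def row_entropy(pheromone: Dict[Edge, float], src: str) -> float:
    """Shannon entropy (in nats) of the normalized outgoing pheromone of `src`."""
    weights = [w for (s, _), w in pheromone.items() if s == src and w > 0]
    total = sum(weights)
    return -sum((w / total) * math.log(w / total) for w in weights)


# Uniform weights give the maximum entropy log(3) for three successors.
uniform = {("Topology", r): 1.0 for r in ("Activation", "Training", "Evaluation")}

# Illustrative learned row: one edge at an upper bound, two at a lower bound.
learned = {
    ("Topology", "Activation"): 5.0,
    ("Topology", "Training"): 0.05,
    ("Topology", "Evaluation"): 0.05,
}

h_uniform = row_entropy(uniform, "Topology")
h_learned = row_entropy(learned, "Topology")
```

With a strictly positive lower bound on every edge, the learned row's entropy drops below the uniform value but cannot reach zero, which matches the qualitative behavior described for the MMAS-bounded matrices.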

Empirical Results and Deployment Implications

Empirically, the paper substantiates three core claims:

GEAKG as a procedural KG framework (Generality): Both NAS and combinatorial optimization, despite radical representational and evaluative differences, are tackled via the same runtime with no framework-level code changes.
Zero-shot cross-domain/dataset transfer: The persistent, learned procedural graph generalizes composition strategies across both datasets (NAS) and problem classes (TSP $\to$ JSSP/QAP), encoding reusable algorithmic knowledge not present in monolithic solvers.
Zero-cost online deployment: All computational and knowledge acquisition cost is offline, with the online symbolic executor requiring no LLM calls, resulting in negligible deployment cost even for new domains and large datasets.
Figure 8: Across 70 NAS transfer pairs, the Symbolic Executor achieves a 100% win rate over random composition, with 89% of those wins statistically significant ($p<0.05$).

Implications, Limitations, and Future Directions

GEAKG’s decoupling of operator provenance (LLM, code-evolution, or hand-coded) from graph-level execution enables, for the first time, cross-paradigm persistence and recombination of algorithmic building blocks. The Symbolic Executor and the schema constraints act as reliability and safety guardrails, ensuring that only empirically validated strategies are deployed, which addresses the chronic robustness problems seen in pure code-evolution paradigms. Integration with code-evolution methods (e.g., LLaMEA) allows best-in-class operator implementations produced in one domain to be incorporated as transferable units, amortizing both computational and creative costs.

However, the current instantiation requires manual schema design and lacks runtime adaptation: once transferred, the Symbolic Executor does not modify the L2 knowledge in response to distributional shift in the target domain. Generalization to non-permutation representations and auto-derivation of schemas are cited as open research directions.

Conclusion

"GEAKG: Generative Executable Algorithm Knowledge Graphs" formalizes and empirically validates a procedural KG paradigm that provides (i) executability at the node level, (ii) empirical meta-learning at the edge level, and (iii) symbolic rule-based decision at the policy level—supporting modular composition, interpretability, and robust cross-domain transfer. The results demonstrate that procedural knowledge, when treated as a first-class graph artifact, is both learnable and productively reusable, offering a durable substrate for algorithmic reasoning and AI knowledge persistence.