CodeDelegator: Secure Code Delegation

Updated 28 January 2026

CodeDelegator is a set of security- and reliability-focused architectures that mediate code delegation through explicit mechanisms enforcing least privilege, non-repudiation, and traceability.
It employs static delegation, mediation requests, and role separation between persistent delegators and ephemeral coders to prevent context pollution and keep outcomes accurate.
Reported evaluations show higher pass rates on agent benchmarks (for example, MCPMark pass@1 of 38.4% versus 25.8% for ReAct) and lower overhead than zero-knowledge approaches for ML offload, while highlighting open challenges in scalability and adaptive policy management.

CodeDelegator is a general term for a class of security- and reliability-focused code delegation architectures that have emerged across distributed computing, cloud service orchestration, capability systems, trusted ML offload, and multi-agent LLM frameworks. In these systems, the right to execute code on a remote agent is not transferred monolithically or by credential alone. Instead, it is mediated, parsed, limited, and audited through explicit mechanisms that enforce non-repudiation, least privilege, traceability, and, where applicable, correctness of outcomes, even in adversarial, federated, or untrusted settings (Schreiner et al., 2011, Renesse et al., 2012, Arun et al., 26 Feb 2025, Zhang, 2018, Fei et al., 21 Jan 2026).

1. Delegation Models: From Classical Grid to Multi-Agent LLMs

Initial CodeDelegator models were formalized in scientific grid computing, most notably in the ALICE experiment’s mediated definite delegation model. Here, three roles are defined: Principals (delegators), Brokers (verifying/intervening intermediaries), and Agents (executors). Delegation is a composition of mappings, with formal objects such as mediated concessions, mediation requests, and relay operations:

Static delegation ($\delta$): $U \times P \times E \times A \times T \to C$, assigning privilege $p$ on entity $e$ to agent $a$ for principal $u$ at time $t$.
Mediation request ($\phi$): $U \times P \times E \times B \times T \to \bar{C}$, a non-mediated concession submitted to a broker.
Mediated delegation ($\psi$): creates context-sensitive, verifiable assignments via transformations.
Relay: enables relay and transformation chains among brokers (Schreiner et al., 2011).
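As a rough illustration of the two mappings whose signatures are given above, the following Python sketch models concessions as immutable records. The class names, field names, and the `mediate` helper (a hypothetical broker step with an allow-list policy) are assumptions made for this example and are not taken from Schreiner et al.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concession:
    """Element of C: privilege p on entity e for agent a, by principal u, at time t."""
    principal: str
    privilege: str
    entity: str
    agent: str
    time: int


@dataclass(frozen=True)
class UnmediatedConcession:
    """Element of C-bar: a request addressed to a broker, not yet bound to an agent."""
    principal: str
    privilege: str
    entity: str
    broker: str
    time: int


def static_delegation(u: str, p: str, e: str, a: str, t: int) -> Concession:
    """delta: U x P x E x A x T -> C."""
    return Concession(u, p, e, a, t)


def mediation_request(u: str, p: str, e: str, b: str, t: int) -> UnmediatedConcession:
    """phi: U x P x E x B x T -> C-bar."""
    return UnmediatedConcession(u, p, e, b, t)


def mediate(request: UnmediatedConcession, agent: str, now: int,
            allowed: set[tuple[str, str]]) -> Concession:
    """Hypothetical broker step (an assumption, not the source's definition):
    bind a request to an agent only if (principal, privilege) is allow-listed."""
    if (request.principal, request.privilege) not in allowed:
        raise PermissionError("broker policy rejects this request")
    return Concession(request.principal, request.privilege,
                      request.entity, agent, now)
```

Frozen dataclasses are used so that a concession cannot be altered after it is issued, which loosely mirrors the intent that delegated rights are fixed once granted.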

Trusted multi-agent LLM systems transfer these principles to the code-as-action domain. CodeDelegator (in the LLM context) separates a persistent Delegator agent (strategic planner and monitor) from ephemeral Coder agents (isolated code writers/runners), using the principle of Ephemeral-Persistent State Separation (EPSS) to prevent context pollution, error propagation, and planning degradation. Each code execution is spawned in a clean namespace; only structured, validated artifacts are committed to the global orchestration layer (Fei et al., 21 Jan 2026).
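The separation of a persistent Delegator from ephemeral Coders can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the artifact fields (`task_id`, `status`, `result`), the `run_coder` and `validate` helpers, and the `Delegator` class are hypothetical names, and in-process `exec` stands in for real isolation (it is not a security sandbox).

```python
from typing import Any

# Hypothetical artifact schema: the only shape the Delegator will accept.
REQUIRED_FIELDS = {"task_id": str, "status": str, "result": object}


def run_coder(task_id: str, code: str) -> dict[str, Any]:
    """Ephemeral Coder: run code in a fresh namespace, return a structured artifact."""
    namespace: dict[str, Any] = {}  # clean namespace, discarded after the call
    try:
        exec(code, namespace)
        return {"task_id": task_id, "status": "ok",
                "result": namespace.get("result")}
    except Exception as exc:  # errors become data, not shared context
        return {"task_id": task_id, "status": "error", "result": repr(exc)}


def validate(artifact: dict[str, Any]) -> bool:
    """Accept an artifact only if it has exactly the expected fields and types."""
    return (set(artifact) == set(REQUIRED_FIELDS)
            and all(isinstance(artifact[k], t) for k, t in REQUIRED_FIELDS.items())
            and artifact["status"] in {"ok", "error"})


class Delegator:
    """Persistent side: keeps only validated artifacts, never Coder scratch state."""

    def __init__(self) -> None:
        self.committed: dict[str, dict[str, Any]] = {}

    def delegate(self, task_id: str, code: str) -> dict[str, Any]:
        artifact = run_coder(task_id, code)
        if not validate(artifact):
            raise ValueError("artifact failed schema validation")
        self.committed[task_id] = artifact
        return artifact
```

In this sketch, a variable defined by one task is not visible to the next, so a second task that refers to the first task's scratch variable fails with a `NameError` that is recorded as an error artifact rather than leaking state.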

2. Security, Correctness, and Accountability Objectives

Across designs, CodeDelegator frameworks are defined by rigorous policy enforcement and trust-minimization:

Non-repudiation and accountability: Every transformation or submission step (code, parameters, plan, execution result) is cryptographically signed and/or logged in a tamper-evident chain. Neither brokers nor agents nor users can forge, repudiate, or surreptitiously alter delegated code or outputs.
Least-privilege enforcement: Agents enforce limits locally, using OS namespaces, containers, or codecap rights, to ensure that code cannot exceed its delegated authority (Schreiner et al., 2011, Renesse et al., 2012).
Isolation: Separate sessions, state spaces, and credential domains ensure that failures or compromise in one code path (e.g., a debugging trace or malicious sub-task) do not leak, pollute, or subvert cross-task or cross-user privilege boundaries.
Transparent, site-local accounting: Auditable logs at the level of code-submitting principal, on the resource site itself, complete the accountability chain (Schreiner et al., 2011).
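The tamper-evident logging mentioned above can be illustrated with a simple hash chain. The sketch below is an assumption-laden example, not the logging format of any cited system: the record layout and function names are hypothetical, and a hash chain alone provides tamper evidence only. Non-repudiation would additionally require each entry to be signed by its author.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log: list[dict], record: dict) -> None:
    """Append a record, linking it to the current tail of the log."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev,
                "hash": entry_hash(prev, record)})


def verify(log: list[dict]) -> bool:
    """Recompute every link; any edited, removed, or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash covers the previous one, changing an early record invalidates every later entry, which is what makes after-the-fact edits detectable at the resource site.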

3. Cryptographic and Formal Mechanisms

Three main families of cryptographic approaches underlie CodeDelegator frameworks:

Composite Signatures and X.509 Chains: The mediated definite delegation model employs nested, time-bounded, broker- and client-signed Job Description Language (JDL) objects as immutable evidence for job authenticity and broker integrity (Schreiner et al., 2011).
Cryptographically-Protected Code Capabilities: “Codecaps” embed executable code policies (e.g., in JavaScript) in X.509 proxy certificate chains, providing discretionary, least-privilege access, fine-grained delegation, confinement by maximum path length, revocation by version/expiry, and rights amplification for secure abstraction layers (Renesse et al., 2012).
Refereed Delegation and Deterministic Execution: In ML code offload, refereed delegation (instantiated in Verde) assigns a program to multiple untrusted providers and arbitrates disputes with minimal referee work via Merkleized state checkpoints and operator-level reproduction (RepOps library). Bitwise reproducibility eliminates hardware nondeterminism as a source of dispute (Arun et al., 26 Feb 2025).

For quantum delegation, CodeDelegator implements “reversible garbled circuits” for C+P quantum circuits (Toffoli and phase gates), using a quantum random oracle and quantum KDM-secure symmetric encryption. All sensitive circuit keys are randomly encoded, and server-side evaluation never reveals the client’s logical input or intermediate states (Zhang, 2018).

4. Framework Architectures and Protocols

The main CodeDelegator designs compare as follows:

Model	Delegator Role	Agent/Worker Role	Broker Mechanism	Enforcement
ALICE Grid	User (principal)	Pilot job/container	VO service as broker	Signed JDL, gLExec, TLS
Codecap system	Owner/principal	Delegatee principal	Chained signed codecap	Code-based policies in certificate
Verde (ML referee)	Client (referee)	ML provider	Output commitment/Merkle proof	Bitwise deterministic RepOps
CodeDelegator-LLM	Delegator agent	Ephemeral Coder agent	EPSS + orchestration layer	Artifact schema, explicit merge policy

For LLM-based systems, orchestration is implemented via persistent Delegator contexts and ephemeral Coder contexts, with state merging only for schema-validated artifacts. For classical grid or capability systems, the chain-of-signed commands or policies governs enforcement; each delegated operation includes sufficient evidence for independent validation and, in ML or quantum settings, detection and protocol-level proof of cheating (Fei et al., 21 Jan 2026, Arun et al., 26 Feb 2025, Zhang, 2018, Renesse et al., 2012, Schreiner et al., 2011).
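The chain-of-signed-commands idea can be sketched as a nested envelope that an agent verifies before running anything. This example is only a stand-in under explicit assumptions: it uses HMAC-SHA256 with shared keys from the Python standard library in place of the X.509 and public-key signatures used by the grid and codecap systems, and the envelope layout and function names are hypothetical. Shared-key MACs do not provide non-repudiation. The sketch shows the verification structure only.

```python
import hashlib
import hmac
import json


def sign(key: bytes, payload: dict) -> str:
    """MAC over a canonical JSON encoding (stand-in for a real signature)."""
    msg = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def client_submit(client_key: bytes, job: str, not_after: int) -> dict:
    """Client signs the job description together with its expiry time."""
    inner = {"job": job, "not_after": not_after}
    return {"inner": inner, "client_sig": sign(client_key, inner)}


def broker_mediate(broker_key: bytes, submission: dict, agent: str) -> dict:
    """Broker wraps the client's signed submission and binds it to one agent."""
    outer = {"submission": submission, "agent": agent}
    return {"outer": outer, "broker_sig": sign(broker_key, outer)}


def agent_verify(envelope: dict, client_key: bytes, broker_key: bytes,
                 agent: str, now: int) -> bool:
    """Agent checks both layers, the agent binding, and the time bound."""
    outer = envelope["outer"]
    submission = outer["submission"]
    return (hmac.compare_digest(envelope["broker_sig"], sign(broker_key, outer))
            and hmac.compare_digest(submission["client_sig"],
                                    sign(client_key, submission["inner"]))
            and outer["agent"] == agent
            and now <= submission["inner"]["not_after"])
```

Because the broker's layer covers the client's signed submission, a broker that rewrites the job invalidates the inner check, and an agent other than the named one rejects the envelope.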

5. Experimental Evaluation and Performance

CodeDelegator implementations deliver measurable improvements in code-as-action LLM benchmarks and practical ML orchestration:

On $\tau^2$-bench (Retail), CodeDelegator attains pass$^1$/pass$^2$/pass$^3$/pass$^4$ accuracies of 82.0/71.2/63.4/57.0, outperforming both ReAct and CodeAct, with absolute margins ranging from 2.4 to 13.2 percentage points across the pass$^k$ metrics (Fei et al., 21 Jan 2026).
On MCPMark, overall pass@1 improves to 38.4%, compared with 25.8% for ReAct and 26.4% for CodeAct; the FileSystem and GitHub domains show the largest gains (14.2 and 10.1 percentage points over CodeAct, respectively).
Ablation studies confirm the necessity of both EPSS and explicit role separation; removing EPSS degrades pass@1 by 4.7%, and removing role separation by 10.5%.
In refereed delegation of ML programs, total overhead remains below that of zero-knowledge proof approaches; RepOps overhead on matmul benchmarks ranges from 1.5–3$\times$ down to 1.3$\times$ (T4) and 1.6$\times$ (RTX 3090), depending on the configuration (Arun et al., 26 Feb 2025).
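As a small worked check of the MCPMark figures quoted above, the snippet below computes the absolute gain (in percentage points) and the relative gain of CodeDelegator's overall pass@1 over each baseline. The helper names are hypothetical; the three input values are the ones reported in the text.

```python
def absolute_gain(new: float, baseline: float) -> float:
    """Difference in percentage points, rounded to one decimal."""
    return round(new - baseline, 1)


def relative_gain(new: float, baseline: float) -> float:
    """Improvement as a percentage of the baseline, rounded to one decimal."""
    return round(100 * (new - baseline) / baseline, 1)


# Overall MCPMark pass@1 values as reported in the text.
mcpmark_pass_at_1 = {"CodeDelegator": 38.4, "ReAct": 25.8, "CodeAct": 26.4}

ours = mcpmark_pass_at_1["CodeDelegator"]
for name in ("ReAct", "CodeAct"):
    base = mcpmark_pass_at_1[name]
    print(f"vs {name}: +{absolute_gain(ours, base)} points, "
          f"+{relative_gain(ours, base)}% relative")
```

The overall margins (12.6 and 12.0 points) are smaller than the 14.2-point FileSystem gain cited above, which is consistent with that domain contributing more than the average.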

6. Limitations, Extensions, and Open Directions

Several limitations and open directions remain:

Scalability and parallelism: Present LLM-based CodeDelegator implementations do not model parallel or DAG-structured plans; all Coders execute serially under Delegator control (Fei et al., 21 Jan 2026). Extending to DAG orchestration and parallel agent execution is a suggested direction.
Heuristic policies: Retry and replan logic in orchestration agents are based on fixed budgets and fixed error classifiers; learning adaptive policies is an open area.
Robustness and key management: Grid-centered systems face challenges with key revocation, clock synchronization, audit log retention, and secure broker HSM management (Schreiner et al., 2011).
Generalizability: The EPSS–role separation pattern applies to broader classes of multi-agent, hierarchical tasks beyond code generation (summarization, complex QA, robotics) (Fei et al., 21 Jan 2026).
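The fixed-budget, fixed-classifier retry logic described above might look roughly like the following. This is a hypothetical sketch: the choice of which exception types count as retryable, the default budget, and the function names are all assumptions for illustration, not the policy used by Fei et al.

```python
from typing import Callable

# Assumed fixed error classifier: transient errors are retried, others trigger a replan.
RETRYABLE = (TimeoutError, ConnectionError)


def classify(exc: Exception) -> str:
    return "retry" if isinstance(exc, RETRYABLE) else "replan"


def run_with_budget(task: Callable[[], object], max_retries: int = 2):
    """Run a task with a fixed retry budget.

    Returns (outcome, payload, attempts) where outcome is
    "ok", "replan", or "exhausted".
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return ("ok", task(), attempts)
        except Exception as exc:
            if classify(exc) == "replan":
                return ("replan", repr(exc), attempts)
            if attempts > max_retries:
                return ("exhausted", repr(exc), attempts)
            # otherwise fall through and retry
```

Both the budget and the classifier are constants here, which is the limitation the text points at: an adaptive policy would instead choose them per task or learn them from past outcomes.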

Future work includes richer revocation semantics, hardware-anchored time-stamping, unified deterministic backends across hardware (for ML/other code), and robust multi-broker chain-of-signature protocols (Schreiner et al., 2011, Arun et al., 26 Feb 2025).

7. Connections to Broader Delegation and Capability Research

CodeDelegator models are directly connected to research in mediating and verifying code execution across federated, adversarial, or privacy-sensitive environments. The design philosophy—explicit separation of privilege, context, and state, fully mediated brokering, and cryptographically grounded chains of evidence—provides a foundation for verifiable, least-privilege, and auditable code execution and can be seen as a unifying principle in trustworthy distributed and agent-based system design (Schreiner et al., 2011, Renesse et al., 2012, Arun et al., 26 Feb 2025, Zhang, 2018, Fei et al., 21 Jan 2026).

References: (Schreiner et al., 2011, Renesse et al., 2012, Arun et al., 26 Feb 2025, Zhang, 2018, Fei et al., 21 Jan 2026)