Authenticated Workflows: A Systems Approach to Protecting Agentic AI

Published 11 Feb 2026 in cs.CR, cs.AI, cs.DC, and cs.MA | (2602.10465v1)

Abstract: Agentic AI systems automate enterprise workflows but existing defenses--guardrails, semantic filters--are probabilistic and routinely bypassed. We introduce authenticated workflows, the first complete trust layer for enterprise agentic AI. Security reduces to protecting four fundamental boundaries: prompts, tools, data, and context. We enforce intent (operations satisfy organizational policies) and integrity (operations are cryptographically authentic) at every boundary crossing, combining cryptographic elimination of attack classes with runtime policy enforcement. This delivers deterministic security--operations either carry valid cryptographic proof or are rejected. We introduce MAPL, an AI-native policy language that expresses agentic constraints dynamically as agents evolve and invocation context changes, scaling as O(log M + N) policies versus O(M x N) rules through hierarchical composition with cryptographic attestations for workflow dependencies. We prove practicality through a universal security runtime integrating nine leading frameworks (MCP, A2A, OpenAI, Claude, LangChain, CrewAI, AutoGen, LlamaIndex, Haystack) through thin adapters requiring zero protocol modifications. Formal proofs establish completeness and soundness. Empirical validation shows 100% recall with zero false positives across 174 test cases, protection against 9 of 10 OWASP Top 10 risks, and complete mitigation of two high impact production CVEs.


Summary

The paper introduces authenticated workflows that secure agentic AI using deterministic cryptographic verification and strict intent alignment.
The system integrates with nine frameworks through a universal security runtime and thin adapters, and reports 100% recall with zero false positives across 174 test cases.
Empirical tests and formal proofs support its resistance to attacks such as prompt injection and data exfiltration, covering 9 of 10 OWASP Top 10 risks.


Introduction

Deploying agentic AI systems in enterprise production raises growing complexity and security challenges, because these systems handle critical operations such as financial transactions, patient records, and infrastructure management. Traditional defenses such as guardrails and semantic filters are inherently probabilistic and are routinely bypassed by attacks like prompt injection. This paper proposes authenticated workflows as a trust layer for securing agentic AI, with deterministic enforcement at four boundaries: prompts, tools, data, and context.

Security Model

Authenticated workflows are grounded in cryptography and provide deterministic security by enforcing two properties at every boundary crossing: intent (operations satisfy organizational policies) and integrity (operations are cryptographically authentic). Every operation either carries valid cryptographic proof or is rejected, so an attacker would have to break the underlying cryptographic primitives to perform an unauthorized operation or spoof an identity. The MAPL policy language expresses constraints dynamically as agents and invocation context change, and it scales as O(log M + N) policies rather than O(M x N) rules through hierarchical composition, with cryptographic attestations for workflow dependencies.
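
The accept-or-reject behavior described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the agent name, key table, policy table, and function names are all hypothetical, and HMAC from the Python standard library stands in for the asymmetric signatures a real deployment would presumably use.

```python
import hashlib
import hmac
import json

# Hypothetical key table. HMAC is a stdlib stand-in for real signatures.
AGENT_KEYS = {"billing-agent": b"demo-key-not-for-production"}

# Hypothetical policy: the operations each agent is allowed to invoke.
POLICY = {"billing-agent": {"read_invoice"}}


def canonical(invocation: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash the same bytes."""
    return json.dumps(invocation, sort_keys=True, separators=(",", ":")).encode()


def sign(agent: str, invocation: dict) -> str:
    """Produce an authentication tag over the canonical invocation."""
    return hmac.new(AGENT_KEYS[agent], canonical(invocation), hashlib.sha256).hexdigest()


def verify(agent: str, invocation: dict, signature: str) -> bool:
    """Accept only if integrity AND intent both hold; otherwise reject."""
    key = AGENT_KEYS.get(agent)
    if key is None:
        return False  # unknown identity
    expected = hmac.new(key, canonical(invocation), hashlib.sha256).hexdigest()
    integrity_ok = hmac.compare_digest(expected, signature)
    intent_ok = invocation.get("operation") in POLICY.get(agent, set())
    return integrity_ok and intent_ok


request = {"operation": "read_invoice", "args": {"id": 42}}
tag = sign("billing-agent", request)
```

Here `verify("billing-agent", request, tag)` succeeds, while a request whose `operation` field was altered after signing fails the integrity check, and a correctly signed request for an operation outside `POLICY` fails the intent check.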

Technical Implementation

The system integrates with nine frameworks (MCP, A2A, OpenAI, Claude, LangChain, CrewAI, AutoGen, LlamaIndex, Haystack) through a universal security runtime and thin adapters. The integration requires no protocol modifications and applies cryptographic verification at every boundary crossing. Empirical validation shows 100% recall with zero false positives across 174 test cases, along with complete mitigation of two high-impact production CVEs.
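
One way to picture a thin adapter is as a wrapper that places an enforcement check in front of an existing tool function without changing how the tool is called. The sketch below is an assumption about the general shape only: `enforcement_point`, `PolicyViolation`, `demo_policy`, and `lookup_order` are hypothetical names, not APIs from the paper or from any of the nine frameworks.

```python
from functools import wraps
from typing import Any, Callable


class PolicyViolation(Exception):
    """Raised when an invocation is rejected at the enforcement point."""


def enforcement_point(agent_id: str, is_allowed: Callable[[str, str, dict], bool]):
    """Wrap a tool so every call is checked before the tool body runs."""
    def decorator(tool: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(tool)
        def guarded(**kwargs: Any) -> Any:
            if not is_allowed(agent_id, tool.__name__, kwargs):
                raise PolicyViolation(f"{agent_id} may not call {tool.__name__}")
            return tool(**kwargs)
        return guarded
    return decorator


def demo_policy(agent_id: str, tool_name: str, params: dict) -> bool:
    """Hypothetical policy: only integer order IDs may be looked up."""
    return tool_name == "lookup_order" and isinstance(params.get("order_id"), int)


@enforcement_point("support-agent", demo_policy)
def lookup_order(order_id):
    """Stand-in tool body; a real tool would query a backend."""
    return {"order_id": order_id, "status": "shipped"}
```

Calling `lookup_order(order_id=7)` returns the order record, while `lookup_order(order_id="7; DROP TABLE orders")` raises `PolicyViolation` before the tool body runs. The caller's interface is unchanged, which is the property the "no protocol modifications" claim relies on.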

Empirical and Formal Validation

Authenticated workflows are empirically validated against a range of attack scenarios and protect against 9 of 10 OWASP Top 10 risks, including prompt injection and data exfiltration. Formal proofs support the system's completeness, soundness, and policy composition properties under the assumed adversary models.

Discussion and Conclusion

Reliable deployment of agentic AI workflows requires a shift from reactive to proactive security. Authenticated workflows provide deterministic protection that is designed to scale across heterogeneous systems and dynamic contexts. This gives enterprises a practical trust layer for keeping operations safe as AI integration deepens across sectors.

In essence, authenticated workflows represent a critical advancement in AI system security, reinforcing foundational aspects of integrity and trust across dynamic and distributed computational environments.

Knowledge Gaps

Knowledge gaps, limitations, and open questions

Below is a focused list of concrete gaps and open questions that the paper leaves unresolved, intended to guide future research and engineering work.

Formal proofs and artifacts
- The paper cites Lemmas/Theorems (1–7) but does not provide formal models, assumptions, or machine-checked proof artifacts; how can the soundness/completeness claims be reproduced and validated?
- The “four boundaries are complete and minimal” claim lacks a formal adversarial model covering non-traditional channels (e.g., environment variables, filesystem side-effects, event buses, OS signals); does the minimality proof include these?
Key and identity lifecycle
- How are agent/tool keypairs generated, stored (HSMs/TEEs), rotated, and revoked in production? What are recovery procedures after key compromise?
- How quickly can revocations propagate across distributed PEPs, and how is consistency ensured during network partitions?
- How is multi-tenant isolation enforced in the registry and key infrastructure?
Enforcement integrity (L3) assumptions
- PEP integrity depends on “immutable frameworks” and instrumented APIs; what prevents runtime monkey-patching, dynamic plugin loading, or reflective calls from bypassing PEPs in interpreted languages?
- How to secure PEPs against local privilege escalation, process injection, or supply-chain compromise of adapters/verifiers (which are administrator code)?
- What guarantees exist when one endpoint cannot be wrapped (e.g., third-party SaaS APIs), or when only one-sided wrapping is feasible?
External ecosystem and inter-organizational workflows
- How to extend authenticated workflows to external SaaS tools and LLM APIs that cannot be instrumented or accept signed invocations?
- How are identities, policies, and attestations federated across organizational boundaries (e.g., B2B workflows)? What trust and revocation mechanisms apply across orgs?
Availability and DoS
- The focus is on integrity/authorization; what protections address availability/DoS (e.g., signature verification at every call, registry lookups, verifier execution) and their impact under load or adversarial traffic?
- What happens when the control plane (registry, policy store, logging) is unavailable or partitioned? Is there a safe degraded mode, and what are its security implications?
Performance and scalability
- Sub-millisecond verification is claimed, but there are no microbenchmarks for high-throughput/low-latency settings or end-to-end overhead across long-running, multi-hop workflows.
- Effects on LLM throughput/latency and cost at scale (e.g., tens of thousands of tool calls per session) remain unquantified; what caching strategies and batching help without weakening guarantees?
TOCTTOU and race conditions
- How does the design address time-of-check-to-time-of-use between Stage 3 policy evaluation and operation execution, especially for mutable resources or shared state?
- What concurrency controls exist for attestations in parallel workflows to prevent reordering or partial-order ambiguity?
Attestation trust and semantics
- How to prevent forged or misleading attestations from compromised services (A3)? Are TEEs, remote attestation, quorum attestations, or cross-checks supported?
- How are attestations revoked or invalidated (e.g., if prerequisite steps are later found flawed), and how are downstream dependencies re-evaluated?
- What is the expressiveness limit for temporal constraints (e.g., time-bounded, conditional, or partially ordered dependencies) beyond simple “A before B” attestations?
MAPL expressiveness and operational constraints
- Some constraints require Turing-complete or dataflow-dependent logic (e.g., context-sensitive sanitization); when do custom verifiers become necessary, and how is their correctness assured?
- The “no overrides” policy simplifies proofs but may hinder break-glass scenarios; can time-bounded emergency policies be safely automated and audited at scale?
- How are policy conflicts diagnosed and resolved (e.g., deadlock/liveness issues from intersecting denials), and what tooling supports debugging?
Policy engineering at scale
- The O(log M + N) claim is theoretical; what is the empirical policy count and administrative burden in large enterprises with frequent org changes and dynamic agent creation?
- How are policies versioned, rolled out, and rolled back without breaking running workflows? Is there transactional update support and consistency guarantees?
- What safeguards mitigate risks of policy misconfiguration (which could create either over-permissive or over-restrictive behavior)?
Coverage gaps in threat surface
- The evaluation claims protection against 9/10 OWASP LLM Top 10 risks; which risk remains unmitigated and why? What roadmap addresses it?
- The scope excludes application-level safety (e.g., detecting malicious code or prompt semantics); how should practitioners combine this system with semantic defenses without brittle heuristics?
Confidentiality and covert channels
- The paper emphasizes integrity/authorization; how are confidentiality risks addressed (e.g., exfiltration via allowed operations, covert channels through tool outputs, membership inference via LLMs)?
- Can policies express information flow constraints (e.g., non-interference properties) or require declassification steps with cryptographic proofs?
Audit and privacy
- Non-repudiable audit logs may include sensitive content; how are logs minimized, redacted, encrypted, and access-controlled to meet privacy regulations (e.g., GDPR right to erasure)?
- Are there mechanisms for selective disclosure or zero-knowledge proofs to satisfy auditors without leaking sensitive data?
Adapter ecosystem and maintainability
- Adapters are “thin” (200–500 LOC), but how are they maintained across fast-moving framework updates and API changes? Is there a standardization effort to avoid adapter drift?
- What certification or verification process ensures adapter correctness and prevents backdoors?
Partial adoption and legacy systems
- What security guarantees hold when only some boundaries are instrumented (e.g., S2 tools protected but S3 data retrieval unprotected)? Is there a graded assurance model?
- How can legacy systems without modifiable interfaces be integrated (e.g., via network gateways or proxies) without breaking determinism?
Generalization of empirical results
- The “100% recall/0% false positives” results are limited to 174 test cases; how representative are these of real-world deployments and adversaries? Is there a public benchmark suite?
- How do custom verifiers (which may be heuristic) affect precision/recall in practice, and how are their false positives controlled and measured?
Multi-agent dynamics and liveness
- Intersection semantics ensure monotonic restriction, but what guarantees exist for liveness (i.e., that legitimate workflows can complete) in deeply nested multi-agent delegations?
- How to reason about emergent harms from components that each act within policy but collectively cause unsafe outcomes (policy compositionality vs. system-level safety)?
Cross-framework compositions and edges
- How are non-HTTP transports, event-driven systems, and message queues authenticated and verified within this model?
- Are there canonical bindings for common protocols (gRPC, WebSockets, Kafka) that preserve end-to-end guarantees without excessive overhead?
Governance and organizational process
- Who owns policy authoring and approval (security vs. application teams)? How is separation of duties enforced and audited?
- What human-in-the-loop controls exist for sensitive actions, and how are they authenticated and attested without undermining automation?
Supply-chain and model integrity
- The design assumes cryptographic hardness but does not address compromised model weights, fine-tuning artifacts, or data poisoning that alter LLM behavior within allowed operations.
- How is the integrity of custom verifiers and PEP binaries ensured (e.g., reproducible builds, code signing, SLSA levels)?
Edge/embedded deployment
- How does the approach perform in constrained or offline environments (mobile, on-prem IoT) where registry access and frequent key verification are costly or intermittent?
Standards and interoperability
- Is there a plan to standardize invocation formats, policy schemas, and attestation structures to foster ecosystem adoption beyond the nine frameworks?
Diagnostics and developer experience
- What tools help developers understand “why” an operation was denied (policy diffing, trace visualization) and suggest minimal policy changes without breaking guarantees?
- Can MAPL policies be statically analyzed, type-checked, or formally verified to prevent unsafe patterns before deployment?
Future-proofing cryptography
- How will the system migrate to post-quantum cryptography, and what is the impact on performance and key management during hybrid transitions?

These gaps suggest concrete research and engineering directions: formalizing and open-sourcing proofs and benchmarks; specifying standardized invocation/attestation schemas; building robust key/identity lifecycle tooling; developing liveness-aware policy analysis; establishing verifiable adapters and governance processes; and integrating confidentiality and information-flow controls alongside the presented integrity-focused framework.


Practical Applications

Immediate Applications

The following applications can be deployed now using the paper’s authenticated workflows, MAPL policy language, and the universal security runtime with thin adapters across nine frameworks.

Enterprise AI trust layer for agentic systems
- What: Deploy the universal security runtime with Policy Enforcement Points (PEPs) across LangChain, CrewAI, AutoGen, LlamaIndex, Haystack, MCP, and LLM APIs (OpenAI, Claude) via thin adapters to protect prompts, tools, data, and context.
- Sectors: Software/IT, enterprise platforms
- Tools/products/workflows: PEP SDKs, MAPL policy authoring tools, agent/tool registries, audit-log services using hash chains/Merkle trees, admin dashboards
- Assumptions/dependencies: Enterprise IAM integration, key management (per agent/tool), trusted control plane (policy store, registry), adherence to L3 (enforcement integrity)
Prompt-injection resilience in RAG pipelines
- What: Enforce document access policies and signed retrieval; treat data as untrusted; verify signed invocations and authenticated context to prevent data-triggered tool misuse.
- Sectors: Knowledge management, customer support, internal search
- Tools/products/workflows: LlamaIndex/Haystack adapters, StorageIntegrityVerifier, MAPL constraints on retrieval sources and parameters
- Assumptions/dependencies: Integration with vector DBs/document stores, correct configuration of resource/parameter constraints
Secure tool invocation and least privilege for agents
- What: Apply MAPL policies and ToolAuthorizationVerifier to bound filesystem access, command execution, DB queries, and email sending; independently verify at each tool boundary.
- Sectors: Finance (ops automation), IT operations, back-office process automation
- Tools/products/workflows: Per-tool identities/keys, RBAC with MAPL, deny/allow lists for sensitive operations
- Assumptions/dependencies: Accurate resource modeling, parameter-level controls (e.g., path patterns, recipient allowlists)
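
Parameter-level controls of the kind listed above (path patterns, recipient allowlists) can be sketched as plain predicate functions. The directory, file suffix, domain, and function names here are illustrative assumptions, not MAPL syntax or the paper's ToolAuthorizationVerifier.

```python
import posixpath

# Hypothetical constraints for a file-reading tool and an email tool.
ALLOWED_DIR = "/srv/reports"
ALLOWED_SUFFIX = ".csv"
ALLOWED_RECIPIENT_DOMAINS = {"example.com"}


def path_allowed(path: str) -> bool:
    """Allow only CSV files directly inside ALLOWED_DIR.

    The path is normalized first so that '..' segments cannot escape
    the allowed directory.
    """
    normalized = posixpath.normpath(path)
    return (
        posixpath.dirname(normalized) == ALLOWED_DIR
        and normalized.endswith(ALLOWED_SUFFIX)
    )


def recipient_allowed(address: str) -> bool:
    """Allow only addresses whose final domain part is on the allowlist."""
    local, sep, domain = address.rpartition("@")
    return bool(local) and sep == "@" and domain.lower() in ALLOWED_RECIPIENT_DOMAINS
```

Under these assumptions `/srv/reports/q1.csv` is accepted and `/srv/reports/../../etc/passwd.csv` is rejected after normalization. `alice@example.com` is accepted and `eve@evil.com` is rejected.
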
Audit-ready, non-repudiable AI operations
- What: Generate tamper-evident logs with hash chains; sign both invocations and results to enable forensic analysis and compliance reporting.
- Sectors: Compliance, risk, legal
- Tools/products/workflows: Audit-log services, signature verification pipelines, exportable evidence packages for SOC 2/HIPAA/GDPR controls
- Assumptions/dependencies: Secure log storage, proper time-stamping, clear retention policies
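
A hash-chained log of the kind mentioned above links each entry to the hash of the previous one, so editing any earlier record invalidates every later hash. The following is a minimal sketch with hypothetical field names. It omits the signing and time-stamping that the item also calls for.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with this record."""
    payload = prev_hash + json.dumps(record, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def append(log: list, record: dict) -> None:
    """Append a record, chaining it to the current tail of the log."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "hash": entry_hash(prev, record)})


def verify_chain(log: list) -> bool:
    """Recompute every hash from the start; any edit breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"agent": "a1", "op": "read", "result": "ok"})
append(log, {"agent": "a1", "op": "export", "result": "ok"})
```

Note that a bare hash chain only detects edits relative to a trusted tail hash. An attacker who can rewrite the entire log can recompute the chain, which is why the tail would also need to be signed or anchored externally.
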
Regulated data handling via attestations
- What: Enforce “export only after anonymization attestation exists” (or DLP checks completed) using MAPL attestation dependencies with cryptographic proofs.
- Sectors: Healthcare (PHI/PII), public sector, HR
- Tools/products/workflows: WorkflowIntegrityVerifier, PII detection verifiers, anonymization tools with signed completion attestations
- Assumptions/dependencies: Correct verifier configuration; acceptance of cryptographic attestations in internal compliance processes
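
The "export only after an anonymization attestation exists" rule can be sketched as a gate that looks for a valid, matching attestation before allowing the export. The key, step name, record layout, and function names are hypothetical, and HMAC again stands in for whatever signature scheme the real attestations use.

```python
import hashlib
import hmac

# Hypothetical key held by the anonymization service (stand-in for a real signing key).
ANONYMIZER_KEY = b"demo-key-not-for-production"


def attest(step: str, dataset_id: str) -> dict:
    """Issue an attestation that `step` completed for `dataset_id`."""
    message = f"{step}:{dataset_id}".encode()
    tag = hmac.new(ANONYMIZER_KEY, message, hashlib.sha256).hexdigest()
    return {"step": step, "dataset_id": dataset_id, "tag": tag}


def attestation_valid(att: dict) -> bool:
    """Recompute the tag and compare in constant time."""
    message = f"{att['step']}:{att['dataset_id']}".encode()
    expected = hmac.new(ANONYMIZER_KEY, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, att["tag"])


def may_export(dataset_id: str, attestations: list) -> bool:
    """Permit export only if a valid anonymization attestation covers this dataset."""
    return any(
        a["step"] == "anonymize"
        and a["dataset_id"] == dataset_id
        and attestation_valid(a)
        for a in attestations
    )


proofs = [attest("anonymize", "ds-1")]
```

With these definitions, `may_export("ds-1", proofs)` is allowed and `may_export("ds-2", proofs)` is denied, because the dependency is checked per dataset rather than globally.
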
DevOps and cloud automation hardening
- What: Require signed, policy-bound invocations for infra operations (e.g., IaC changes, deployments); enforce command and API call constraints with independent verification at each boundary.
- Sectors: Software/cloud, platform engineering
- Tools/products/workflows: AutoGen code-execution wrappers, CI/CD adapters, geofencing/rate-limiting verifiers
- Assumptions/dependencies: Integration with cloud APIs, accurate resource/parameter policies, secure key storage
Safer email assistants and communications tools
- What: Wrap send_email and messaging tools with PEPs; constrain recipients, domains, content types, and attachment sources; require signed operations.
- Sectors: Enterprise productivity, daily life
- Tools/products/workflows: Email tool adapters, MAPL recipient allowlists/denylists, content scanning verifiers
- Assumptions/dependencies: Email API integration, policy maintenance for authorized recipients and headers
Browser/scraper agent hardening against malicious content
- What: Treat page content as untrusted data; ensure tool actions (credential access, downloads, command execution) require signed, policy-bound invocations; mitigate Atlas-style prompt injection cascades.
- Sectors: Marketing intelligence, competitive analysis, data aggregation
- Tools/products/workflows: HTTP client wrappers with PEPs, content sanitization verifiers, strict tool parameter controls
- Assumptions/dependencies: Coverage of high-risk tools (credentials, file system), handling of dynamic content and redirects
Scoped delegation and cross-team collaboration
- What: Use A2A-style signed delegation tokens with MAPL intersection semantics to ensure delegates cannot gain broader permissions than their grant.
- Sectors: Enterprise collaboration, IT governance
- Tools/products/workflows: Delegation token service, policy intersection workflows, revocation mechanisms
- Assumptions/dependencies: Registry and token lifecycle management, organizational hierarchy reflected in MAPL
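
Intersection semantics can be illustrated by modeling permissions as flat sets, which is a simplifying assumption since MAPL policies are presumably richer. A delegate receives the intersection of what the delegator holds and what was requested, so permissions can only shrink along a delegation chain. The permission names and the `delegate` function are hypothetical.

```python
def delegate(parent_permissions: frozenset, requested: frozenset) -> frozenset:
    """Grant only permissions that the parent holds AND the delegate requested."""
    return parent_permissions & requested


manager = frozenset({"read_report", "send_email", "approve_refund"})

# The assistant asks for 'delete_db', which the manager never held.
assistant = delegate(manager, frozenset({"read_report", "send_email", "delete_db"}))

# The sub-assistant asks for 'approve_refund', which the assistant never received.
sub_assistant = delegate(assistant, frozenset({"send_email", "approve_refund"}))
```

In this sketch `assistant` ends up with `read_report` and `send_email`, and `sub_assistant` with `send_email` only. Each level is a subset of the one above it, which is the monotonic restriction property.
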
Context integrity in multi-turn sessions
- What: Apply authenticated context with hash chains, sequence numbers, and tamper-evident session state across LangChain/CrewAI/AutoGen memory.
- Sectors: Software/IT, customer support, sales assistants
- Tools/products/workflows: MemoryIntegrityVerifier, context signing, policy-bound memory operations
- Assumptions/dependencies: Integration with orchestration memory APIs, performance tuning for sub-ms overhead
Security red-teaming and OWASP LLM Top 10 coverage
- What: Use the runtime’s deterministic enforcement to test agent applications against OWASP LLM risks; leverage the empirical results (9/10 risks mitigated) and plug in risk-specific verifiers for additional coverage.
- Sectors: Cybersecurity, QA
- Tools/products/workflows: Red-teaming harnesses, risk-specific verifiers (path traversal, exfil prevention, workflow hijacking), reporting dashboards
- Assumptions/dependencies: Test case libraries, controlled staging environments
Policy management at scale
- What: Replace O(M×N) rule sprawl with MAPL’s hierarchical O(log M + N) policies; use inheritance and intersection to enforce monotonic restriction and transitive denial.
- Sectors: IT governance, platform teams
- Tools/products/workflows: MAPL compiler/validator, org hierarchy importers, policy provenance and diff tooling
- Assumptions/dependencies: Well-maintained org hierarchies; careful use of extends chains and deny patterns
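
To make the scaling claim concrete, the snippet below compares the two growth rates numerically. It assumes M counts agents and N counts resources (the summary does not define them), uses a base-2 logarithm, and sets all constant factors to 1, so the numbers show relative growth only and are not measured policy counts.

```python
import math


def flat_rule_count(agents: int, resources: int) -> int:
    """One rule per (agent, resource) pair: grows as M x N."""
    return agents * resources


def hierarchical_policy_count(agents: int, resources: int) -> int:
    """Illustrative log M + N growth, with unit constants assumed."""
    return math.ceil(math.log2(agents)) + resources


for m, n in [(16, 10), (1024, 50), (65536, 200)]:
    print(
        f"{m} agents x {n} resources: "
        f"{flat_rule_count(m, n)} flat rules vs "
        f"{hierarchical_policy_count(m, n)} hierarchical policies"
    )
```

For 1024 agents and 50 resources, for example, the flat model needs 51,200 rules while the hierarchical estimate is 60. The gap widens as the organization grows, because adding agents increases only the logarithmic term.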

Long-Term Applications

The following applications are promising but require further research, ecosystem scaling, protocol standardization, or vendor cooperation before broad deployment.

Cross-vendor standardization of authenticated workflows and MAPL-like policies
- What: Establish open standards for agent identities, signed invocations, attestations, and intersection-based policy semantics across A2A/MCP/LLM APIs.
- Sectors: Software, policy
- Tools/products/workflows: RFCs/specs, interoperability test suites, certification programs
- Assumptions/dependencies: Multi-vendor alignment, standards bodies involvement, reference implementations
Hardware-backed keys and secure enclaves for agent/tool identities
- What: Bind agent/tool keys to TPM/HSM/TEE/FIDO devices for stronger compromise resistance and regulated environment compliance.
- Sectors: Healthcare, finance, energy, defense
- Tools/products/workflows: Hardware key provisioning, attested execution, secure key rotation
- Assumptions/dependencies: Device support, supply chain readiness, FIPS/CC certifications
Federated, cross-organization agent ecosystems
- What: Enable inter-company authenticated workflows with mutual policy intersection, delegation tokens, and transitive attestations for supply chain automation.
- Sectors: Logistics, finance (trade finance), manufacturing
- Tools/products/workflows: Federation registries, cross-org policy negotiation, legal/compliance overlays
- Assumptions/dependencies: Legal agreements (data-sharing, liability), interoperable trust layer adoption
Certified marketplaces for agent tools and verifiers
- What: Create an ecosystem where tools/verifiers ship with cryptographic attestations and security profiles; enterprises choose certified components.
- Sectors: Software/platforms
- Tools/products/workflows: Marketplace portals, certification criteria, continuous attestation pipelines
- Assumptions/dependencies: Certification authorities, ongoing security audits, vulnerability disclosure processes
Cloud-native managed trust layers
- What: Offer authenticated workflow enforcement as a managed service (PEP-as-a-Service) embedded in LLM providers and cloud platforms.
- Sectors: Cloud, SaaS
- Tools/products/workflows: Managed registries, policy stores, observability/incident tooling
- Assumptions/dependencies: Provider support, SLAs for sub-ms verification, multitenancy isolation
Regulatory incorporation of cryptographically attested AI operations
- What: Update compliance frameworks (HIPAA, GDPR, SOC 2, PCI) to explicitly accept signed invocations/attestations and tamper-evident logs as controls.
- Sectors: Policy, compliance
- Tools/products/workflows: Regulator guidance, audit templates, evidence exporters
- Assumptions/dependencies: Regulator engagement, standards alignment, industry proofs-of-concept
Authenticated command pipelines for robotics and industrial automation
- What: Require signed, policy-bound commands and attested execution order for robots/PLC/SCADA systems to prevent unsafe operations and escalation.
- Sectors: Robotics, manufacturing, energy
- Tools/products/workflows: Real-time PEPs, safety verifiers (area/force limits), attested maintenance workflows
- Assumptions/dependencies: Deterministic latency guarantees, integration with legacy controllers
Smart grid and critical infrastructure protection
- What: Deploy authenticated workflows for grid control, telemetry retrieval, and incident response with non-repudiation and strict policy constraints.
- Sectors: Energy, utilities
- Tools/products/workflows: Grid control PEPs, geofencing and rate-limit policies, emergency access groups with time-bounded validity
- Assumptions/dependencies: Vendor cooperation, resilience under outages, secure failover
Consumer-grade personal assistants with household policies
- What: Enforce signed, policy-bound actions across IoT devices (locks, thermostats, payments) to prevent unsafe or unauthorized assistant behavior.
- Sectors: Consumer/IoT, daily life
- Tools/products/workflows: Home agent OS, per-device identities, parent/guardian policy templates
- Assumptions/dependencies: Device ecosystem support, simple policy authoring UX, recovery from compromised keys
Open academic testbeds for deterministic multi-agent security
- What: Provide research platforms with authenticated workflows, MAPL, and verifiers for studying compositional attacks and formal security guarantees.
- Sectors: Academia
- Tools/products/workflows: Open-source runtimes, attack libraries, coursework materials
- Assumptions/dependencies: Funding, community maintenance, standardized datasets/scenarios
End-to-end attested knowledge ecosystems
- What: Track cryptographic provenance from content ingestion through RAG retrieval to tool actions, enabling trustworthy knowledge pipelines.
- Sectors: Education, enterprise knowledge, media
- Tools/products/workflows: Content signing, retrieval policies, provenance-aware UIs/reporting
- Assumptions/dependencies: Content-owner cooperation, signing infrastructure, performance trade-offs
Automated policy synthesis and drift detection
- What: Use program analysis or LLM-assisted tooling to generate MAPL policies from workflows, detect drift, and suggest least-privilege updates.
- Sectors: Software/IT governance
- Tools/products/workflows: Policy synthesis engines, explainable diffs, verification sandboxes
- Assumptions/dependencies: Reliable model-guided synthesis, human-in-the-loop validation, safety guarantees against over-permissive outputs


