A Methodology for Selecting and Composing Runtime Architecture Patterns for Production LLM Agents

Published 19 May 2026 in cs.AI and cs.SE | (2605.20173v1)

Abstract: Production LLM agents combine stochastic model outputs with deterministic software systems, yet the boundary between the two is rarely treated as a first-class architectural object. This paper names that boundary the stochastic-deterministic boundary (SDB): a four-part contract among a proposer, verifier, commit step, and reject signal that specifies how an LLM output becomes a system action. We argue that the SDB is the load-bearing primitive of production agent runtimes. Around this primitive, we organize agent runtime design into three concerns: Coordination, State, and Control. We present a catalog of six runtime patterns that compose the SDB differently across conversational, autonomous, and long-horizon agents: hierarchical delegation, scatter-gather plus saga, event-driven sequencing, shared state machine, supervisor plus gate, and human in the loop. For each pattern, we trace its lineage to distributed-systems concepts and identify what changes when the worker is stochastic. The paper contributes a five-step methodology for selecting runtime patterns, a diagnostic procedure that maps production failures to pattern weaknesses, and a failure mode called replay divergence, in which LLM-based consumers of a deterministic event log produce different downstream outputs under model-version or prompt changes. A stylized reliability decomposition separates per-call model variance from architectural momentum, motivating the claim that as model variance decreases, pattern choice and SDB strength become increasingly important levers for long-run reliability. We apply the methodology to five workloads and provide one runnable reference implementation for a 90-day contract-renewal agent.

Authors (1)

Summary

  • The paper introduces a formal methodology that uses the stochastic-deterministic boundary (SDB) contract to ensure production reliability.
  • It categorizes runtime concerns into Coordination, State, and Control while detailing six architecture patterns tailored for LLM agents.
  • Empirical validations demonstrate that optimal pattern selection enhances system traceability and auditability in diverse real-world workloads.

A Methodological Framework for Runtime Architecture Patterns in Production LLM Agents

Introduction and Thesis

This work introduces a formal methodology for designing and assembling runtime architectures for production-scale LLM agents, centering the concept of the stochastic-deterministic boundary (SDB) as the core architectural primitive. The SDB captures the runtime interface where the stochastic outputs of an LLM are transformed into deterministic system actions through a four-part contract: proposer, verifier, commit, and reject. This contract systematizes the mediation between probabilistic model outputs and deterministic system invariants, which is crucial for achieving production reliability as LLMs become increasingly capable and variance decreases.

Production agent reliability is modeled as $y(t) = \mu t + \sigma \xi(t)$, where the per-call variance $\sigma$ due to stochasticity compresses over time with new model generations, leaving architectural momentum $\mu$ (determined by the design of the runtime patterns and SDB strength) as the dominant factor in long-horizon reliability. Thus, architectural decisions, not model quality alone, govern the reliability and safety envelope of LLM-driven systems (2605.20173).
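To make the decomposition concrete, here is a minimal sketch of when the drift term (mu times t) overtakes the noise term. It assumes the noise process has zero mean and unit variance, so the noise magnitude stays near sigma at every t; the summary does not specify the noise process, and the function name `crossover_horizon` is hypothetical.

```python
def crossover_horizon(mu: float, sigma: float) -> float:
    """Smallest horizon t at which |mu| * t equals sigma.

    Assumption (not from the paper): the noise term has unit variance,
    so its typical magnitude is sigma regardless of t.
    """
    if mu == 0:
        # With no drift, the architecture term never overtakes the noise.
        return float("inf")
    return sigma / abs(mu)


# Shrinking sigma (better models) shortens the horizon after which
# the architectural term dominates.
older_model = crossover_horizon(mu=0.5, sigma=2.0)   # 4.0
newer_model = crossover_horizon(mu=0.5, sigma=0.5)   # 1.0
```

Under this assumption, a smaller sigma means the sign and size of mu decide outcomes sooner, which is the intuition behind treating architecture as the dominant long-horizon lever.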

The Stochastic-Deterministic Boundary: Definition and Empirical Validation

The SDB is defined as a four-part contract:

  1. Proposer: The LLM’s stochastic output.
  2. Verifier: Deterministic and systematic check over proposals (schema, policies, classifiers).
  3. Commit: Durable write or action once verification passes.
  4. Reject Signal: Typed feedback to the proposer on failed verification.
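The four parts can be sketched as a small retry loop. This is an illustrative sketch, not the paper's reference implementation: the `Reject` type, the `amount` field, the limit of 100, and the stub proposer standing in for an LLM call are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Reject:
    """Typed reject signal handed back to the proposer (hypothetical shape)."""
    code: str
    detail: str


def verify(proposal: dict) -> Optional[Reject]:
    """Deterministic verifier: a schema check followed by a policy check."""
    if not isinstance(proposal.get("amount"), (int, float)):
        return Reject("schema", "amount must be a number")
    if proposal["amount"] > 100:  # hypothetical policy limit
        return Reject("policy", "amount exceeds the limit of 100")
    return None


def run_boundary(
    propose: Callable[[Optional[Reject]], dict],
    ledger: List[dict],
    max_attempts: int = 3,
) -> bool:
    """Propose, verify, then commit on success or feed the reject back."""
    feedback: Optional[Reject] = None
    for _ in range(max_attempts):
        proposal = propose(feedback)   # proposer (stochastic in production)
        feedback = verify(proposal)    # verifier (deterministic)
        if feedback is None:
            ledger.append(proposal)    # commit: the only write path
            return True
    return False                       # nothing was committed


def stub_proposer(feedback: Optional[Reject]) -> dict:
    """Stand-in for an LLM call; reacts to the typed reject signal."""
    if feedback is None:
        return {"amount": 250}
    if feedback.code == "policy":
        return {"amount": 100}
    return {"amount": 0}


ledger: List[dict] = []
committed = run_boundary(stub_proposer, ledger)
```

Here the first proposal is rejected with a `policy` code, the second passes, and only the verified proposal reaches the ledger. The design point is that the commit step is unreachable without a passing verification.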

An audit of 21 LLM-to-action transitions across major frameworks (OpenAI/swarm, AutoGPT, LangChain Agents, CrewAI, Microsoft AutoGen) finds explicit SDB instantiations in 19. A separate classification of 21 post-mortems localizes 15 failures directly at the boundary, and 17 of the fixes strengthen one of its components. These cases illustrate that mis-specified verifier or reject semantics lead directly to safety and performance defects, supporting the claim that the SDB is not merely an abstraction but the operational locus of failure and reliability.

Three Orthogonal Concerns and Six Pattern Catalog

Three enduring runtime system concerns are identified—Coordination, State, and Control—each inheriting from classical distributed systems but requiring new pattern instantiations due to LLM stochasticity:

  • Coordination: Work decomposition and result composition, inherited from the actor model and sagas.
  • State: Memory and consistency, informed by CAP, event sourcing, and state machines.
  • Control: Oversight and gating, leveraging supervision and human-in-the-loop (HITL) paradigms.

The paper catalogs six patterns, with explicit mapping to their distributed systems antecedents and the associated SDB demarcation. The catalog includes: Hierarchical Delegation, Scatter-Gather plus Saga, Event-Driven Sequencing, Shared State Machine, Supervisor plus Gate, and Human in the Loop. For each, critical pathologies and corrective actions are classified, emphasizing failure diagnostics at the architecture level rather than model choice.

Pattern Selection and Methodological Procedures

A rigorous five-step methodology is prescribed for architecture selection:

  1. Runtime Classification: Classify workload as Conversational, Autonomous, or Long-Horizon—determining the dominant architectural concern.
  2. Spine Selection: Choose the State backbone (P3/P5) based on pause duration, reconstructibility, and world mutability predicates.
  3. Coordination Pattern: Select P1 or P2 depending on ownership, independence, and compensation requirements.
  4. Control Layer: Always include a deterministic gate if side effects exist; escalate to HITL if risk or audit requirements justify it.
  5. Build Sequence: Enforce dashboard-first development for observability, creating an actionable audit trail before deploying agentic automation.
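Steps 1 through 4 can be sketched as a small decision function. The predicates and their thresholds below are simplifying assumptions, not the paper's actual rules: the summary names the inputs (reconstructibility, world mutability, independence, compensation, side effects, risk) but not how they combine, pause duration is omitted, and the `Workload` fields are hypothetical. Step 5 (build sequence) is a process rule and is not encoded.

```python
from dataclasses import dataclass
from typing import List

RUNTIME_CLASSES = {"conversational", "autonomous", "long-horizon"}


@dataclass
class Workload:
    """Hypothetical description of a workload; field names are illustrative."""
    runtime_class: str
    world_is_mutable: bool
    state_reconstructible: bool
    subtasks_independent: bool
    needs_compensation: bool
    has_side_effects: bool
    high_risk_or_audited: bool


def select_patterns(w: Workload) -> List[str]:
    # Step 1: the runtime class must be one of the three named classes.
    if w.runtime_class not in RUNTIME_CLASSES:
        raise ValueError(f"unknown runtime class: {w.runtime_class}")

    chosen: List[str] = []

    # Step 2 (assumed rule): avoid replaying events when the world can
    # change underneath the log or state cannot be rebuilt from it.
    if w.world_is_mutable or not w.state_reconstructible:
        chosen.append("shared state machine")
    else:
        chosen.append("event-driven sequencing")

    # Step 3 (assumed rule): independent subtasks that need undo steps
    # map to the saga-style pattern; otherwise delegate hierarchically.
    if w.subtasks_independent and w.needs_compensation:
        chosen.append("scatter-gather plus saga")
    else:
        chosen.append("hierarchical delegation")

    # Step 4: a deterministic gate whenever side effects exist,
    # with human review added for high-risk or audited workloads.
    if w.has_side_effects:
        chosen.append("supervisor plus gate")
    if w.high_risk_or_audited:
        chosen.append("human in the loop")
    return chosen


renewal = Workload("long-horizon", True, False, True, True, True, True)
plan = select_patterns(renewal)
```

Encoding the choice as a pure function of recorded predicates makes each selection reviewable after the fact, which matches the audit intent of the methodology.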

Every architectural choice is recorded in a canonical six-line artifact encompassing rationale and diagnostic signatures for misfit. This enables systematic review, replication, and post-hoc audit.

Failure Modes, Reliability Analysis, and Diagnostics

The failure diagnostic procedure distinguishes between functional bugs, architecture-related drift (flat or negative $\mu$), and cross-version replay divergence. The last of these is an SDB-specific failure mode: replaying event logs through a different model version yields divergent system states, exposing flaws in pattern selection (especially with event-driven patterns in mutable environments).
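A replay-divergence check can be sketched by running the same event log through two consumers and comparing their outputs. In this sketch the two consumers are deterministic stand-ins for LLM calls at different model or prompt versions; the event shape, the `usage` field, and the thresholds are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, int]
Consumer = Callable[[Event], str]


def find_divergence(
    events: List[Event], old: Consumer, new: Consumer
) -> List[Tuple[int, str, str]]:
    """Return (index, old_output, new_output) for each event whose
    downstream output changes between the two consumer versions."""
    diverged = []
    for index, event in enumerate(events):
        before, after = old(event), new(event)
        if before != after:
            diverged.append((index, before, after))
    return diverged


def consumer_v1(event: Event) -> str:
    # Stand-in for the original model version.
    return "renew" if event["usage"] >= 50 else "churn"


def consumer_v2(event: Event) -> str:
    # Stand-in for an updated model or prompt with a shifted decision.
    return "renew" if event["usage"] >= 60 else "churn"


log = [{"usage": 40}, {"usage": 55}, {"usage": 80}]
report = find_divergence(log, consumer_v1, consumer_v2)
```

The log itself is identical in both runs; only the consumer changed. Any non-empty report means downstream state derived from replay would differ across versions, even though no event was altered.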

Statistically, as $\sigma$ contracts with better base models, mis-specification and lack of rigor at the SDB and in the selection of runtime patterns become the dominant source of degraded reliability.

Validation via Reference and Worked Applications

Application of the methodology to five diverse workloads demonstrates discriminative power and robustness. Critically, even within the same runtime class (e.g., Long-Horizon), different SDB pattern selections are shown to be optimal depending on state reconstructibility and world volatility. A reproducible reference implementation on the IBM Telco Churn dataset operationalizes the methodology, exercising all six patterns and generating concrete operational traces.

Implications and Prospects

Theoretical implications: The SDB and its associated methodology shift the locus of agent system reliability from model capabilities to runtime architecture, aligning agent engineering with mature distributed systems principles while retaining LLM-specific stochasticity controls.

Practical implications: Agent system teams are now equipped with a reproducible, auditable methodology and diagnostic catalog that can be applied agnostically across frameworks and model generations, ensuring system actions remain legible and tractable even as models evolve.

Future patterns: The field is expected to develop new runtime patterns for shared memory, tenant isolation, and cross-runtime handoffs, guided by the SDB-centered discovery procedure outlined.

Conclusion

This paper establishes the stochastic-deterministic boundary as the essential architectural primitive for LLM agent systems, systematizing the contract where stochastic model outputs become determinate system actions. By introducing a taxonomy of runtime concerns, a working catalog of architecture patterns, and a validated methodology for pattern selection and failure diagnosis, the work enables practitioners to render agent reliability a function of explicit engineering choice rather than latent model traits. The catalog and methodologies are designed to evolve, but the SDB framework is positioned as the central invariant for production agent architectures in the era of increasingly determinate LLMs (2605.20173).