Factored Controller with Typed Interfaces
- The paper introduces a factored controller that decomposes high-level human-robot dialogue into statically-typed, auditable modules for robust interaction.
- It employs a POMDP framework with explicit type-checking and functional mappings, enabling clear input/output contracts and context persistence.
- The approach enforces evidence-based outputs through faithfulness constraints and controlled memory design, ensuring reliable, verifiable system claims.
A factored controller with typed interfaces is a systems architecture for sequential decision processes, applied in the JANUS cognitive assistant to decompose high-level human-robot interaction (HRI) into statically-typed, auditable modules with clearly defined input/output contracts. This approach enables persistent context, robust clarification of underspecified requests, evidence-grounded responses, and guarantees of verifiability and modularity over extended interactions. The architecture models dialogue as a partially observable Markov decision process (POMDP), with controller design centered on explicit, type-checked reasoning steps, agentic memory persistence, and faithfulness constraints that enforce evidence-based claims in system outputs (Belcamino et al., 31 Jan 2026).
1. POMDP Formulation and State Factorization
The interaction loop is formalized as a POMDP $(\mathcal{S}, \mathcal{A}, \mathcal{O}, T, Z, R, \gamma)$, where $\mathcal{S}$ is the latent interaction-state space, $\mathcal{A}$ the action space, $\mathcal{O}$ the observation space, $T$ the state transition kernel, $Z$ the observation model, $R$ the reward function, and $\gamma$ the discount factor. At each dialogue turn $t$, the state $s_t \in \mathcal{S}$ is factored into:
- Active domain
- Underlying human goal
- Intent schema
- Parameter assignment for the intent
- Memory, decomposed into recent history, a compact core, and an archival store
This structured factorization underpins decomposition of the policy into well-specified modules, each transforming and type-checking contextual variables, rather than learning a monolithic policy.
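The factorization above can be pictured as a typed record in which each module touches only the factors it owns. The following is a minimal sketch, not the JANUS implementation: the class names, field names, and the `set_intent` helper are hypothetical, chosen only to mirror the five factors listed above.

```python
from dataclasses import dataclass, field, replace
from typing import Any


@dataclass(frozen=True)
class Memory:
    """Three memory roles (field names are illustrative assumptions)."""
    recent: tuple[str, ...] = ()   # bounded recent history
    core: tuple[str, ...] = ()     # compact core memory
    archive: tuple[str, ...] = ()  # archival store


@dataclass(frozen=True)
class InteractionState:
    """One factored interaction state per dialogue turn."""
    domain: str                    # active domain
    goal: str                      # underlying human goal
    intent: str                    # intent schema name
    params: dict[str, Any] = field(default_factory=dict)  # parameter assignment
    memory: Memory = field(default_factory=Memory)


def set_intent(state: InteractionState, intent: str,
               params: dict[str, Any]) -> InteractionState:
    """Hypothetical module step: returns a new state with only the
    intent and parameter factors changed; other factors are carried over."""
    return replace(state, intent=intent, params=dict(params))


state = InteractionState(domain="diet", goal="plan dinner", intent="unknown")
updated = set_intent(state, "suggest_recipe", {"meal": "dinner"})
```

Because the record is frozen, a module cannot mutate a factor in place; it has to return a new state, which keeps each step easy to audit.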
2. Functional Decomposition: Factored Modules and Typed Interfaces
JANUS operationalizes factored control through a pipeline of statically-typed intermediate variables and functional mappings, each described by a type signature over its inputs and outputs. The pipeline modules are:
- Scope Detection
- Intent Recognition
- Intent Postprocess
- Memory Retrieval
- Inner Speech
- Query Generation
- Tool Execution
- Outer Speech
- Memory Update
Each module’s typed interface enforces correct structuring of information flow between controller steps. Typed outputs are statically checked: Intent Recognition must output a single intent schema and a dictionary of named slots matching that schema, and Inner Speech explicitly gates downstream processing.
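As an illustration of the Intent Recognition contract (one schema plus a slot dictionary that matches it), here is a minimal sketch. It uses runtime validation in Python as a stand-in for the static checking described above; `IntentSchema`, `IntentResult`, and the `log_meal` example are hypothetical names, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class IntentSchema:
    name: str
    slots: dict[str, type]  # slot name -> expected Python type (assumed encoding)


@dataclass(frozen=True)
class IntentResult:
    """Output contract: exactly one schema and slots that conform to it."""
    schema: IntentSchema
    slots: dict[str, Any]

    def __post_init__(self) -> None:
        unknown = set(self.slots) - set(self.schema.slots)
        if unknown:
            raise TypeError(f"slots not in schema {self.schema.name}: {sorted(unknown)}")
        for name, value in self.slots.items():
            expected = self.schema.slots[name]
            # None marks a slot that is still unfilled; it is left for a
            # later readiness check rather than being defaulted here.
            if value is not None and not isinstance(value, expected):
                raise TypeError(f"slot {name!r} expects {expected.__name__}")


log_meal = IntentSchema("log_meal", {"food": str, "grams": int})
partial = IntentResult(log_meal, {"food": "apple", "grams": None})
```

A result with an undeclared slot or a wrongly typed value fails at construction time, so a malformed output never reaches the next module.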
3. Control-Flow, Gating, and Constraints
Critical gating decisions are codified via predicates:
- Information-sufficiency: gates whether the working context is assembled locally or augmented with archival retrieval. When retrieval is triggered, a top-$k$ similarity search over the archival store produces the retrieved entries.
- Execution-readiness: ensures that fully-typed parameters are present. The turn may proceed only when every required parameter is filled; otherwise the decision must be set to Clarify or Reject. No missing parameter is silently defaulted.
- Tool-grounding: decides whether a tool call is required. It holds when the evidence requirements are not satisfied by the working context.
Modules such as Inner Speech implement these gating predicates, controlling whether tool execution and outer speech proceed or a clarification is issued.
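The execution-readiness gate can be sketched as a small pure function. This is an assumption-laden illustration: the `Decision` enum, the function names, and the rule that an out-of-scope request maps to Reject while a missing parameter maps to Clarify are choices made for the example, since the text only states that the outcome must be Clarify or Reject when parameters are missing.

```python
from enum import Enum
from typing import Any, Mapping, Sequence


class Decision(Enum):
    PROCEED = "proceed"
    CLARIFY = "clarify"
    REJECT = "reject"


def missing_params(required: Sequence[str], params: Mapping[str, Any]) -> list[str]:
    """Names of required parameters that are absent or still None."""
    return [name for name in required if params.get(name) is None]


def inner_speech_gate(in_scope: bool, required: Sequence[str],
                      params: Mapping[str, Any]) -> tuple[Decision, list[str]]:
    """Return the gating decision and the parameters to ask about.

    No default value is ever substituted for a missing parameter.
    """
    if not in_scope:                      # assumed trigger for Reject
        return Decision.REJECT, []
    missing = missing_params(required, params)
    if missing:
        return Decision.CLARIFY, missing  # ask the user instead of guessing
    return Decision.PROCEED, []


decision, to_ask = inner_speech_gate(True, ["food", "meal"], {"food": "apple"})
```

Here `decision` is `Decision.CLARIFY` and `to_ask` is `["meal"]`, so the next outer-speech turn would be a clarification question rather than an answer.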
4. Data-Flow and Module Synchronization
The single-turn data flow follows an explicit, repeatable sequence:
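- If the gating predicates call for a tool, Query Generation produces a query and Tool Execution runs it; otherwise the tool step is skipped.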
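The conditional step in the turn sequence can be sketched as follows, under the assumption that the branch condition is whether a tool call is needed. The function name `tool_branch` and the two callables are hypothetical placeholders for Query Generation and Tool Execution.

```python
from typing import Callable, Optional


def tool_branch(needs_tool: bool,
                make_query: Callable[[], str],
                run_tool: Callable[[str], list[str]]) -> tuple[Optional[str], list[str]]:
    """Run Query Generation then Tool Execution only when a tool is needed.

    Returns (query, evidence); when the branch is skipped the query is
    None and the evidence list is empty.
    """
    if needs_tool:
        query = make_query()
        return query, run_tool(query)
    return None, []


# Toy stand-ins for the two modules (illustrative only).
query, evidence = tool_branch(
    True,
    make_query=lambda: "calories:apple",
    run_tool=lambda q: [f"result for {q}"],
)
skipped = tool_branch(False, lambda: "unused", lambda q: ["unused"])
```

Keeping the branch in one place makes it explicit that no tool output can enter the turn unless the gate allowed it.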
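Median latencies for JANUS modules in dietary-assistant experiments were: Scope Detection (SD), 0.37 s; Intent Recognition (IR), 0.26 s; Inner Speech (IS), 0.81 s; Query Generation (QG), 0.74 s; Outer Speech (OS), 0.27 s. At the turn level, median latencies were 0.37 s for domain routing, 1.7 s for a clarification, and 2.5 s for an answer (Belcamino et al., 31 Jan 2026).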
5. Memory Design and Controlled Consolidation
JANUS introduces a memory agent, factored into three roles:
- Recent history: bounded buffer (prompt-sized, rapid lookup)
- Core memory: compact store (semantically deduplicated, capacity-limited)
- Archival store: indexed for semantic retrieval, not in the immediate working set
Controlled consolidation and revision operators manage transfer, deduplication, and contradiction-resolution (e.g., new user facts supersede older entries). Capacity constraints on the recent-history buffer and the core memory, together with fixed-$k$ retrieval from the archival store, enforce scalability and predictable computational cost.
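A minimal sketch of the three memory roles with bounded capacities and fixed-k retrieval is given below. Everything here is an assumption made for illustration: the class and method names are hypothetical, entries evicted from the recent buffer are simply moved to the archive, and token-overlap (Jaccard) similarity stands in for the semantic index mentioned above.

```python
from collections import deque


class AgentMemory:
    def __init__(self, recent_cap: int = 4, core_cap: int = 3, k: int = 2) -> None:
        self.recent: deque[str] = deque(maxlen=recent_cap)  # bounded recent history
        self.core: dict[str, str] = {}                      # compact core: key -> fact
        self.archive: list[str] = []                        # archival store
        self.core_cap = core_cap
        self.k = k

    def observe(self, utterance: str) -> None:
        """Append to recent history; the entry about to be evicted is archived."""
        if len(self.recent) == self.recent.maxlen:
            self.archive.append(self.recent[0])
        self.recent.append(utterance)

    def remember_fact(self, key: str, value: str) -> None:
        """Revision: a new fact under the same key supersedes the old one."""
        if key not in self.core and len(self.core) >= self.core_cap:
            oldest = next(iter(self.core))                  # oldest inserted key
            self.archive.append(f"{oldest} {self.core.pop(oldest)}")
        self.core[key] = value

    def retrieve(self, query: str) -> list[str]:
        """Return at most k archived entries ranked by token overlap."""
        q = set(query.lower().split())

        def score(entry: str) -> float:
            e = set(entry.lower().split())
            return len(q & e) / len(q | e) if (q | e) else 0.0

        return sorted(self.archive, key=score, reverse=True)[: self.k]


memory = AgentMemory(recent_cap=2, core_cap=2, k=1)
for text in ["I ate pasta for lunch", "I walked the dog", "hello", "thanks"]:
    memory.observe(text)
memory.remember_fact("allergy", "nuts")
memory.remember_fact("allergy", "none")   # supersedes the earlier entry
hits = memory.retrieve("what did I eat for lunch")
```

Because every store has a fixed bound and retrieval returns at most `k` items, the amount of context assembled per turn stays predictable regardless of how long the interaction runs.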
6. Evidence Grounding, Faithfulness, and Auditable Reasoning
A central design constraint is that all system claims made in outer speech during Proceed turns must be verifiable from an evidence bundle $E_t$. For each natural language response $y_t$, denoting the set of atomic claims as $C(y_t)$ and the set of supported claims as $S(E_t)$, the following faithfulness constraint is enforced:

$$C(y_t) \subseteq S(E_t)$$
This conjunction of typed interfaces, explicit slot checks, and evidence-based claim restriction guarantees verifiable interaction, eliminating silent parameter defaulting and restricting claims to those justified by the working context or known safe defaults.
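The faithfulness constraint amounts to a subset check between the claims in a response and the claims the evidence supports. The sketch below assumes claims have already been extracted and normalized into strings, so exact set membership stands in for whatever claim extraction and support test the system actually uses; the function names and example strings are hypothetical.

```python
def unsupported_claims(claims: set[str], supported: set[str],
                       safe_defaults: frozenset[str] = frozenset()) -> set[str]:
    """Claims in the response that neither the evidence nor a known
    safe default justifies."""
    return claims - (supported | safe_defaults)


def is_faithful(claims: set[str], supported: set[str],
                safe_defaults: frozenset[str] = frozenset()) -> bool:
    """True when every atomic claim is covered (the subset constraint)."""
    return not unsupported_claims(claims, supported, safe_defaults)


evidence_claims = {"apple: 52 kcal per 100 g"}
response_claims = {"apple: 52 kcal per 100 g", "apple: high in protein"}
violations = unsupported_claims(response_claims, evidence_claims)
```

In this toy case `violations` contains the protein claim, so the response would be blocked or revised before being spoken.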
7. Scalability, Modularity, and Domain Extensibility
By isolating core modules with typed input–output signatures, the architecture allows each module to be independently implemented (e.g., via LLM-based classifier, prompt-template, or domain-specific LM), so long as type contracts are honored. A domain-customization layer separates schemas, tools, and prompt templates, enabling new domains or capabilities without disrupting existing logic. No global optimizer is assumed; module-level independence enhances robustness and enables targeted improvements.
The architecture has demonstrated high agreement with curated references in domain-specific dietary tasks, alongside practical latency profiles, supporting factored reasoning as a tractable path to scalable, auditable, and evidence-grounded robot assistance over multi-turn horizons (Belcamino et al., 31 Jan 2026).