Private State Interactive Tasks
- Private State Interactive Tasks (PSITs) are interactive protocols where agents secretly maintain internal state to produce consistent, privacy-preserving outputs across multi-round interactions.
- They integrate secure computation, differential privacy, and explicit state architectures to overcome stateless agent limitations, achieving self-consistency performance often above 80% in empirical tests.
- PSITs find applications in language models, robotics, and federated analytics, offering robust solutions for private memory management and secure autonomous decision-making.
Private State Interactive Tasks (PSITs) are a formal class of interactive protocols in which an agent must generate, retain, and exploit hidden state throughout an interaction, producing public outputs that reflect consistent use of the secret while not leaking its identity prematurely. PSITs underpin critical advances in agent modeling, privacy-preserving analytics, secure computation, and robust autonomous decision-making, particularly when agents must act over multiple rounds with persistent internal memory inaccessible to external observers. The notion is grounded in impossibility results for stateless and public-only agents, empirical testing frameworks for state maintenance, implementations for differential privacy in streaming and federated scenarios, secure multicore hardware isolation, graph-based exploration for reasoning in interactive benchmarks, and partially observable planning in robotics and language agents.
1. Formal Definition and Core Properties
A PSIT is defined over a turn-based interaction as follows (Baldelli et al., 11 Jan 2026). At round t, a user issues input x_t and the assistant returns output y_t. Key elements are:
- Secret Initialization: At t = 0, the assistant privately samples a secret s ∈ S, with S some domain.
- Rule-Based Response: At each turn, the assistant applies deterministic task rules y_t = f(s, h_t, x_t), where h_t is the public history and x_t is the current input, to produce y_t.
- Consistency: There must exist a single s ∈ S such that y_t = f(s, h_t, x_t) for all turns t.
- Secrecy: For all interaction prefixes prior to the point of unique determination, at least two secrets compatible with the history exist; the identity of s cannot be inferred by the user.
An agent restricted to the public history cannot guarantee both secrecy and consistency if |S| ≥ 2 (Baldelli et al., 11 Jan 2026). This formalizes the generative-retention gap: stateless agents can invent secrets (e.g., via Chain-of-Thought) but cannot persist them reliably against interactive querying.
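The definition above can be made concrete with a minimal sketch. The toy "guess the number" task, its secret domain, and the rule function below are illustrative assumptions, not from the paper; the sketch shows the secret sampled once, a deterministic rule f applied per turn, and how secrecy corresponds to at least two secrets remaining compatible with the public transcript:

```python
# Minimal PSIT sketch (assumed toy task, not from the paper):
# "guess the number" over secret domain S = {1..5}.

S = [1, 2, 3, 4, 5]  # secret domain; |S| >= 2 makes the task nontrivial

def f(s, history, x):
    """Deterministic task rule: compare the guess x to the secret s."""
    if x == s:
        return "correct"
    return "higher" if s > x else "lower"

def compatible(history):
    """Secrets still consistent with the public transcript so far."""
    return [s for s in S
            if all(f(s, history[:i], x) == y
                   for i, (x, y) in enumerate(history))]

# A stateful agent fixes s at t = 0 and answers with f at every turn.
secret = 3
history = []
for guess in [2, 5, 3]:
    reply = f(secret, history, guess)
    history.append((guess, reply))
    # Secrecy holds while >= 2 secrets remain compatible with the prefix.
    print(guess, reply, "compatible:", compatible(history))
```

Consistency here is exactly the property that one element of S explains every recorded (input, output) pair; once `compatible` shrinks to a singleton, the secret is uniquely determined and secrecy no longer applies.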
2. Theoretical Limits and Impossibility Results
The impossibility theorem states that no Public-Only Chat Agent (POCA)—one whose output is a function of only the public dialogue—can simultaneously guarantee consistency (always acting in accordance with one secret) and secrecy (never leaking which secret) in nontrivial PSITs (Baldelli et al., 11 Jan 2026). The formal proof leverages the existence of incompatible secrets at a given history, forcing POCA to either leak information by differentiating outputs or to violate consistency via equivocating between possible secrets.
This holds across agent modalities and is empirically validated: vanilla stateless LLMs and retrieval-based memory agents deliver self-consistency accuracy well below 20% on tasks such as Hangman and Diagnosis Simulator, with no retrieval mechanism enabling true latent state maintenance (Baldelli et al., 11 Jan 2026).
3. Architectures and Protocols for Explicit Private State
Overcoming POCA limitations requires explicit private memory architectures. In language agents, this is realized via private working memory, a persistent block invisible to the user and segmented into:
- Goals/Plans
- Facts/Knowledge (including the secret)
- Active Notes (recent inference)
Memory is updated per turn using tools (overwrite, append/delete, patch/replace) called either autonomously or through a deterministic workflow (Baldelli et al., 11 Jan 2026). This design restores self-consistency to the 80–100% range, matches chain-of-thought upper bounds, and keeps memory compact (under 100 tokens) rather than relying on bloated context windows.
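The segmented memory block and its update tools can be sketched as follows. This is a hypothetical interface; the paper names the tool operations (overwrite, append/delete, patch/replace) and segments, but the class and method signatures here are assumptions:

```python
# Sketch of a private working memory block (hypothetical interface;
# the segment names follow the paper, the API does not).

class PrivateMemory:
    """Persistent, user-invisible state segmented into goals, facts, notes."""

    def __init__(self):
        self.segments = {"goals": [], "facts": [], "notes": []}

    def overwrite(self, segment, items):
        self.segments[segment] = list(items)

    def append(self, segment, item):
        self.segments[segment].append(item)

    def delete(self, segment, item):
        self.segments[segment].remove(item)

    def patch(self, segment, old, new):
        i = self.segments[segment].index(old)
        self.segments[segment][i] = new

    def render(self):
        # Compact serialization injected into the agent's hidden context.
        return "\n".join(f"[{k}] " + "; ".join(v)
                         for k, v in self.segments.items() if v)

mem = PrivateMemory()
mem.append("facts", "secret_word=OTTER")     # the secret lives here
mem.append("goals", "answer letter guesses consistently")
mem.patch("facts", "secret_word=OTTER", "secret_word=OTTER; guessed={O,T}")
print(mem.render())
```

Because the rendered block is injected only into the assistant's hidden context, the secret persists across turns without ever appearing in the public transcript.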
Parallel strategies appear in robotics PSITs modeled as POMDPs. Here the agent maintains a hidden belief state updated through Bayes-filtering of observations, using LLM-based evaluators and planners interleaved with action execution (Sun et al., 2023). In federated analytics, PSITs manifest as streaming, interactive DP protocols maintaining private internal state subject to pan-privacy constraints (Amin et al., 2019, McMillan et al., 2022).
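The hidden-belief update in the POMDP formulation is the standard discrete Bayes filter; the sketch below is a generic textbook version (not the LLM-POP implementation), with an assumed two-hypothesis mass-estimation example:

```python
# Illustrative discrete Bayes filter for the hidden belief state in a
# POMDP-style PSIT (generic textbook update, not LLM-POP itself).

def bayes_update(belief, obs, likelihood):
    """belief: dict state -> prob; likelihood(obs, state) -> P(obs | state)."""
    posterior = {s: p * likelihood(obs, s) for s, p in belief.items()}
    z = sum(posterior.values())            # normalizing constant
    return {s: p / z for s, p in posterior.items()}

# Two hypotheses about an occluded block's mass; a "slow slide"
# observation is more likely under the heavy hypothesis.
def lik(obs, s):
    if obs == "slow":
        return {"light": 0.2, "heavy": 0.8}[s]
    return {"light": 0.8, "heavy": 0.2}[s]

belief = {"light": 0.5, "heavy": 0.5}
belief = bayes_update(belief, "slow", lik)
print(belief)   # posterior mass shifts toward "heavy"
```

The agent's planner then conditions on this belief rather than on raw observations, which is what makes the internal state private: an observer sees actions, not the posterior.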
4. Empirical Benchmarks and Evaluation Methodologies
Empirical validation relies on self-consistency testing protocols (Baldelli et al., 11 Jan 2026):
- Forked Evaluation: After a short interaction, the agent is queried in parallel forks, one per plausibly compatible secret; exactly one "yes" response (on the ground-truth secret) across forks constitutes self-consistent success.
- Leakage, State Substitution, Over-Confirmation, All Denial: failure modes that characterize deviations from true state-keeping.
| Method | Hangman Consistency (%) | Diagnosis Consistency (%) |
|---|---|---|
| Vanilla LLM | 2–12 | 2–26 |
| Retrieval-Memory | 0–14 | 0–50 |
| Private CoT | 82–98* | 54–82* |
| Workflow Agent | 76–100* | 56–96* |
(* = statistically significant; Baldelli et al., 11 Jan 2026)
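The forked-evaluation protocol above can be sketched as follows. The agent interface (`respond`) and the stand-in `TruthfulAgent` are hypothetical; the outcome labels follow the taxonomy from the benchmark:

```python
# Sketch of forked self-consistency evaluation (hypothetical agent
# interface): after a shared prefix, the agent is forked and probed about
# each plausible secret; exactly one "yes", on the ground-truth secret,
# counts as self-consistent success.

import copy

def forked_eval(agent, prefix, candidates, truth):
    """Fork the agent per candidate secret and classify the outcome."""
    answers = {}
    for c in candidates:
        fork = copy.deepcopy(agent)          # independent fork per probe
        for turn in prefix:
            fork.respond(turn)
        answers[c] = fork.respond(f"Is your secret {c}? Answer yes or no.")
    yeses = [c for c, a in answers.items() if a == "yes"]
    if yeses == [truth]:
        return "self-consistent"
    if len(yeses) == 0:
        return "all-denial"
    if len(yeses) > 1:
        return "over-confirmation"
    return "state-substitution"              # single "yes", wrong secret

class TruthfulAgent:                         # stand-in for a real agent
    def __init__(self, secret):
        self.secret = secret
    def respond(self, msg):
        return "yes" if f"secret {self.secret}?" in msg else "no"

print(forked_eval(TruthfulAgent("flu"), ["turn1"], ["flu", "cold"], "flu"))
```

Deep-copying the agent per probe is what makes the forks independent: each probe sees the same shared prefix but cannot contaminate the others.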
In interactive planning (LLM-POP), task success rates for GPT-4 reach 80–100% across block-stacking, mass estimation, and partial-observation tasks in real and simulated robot trials (Sun et al., 2023).
Graph-based exploration in ARC-AGI-3 PSITs employs systematic directed-graph tracking of visual states and transitions, prioritizing untested but salient actions; a median of 30 of 52 levels is solved without learning, substantially outperforming LLM-driven agents (Rudakov et al., 30 Dec 2025).
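The directed-graph tracking strategy can be sketched generically. The world model, action names, and salience score below are assumptions for illustration, not the ARC-AGI-3 agent's actual representation:

```python
# Sketch of directed-graph state tracking with frontier prioritization
# (generic reconstruction; actions and salience scores are assumed).

from collections import deque

def explore(start, step, actions, salience, budget=1000):
    """step(state, action) -> next_state; try untested salient actions first."""
    graph = {start: {}}                       # state -> {action: next_state}
    frontier = deque([start])
    while frontier and budget:
        state = frontier.popleft()
        untried = [a for a in actions if a not in graph[state]]
        for a in sorted(untried, key=salience, reverse=True):
            nxt = step(state, a)
            graph[state][a] = nxt             # record the observed transition
            if nxt not in graph:              # new state: add to the frontier
                graph[nxt] = {}
                frontier.append(nxt)
            budget -= 1
            if not budget:
                break
    return graph

# Toy 1-D world: positions 0..3, moves clamp at the edges.
def step(s, a):
    return max(0, min(3, s + (1 if a == "right" else -1)))

g = explore(0, step, ["left", "right"], salience=lambda a: a == "right")
print(sorted(g))   # all four positions discovered
```

The graph itself is the agent's private state: it records which transitions have been tested, so the policy never re-derives the environment's structure from scratch.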
5. Privacy-Preserving PSITs: Pan-Privacy, Federated Protocols, and Differential Privacy
Pan-private algorithms maintain a running internal state protected by pure differential privacy, permitting clear sample access but DP-protecting both interim memory and output under any single intrusion (Amin et al., 2019). For uniformity testing over a domain of size k, the pure pan-private sample complexity grows as roughly k^(2/3), interpolating strictly between the central (≈ k^(1/2)) and noninteractive local (≈ k) regimes and thereby separating the three privacy models.
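The core pan-privacy idea, noising the internal state before any data arrives so that a single intrusion sees a DP view, can be illustrated with a simple counter. This is a minimal textbook-style construction, not the uniformity tester from the paper; it uses the fact that the difference of two Exponential(eps) draws is Laplace(1/eps):

```python
# Minimal pan-private counter sketch (illustrative construction, not
# Amin et al.'s tester): the internal state carries Laplace noise from
# initialization onward, so an intruder reading the state at any time
# observes an eps-DP function of the stream processed so far.

import random

def laplace(eps):
    # Difference of two Exponential(eps) variates is Laplace with scale 1/eps.
    return random.expovariate(eps) - random.expovariate(eps)

class PanPrivateCounter:
    def __init__(self, eps):
        self.eps = eps
        self.state = laplace(eps)   # noise baked in before any data arrives

    def update(self, bit):          # bit in {0, 1}: one stream element
        self.state += bit           # state stays (true count + Laplace noise)

    def output(self):
        # Fresh noise for the published estimate, protecting the output
        # independently of the (already protected) internal state.
        return self.state + laplace(self.eps)

random.seed(0)
c = PanPrivateCounter(eps=1.0)
for bit in [1, 0, 1, 1, 0, 1]:
    c.update(bit)
print(round(c.output(), 2))   # noisy estimate of the true count, 4
```

Changing any single stream element shifts the state by at most 1, which the Laplace(1/eps) noise masks; that is what makes the state itself, not just the output, differentially private.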
PSITs in federated statistics orchestrate multi-round, client-local DP randomizers (private one-hot encoding), with cohort-secure aggregation amplifying privacy. Advanced composition quantifies end-to-end leakage (McMillan et al., 2022). Similar guarantees arise in interactive private clustering: DP k-means with convergent orientation constraints ensures efficient convergence (≤2× Lloyd's iterations) while satisfying an optimal privacy–utility trade-off (Lu et al., 2020).
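The client-local DP randomizer can be sketched with a standard RAPPOR-style one-hot encoding, each bit flipped with probability 1/(1 + e^(eps/2)); this is a generic construction chosen for illustration, and the parameters are not those of the deployed protocol:

```python
# Standard RAPPOR-style one-hot randomizer sketch (generic construction;
# parameter choices are illustrative, not the deployed protocol's).

import math, random

def randomize_onehot(value, k, eps):
    """eps-LDP report of value in {0..k-1} via per-bit randomized response."""
    # Two bits differ between any pair of inputs, hence eps/2 budget each.
    p = 1.0 / (1.0 + math.exp(eps / 2))   # per-bit flip probability
    onehot = [1 if i == value else 0 for i in range(k)]
    return [b ^ (random.random() < p) for b in onehot]

def debias(reports, eps):
    """Unbiased frequency estimates from aggregated noisy reports."""
    p = 1.0 / (1.0 + math.exp(eps / 2))
    n, k = len(reports), len(reports[0])
    sums = [sum(r[i] for r in reports) for i in range(k)]
    # E[report_i] = p + freq_i * (1 - 2p), inverted per coordinate.
    return [(s - n * p) / (n * (1 - 2 * p)) for s in sums]

random.seed(1)
reports = [randomize_onehot(0, k=3, eps=4.0) for _ in range(2000)]
est = debias(reports, eps=4.0)
print([round(x, 2) for x in est])   # coordinate 0 estimate near 1.0
```

The server only ever aggregates the flipped bits, so the per-round leakage is bounded by eps per client, and multi-round totals follow from composition as the text describes.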
6. Hardware and Systems Security for PSITs
For low-level secure computation, IRONHIDE multicore architectures partition processor resources into secure/insecure clusters, statically pinning all PSIT threads to secure clusters with spatial isolation for caches, TLBs, DRAM, and interconnects (Omar et al., 2019). This design maintains private microarchitecture state with no per-interaction flushing, achieving strong noninterference, eliminating side-channel exposure, and yielding a 2.1× speedup over MI6-style static purging baselines for interactive user and OS tasks.
7. Limitations, Open Challenges, and Future Research
Notable constraints are observed:
- Most empirical PSIT benchmarks (Hangman, Diagnosis Simulator) are discrete and short-horizon; scaling to richer or continuous latent state spaces is open (Baldelli et al., 11 Jan 2026).
- The opacity of private memory to auditing, especially in safety-critical or explainability-sensitive deployments, remains an unsolved problem.
- Hybrid architectures (vector-based/state-symbolic, multi-tool workflows) for complex PSITs and improved sim-to-real transfer in high-dimensional environments are needed (Baldelli et al., 11 Jan 2026, Sun et al., 2023).
- Privacy amplification relies on non-collusion assumptions (federated analytics), while sampling techniques for DP clustering require careful zone-calibration (McMillan et al., 2022, Lu et al., 2020).
- Hardware isolation for PSITs must continually evolve to resist new microarchitectural attacks as interaction frequency and complexity rise (Omar et al., 2019).
Ongoing research is extending the theoretical frameworks, empirical protocols, and architectural designs to address longer-horizon PSITs, richer forms of private state (partial/stochastic secrets), complex tool orchestration, and real-world deployment constraints.