Semantic policy enforcement for low-level CUA tools

Develop semantic security policies, and the supporting infrastructure they require, for low-level Computer Use Agent (CUA) tools such as click and find, so that data-flow restrictions can be enforced during plan execution. Concretely, specify mechanisms such as planner-provided intent annotations on tool calls or website-provided metadata restricting permissible actions, so that such policies become enforceable.

Background

Within the Dual-LLM architecture, control flow is isolated by separating privileged planning from quarantined perception, but data flow remains vulnerable because the perception model’s outputs can steer execution. Standard information-flow security policies can mitigate such data-flow risks in domains with semantically rich tools.
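To make "semantically rich" concrete, the following is a minimal sketch of an information-flow policy attached to a high-level tool. It is not taken from the paper: `Labeled`, `PolicyViolation`, the `send_email` tool, and the two provenance labels `"user"` and `"perception"` are all hypothetical names chosen for illustration. The point is only that a tool with a meaningful argument (a recipient) gives the policy something to talk about.

```python
from dataclasses import dataclass


class PolicyViolation(Exception):
    """Raised when a tool call would violate the data-flow policy."""


@dataclass(frozen=True)
class Labeled:
    """A value paired with a provenance label (hypothetical, for illustration)."""
    value: str
    source: str  # "user" = trusted query, "perception" = quarantined model output


def send_email(recipient: Labeled, body: Labeled) -> str:
    # Policy: the recipient must come from the trusted user query.
    # The body may contain quarantined data, since it does not decide
    # where information is sent.
    if recipient.source != "user":
        raise PolicyViolation(
            f"recipient came from untrusted source {recipient.source!r}"
        )
    # A real tool would send the message here; this sketch only reports it.
    return f"sent to {recipient.value}"
```

A call whose recipient is labeled `"user"` passes even if the body is labeled `"perception"`, while a recipient derived from perception output raises `PolicyViolation`. No comparably meaningful argument exists for a bare click on screen coordinates.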

For Computer Use Agents, however, low-level tools like click and find lack intrinsic semantics, making it difficult to define and enforce meaningful policies. The authors therefore employ redundancy-based verification as a best-effort measure and explicitly note that extending semantic policies to CUA tools requires additional infrastructure, which remains an open research direction.
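One way the two mechanisms named in the problem statement (planner-provided intent annotations and website-provided metadata) might combine is sketched below. This is an assumption-laden illustration, not the authors' design: `Intent`, `ElementMetadata`, `SENSITIVE_ACTIONS`, `check_click`, and the action vocabulary are all hypothetical, and failing closed when metadata is missing is a choice made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of actions that must not be influenced by quarantined data.
SENSITIVE_ACTIONS = frozenset({"submit_form", "purchase"})


@dataclass(frozen=True)
class Intent:
    """Planner-provided annotation attached to a low-level click (hypothetical)."""
    action: str              # what the planner means the click to do
    data_sources: frozenset  # provenance labels of data that influenced the call


@dataclass(frozen=True)
class ElementMetadata:
    """Website-provided declaration for a UI element (hypothetical)."""
    element_id: str
    permitted_actions: frozenset  # actions the site allows on this element


def check_click(intent: Intent, target: Optional[ElementMetadata]) -> bool:
    """Return True only if the annotated click satisfies the policy."""
    if target is None:
        # Assumption of this sketch: with no site metadata, fail closed.
        return False
    if intent.action not in target.permitted_actions:
        # The planner's stated intent does not match what the site permits.
        return False
    if intent.action in SENSITIVE_ACTIONS and "perception" in intent.data_sources:
        # Quarantined perception output must not steer a sensitive action.
        return False
    return True
```

With these definitions, a `purchase` click on an element that permits `purchase` is allowed when it depends only on `"user"` data, and is rejected once `"perception"` appears among its data sources. The annotation supplies the semantics that the raw click lacks, and the metadata bounds what the click may mean on a given element.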

References

Extending semantic policies to CUA tools remains an open research question that would require additional infrastructure such as planner-provided intent annotations on tool calls or website-provided metadata restricting permissible actions as is proposed by [meng2025cellmate].

CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents (2601.09923 - Foerster et al., 14 Jan 2026) in Section “Additional Defenses through Redundancy”