Agent-Centric OS Architectures (ACOS)
- ACOS is a novel operating system paradigm that leverages agent-driven modules, such as LLMs and vision-language models, for integrated reasoning and decision-making.
- It replaces traditional kernels with dynamic agents managing memory, scheduling, and resources through modular APIs and context-aware workflows.
- Empirical implementations like ColorAgent and SchedCP demonstrate significant performance gains and enhanced task success rates using ACOS metrics.
Agent-Centric OS Architectures (ACOS) redefine the traditional operating system (OS) paradigm by positioning agents, typically LLMs or vision-LLMs augmented with domain knowledge, as the principal actors managing reasoning, decision-making, and execution at the core of the OS. Rather than focusing on passive, statically defined system calls and human-centric interaction paradigms, ACOS frameworks centralize control, memory, planning, and resource allocation within a modular, agent-driven ecosystem. This approach underpins the development of intelligent, context-aware, and proactive operating systems, as shown in architectures ranging from the AIOS model (“LLM as OS”), through multi-agent personalized OS agents on mobile platforms, to agent-driven workload scheduling in kernel-space environments (Ge et al., 2023; Li et al., 22 Oct 2025; Zheng et al., 1 Sep 2025).
1. Formal Framework and Architectural Principles
ACOS is characterized by the reimagining of canonical OS components as agent-mediated modules. The LLM or multimodal foundation model, functioning as the "kernel," replaces traditional monolithic kernel abstractions. Key correspondences include:
- Kernel → LLM core services, exposed via natural language and tool-invocation primitives;
- Memory management → Context window allocation and dynamic history tracking;
- File system → Long-term vector storage, embedding retrieval, and retrieval-augmented generation;
- Devices/peripherals → Modular APIs, tool-drivers, sensor/actuator integrations;
- Middleware/libraries → Specialized toolchains and API services linkable at runtime;
- User commands → Natural-language programmatic interfaces or structured prompt templates.
Within this framework, an action performed by the agent-kernel is formalized as a selection among candidate actions, conditioned on the current history and the current prompt or sub-task. Memory management is achieved through relevance-driven context window selection (Ge et al., 2023).
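The two ideas above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from Ge et al.: the bag-of-words `relevance` function stands in for embedding similarity, whitespace word counts stand in for token counts, and `select_context` and `choose_action` are hypothetical names.

```python
import math
from collections import Counter

def relevance(query: str, item: str) -> float:
    """Cosine similarity over word counts (a stand-in for embedding similarity)."""
    q, d = Counter(query.lower().split()), Counter(item.lower().split())
    dot = sum(q[w] * d[w] for w in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0

def select_context(history: list[str], prompt: str, token_budget: int) -> list[str]:
    """Greedily keep the most relevant history items that fit in the budget.

    Selected items are returned in their original (chronological) order.
    """
    ranked = sorted(range(len(history)),
                    key=lambda i: relevance(prompt, history[i]),
                    reverse=True)
    chosen, used = [], 0
    for i in ranked:
        cost = len(history[i].split())  # crude token estimate (assumption)
        if used + cost <= token_budget:
            chosen.append(i)
            used += cost
    return [history[i] for i in sorted(chosen)]

def choose_action(context, prompt, candidates, score):
    """Pick the candidate action with the highest score given context and prompt.

    `score` is a caller-supplied function; in a real system it would be
    derived from the model, which is outside the scope of this sketch.
    """
    return max(candidates, key=lambda a: score(context, prompt, a))
```

A greedy fill is only one possible policy; it keeps the sketch short and makes the budget constraint explicit.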
ACOS handles agent instantiation, resource scheduling, inter-agent communication, and memory management through analogues of classic process scheduling (for example, round-robin or priority-based schedulers), defined in terms of token/window allocation and agent orchestration (Ge et al., 2023).
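As a rough illustration of the scheduling analogy, the sketch below applies round-robin to token budgets instead of CPU time slices. The `AgentTask` type, the `quantum` parameter, and the idea of granting a fixed number of tokens per turn are assumptions chosen for the example, not a design described in the source.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class AgentTask:
    name: str
    tokens_needed: int  # remaining tokens this agent wants to generate

def round_robin(tasks, quantum):
    """Grant each agent up to `quantum` tokens per turn until all are done.

    Returns the schedule as a list of (agent name, tokens granted) slices.
    """
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    # Copy the tasks so the caller's objects are not mutated.
    queue = deque(AgentTask(t.name, t.tokens_needed) for t in tasks)
    timeline = []
    while queue:
        task = queue.popleft()
        granted = min(quantum, task.tokens_needed)
        if granted > 0:
            timeline.append((task.name, granted))
        task.tokens_needed -= granted
        if task.tokens_needed > 0:
            queue.append(task)  # not finished: go to the back of the queue
    return timeline
```

A priority-based variant would replace the deque with a heap keyed on agent priority; the token-slice accounting stays the same.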
2. Systems Implementations and End-to-End Workflow
ColorAgent exemplifies mobile-centric ACOS by integrating four interacting "planes": (1) Execution Module (a vision-language LLM), (2) Task Orchestration & Memory Management, (3) Knowledge Retrieval, and (4) Hierarchical Reflection & Error Recovery. The workflow proceeds as follows:
- Task Orchestration: Incoming user instructions are classified as simple or composite; complex tasks are decomposed into subtasks, which are processed with context propagation.
- Knowledge Enrichment: Each task/subtask is augmented with relevant external knowledge via a retriever module.
- Execution via GUI-LLM: The Execution Module processes the enriched prompt, emitting structured actions (e.g., JSON sequences) that are then executed on the OS.
- Reflection and Correction: Post-action, screenshots and action summaries are analyzed by hierarchical reflectors (per action, short trajectory, global session) for errors, with feedback loops for recovery if discrepancies are observed (Li et al., 22 Oct 2025).
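The four workflow steps above can be sketched as a single control loop. This is a schematic under stated assumptions, not ColorAgent's implementation: `run_task` and its four callables (`decompose`, `retrieve`, `execute`, `reflect`) are hypothetical names, the model is assumed to return a JSON string, and only a single per-action reflection level is shown.

```python
import json

def run_task(instruction, decompose, retrieve, execute, reflect, max_retries=2):
    """Orchestrate -> enrich -> execute -> reflect, with bounded retries.

    decompose(instruction) -> list of subtasks
    retrieve(subtask)      -> knowledge string
    execute(prompt)        -> JSON string describing a structured action
    reflect(subtask, act)  -> (ok, feedback)
    """
    results = []
    context = ""  # propagated across subtasks
    for subtask in decompose(instruction):
        knowledge = retrieve(subtask)
        prompt = f"{context}\nTask: {subtask}\nKnowledge: {knowledge}".strip()
        for _ in range(max_retries + 1):
            action = json.loads(execute(prompt))
            ok, feedback = reflect(subtask, action)
            if ok:
                break
            # Feed the reflector's diagnosis back into the next attempt.
            prompt += f"\nFeedback: {feedback}"
        results.append((subtask, action, ok))
        context += f"\nDone: {subtask}" if ok else f"\nFailed: {subtask}"
    return results
```

Passing the components in as callables keeps the loop testable with stubs and mirrors the modular separation the text describes.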
For system-level OS interaction, SchedCP demonstrates an ACOS realization focused on Linux scheduler policy optimization. The architecture is bifurcated into:
- Control Plane (SchedCP + LLM Agent): Handles semantic reasoning, workload analysis, policy generation, and adaptation.
- Data Plane (Linux kernel with sched_ext/eBPF): Executes agent-synthesized policies, performs resource accounting, and provides performance feedback.
- Core Services: Comprise a Workload Analysis Engine (performance data aggregation), a Scheduler Policy Repository (eBPF code storage and retrieval), and an Execution Verifier (safety and semantic correctness checks before policy deployment) (Zheng et al., 1 Sep 2025).
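The control-plane/data-plane split can be pictured as a gate: a policy reaches the data plane only if every verifier check passes. The sketch below is an assumption-laden toy, not SchedCP's API. Policies are plain strings rather than eBPF programs, and `PolicyRepository`, `verify`, `control_step`, and the `deploy` callback are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class PolicyRepository:
    """Toy stand-in for a scheduler policy store, keyed by workload class."""
    policies: dict = field(default_factory=dict)

    def store(self, name, source):
        self.policies[name] = source

    def lookup(self, name):
        return self.policies.get(name)

def verify(source, checks):
    """Run every (predicate, message) check; return messages of failed checks."""
    return [msg for check, msg in checks if not check(source)]

def control_step(workload_profile, propose, repo, checks, deploy):
    """Reuse a stored policy or propose a new one; deploy only if verified.

    Returns (deployed, failures).
    """
    name = workload_profile["class"]
    source = repo.lookup(name) or propose(workload_profile)
    failures = verify(source, checks)
    if failures:
        return False, failures  # nothing unverified reaches the data plane
    repo.store(name, source)
    deploy(source)
    return True, []
```

Keeping `deploy` behind the verification result is the point of the design: the agent can propose freely, while the execution path stays conservative.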
3. Agent Coordination, Learning Mechanisms, and Planning
Collaboration and learning mechanisms underpin ACOS robustness:
- Multi-agent Orchestration: Agents decompose, sequence, and coordinate subtasks, dynamically revise plans with retrieved knowledge, and embed failure-recovery via reflection (see AgentCoordinator pseudocode in ColorAgent) (Li et al., 22 Oct 2025).
- Step-wise RL and Self-Evolving Training: ColorAgent employs a two-stage methodology: (i) step-wise reinforcement learning (RL) with group-relative policy optimization (GRPO), which treats each step of a GUI interaction trajectory as a single-step RL sample with a reward combining action accuracy and format compliance; and (ii) a self-evolving data generation and retraining loop, in which a seed query pool is expanded, rollouts are filtered by discriminators, and high-quality data drives iterative fine-tuning (Li et al., 22 Oct 2025). The loss per iteration combines supervised and RL objectives.
- Planning and Verification in SchedCP: Agent workflows modularize into observation, planning/policy retrieval, execution, verification, and learning agents. All proposed eBPF scheduler code is validated through multi-stage static and dynamic analysis prior to deployment, ensuring both correctness and operational safety (Zheng et al., 1 Sep 2025).
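For the step-wise RL stage, the core of GRPO is that each sampled response is scored relative to the other samples drawn for the same prompt. The sketch below shows that normalization together with a reward that combines action accuracy and format compliance. The reward weights, the required keys `type` and `target`, and the use of the population standard deviation are assumptions for illustration; the source does not specify them.

```python
import statistics

def step_reward(pred, gold, w_format=0.1):
    """Reward = action accuracy + weighted format compliance (weights assumed)."""
    format_ok = isinstance(pred, dict) and {"type", "target"} <= set(pred)
    accurate = format_ok and pred == gold
    return float(accurate) + w_format * float(format_ok)

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; implementations vary
    return [(r - mean) / (std + eps) for r in rewards]
```

When every sample in a group earns the same reward, all advantages are zero, so that group contributes no gradient signal.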
4. Personalization, Proactive Interaction, and Human-Agent Synergy
ACOS systems increasingly incorporate mechanisms for user-specific adaptation and proactive engagement:
- Personalized Intent Recognition: Explicit per-user knowledge bases (query→SOP mappings) and implicit user profiles (embedded from interaction trajectories) are used to rewrite and tailor incoming queries before execution (Li et al., 22 Oct 2025).
- Proactive Engagement: "AskAgent" functionality is realized through meta-knowledge decoupling, training the system to request user clarification only when necessary, thus optimizing intent alignment and success rates (Li et al., 22 Oct 2025).
- ColorAgent reports an Intent Alignment Rate (IAR) increase to 58.7% (MobileIAR) and a step-wise Success Rate of 68.98% on VeriOS-Bench with such strategies.
The AIOS model frames the user as an additional privileged agent, capable of pre-empting or supplementing agent context in mixed-initiative or conversational modes (Ge et al., 2023).
5. Evaluation Metrics and Benchmark Results
Empirical validation of ACOS frameworks leverages both standard and agent-specific metrics:
- Task Success Rate (SR): ColorAgent shows measurable gains on AndroidWorld (AWorld) and AndroidLab (ALab) benchmarks, with incremental improvements from step-wise RL (63.0%/45.2%), self-evolving training (65.1%/48.6%), reflection (70.3%/49.5%), orchestration (72.8%/50.1%), to full knowledge retrieval (77.2%/50.7%) (Li et al., 22 Oct 2025).
- Scheduler Optimization: SchedCP achieves up to 1.79× speedup in kernel compilation time, 2.11× reduction in p99 scheduling latency, and 20% average end-to-end time reduction in AI-generated batch workloads. The cost per workload demonstrates a 13× reduction (from $6 to $0.5), with 100% success in generating valid custom schedulers. In contrast, naïve agentic methods fell below 33% success and often produced performance regressions (Zheng et al., 1 Sep 2025).
- Other Metrics: ACOS evaluation methods include tool-use accuracy, Jain fairness index for context allocation, retrieval recall@k, and adversarial robustness measures (Ge et al., 2023).
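Two of the metrics named above have standard closed forms, shown here as a short sketch. Jain's fairness index over allocations x_1..x_n is (sum of x_i)^2 / (n * sum of x_i^2), which equals 1 for a perfectly even split and 1/n when one agent receives everything. Recall@k is the fraction of relevant items that appear in the top k retrieved results. Applying Jain's index to per-agent context-window allocations is the illustrative assumption here; the function names are hypothetical.

```python
def jain_index(allocations):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), in the range [1/n, 1]."""
    total = sum(allocations)
    squares = sum(x * x for x in allocations)
    if squares == 0:
        raise ValueError("at least one allocation must be non-zero")
    return total * total / (len(allocations) * squares)

def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant items found among the first k retrieved items."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("relevant set must be non-empty")
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)
```

For example, three agents each given 100 tokens of context yield an index of 1.0, while giving all 300 tokens to one agent yields 1/3.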
Current benchmarks have known limitations, including narrow task coverage, a lack of robust user-alignment and recovery metrics, and insufficient simulation of dynamic UI anti-patterns (Li et al., 22 Oct 2025).
6. Design Principles and Research Directions
Several architectural and methodological principles have been distilled from ACOS research:
- Modularity and Decoupling: Separating reasoning (LLM/core agent) from system-level execution via abstracted APIs and verifiers enhances robustness and security. The modular "plane" design is consistently observed in both ColorAgent and SchedCP implementations (Li et al., 22 Oct 2025; Zheng et al., 1 Sep 2025).
- Interleaved Training and Lifelong Learning: Effective ACOS frameworks blend supervised representation learning, on-policy RL, and agent-driven experience generation to bridge gaps between static benchmarks and real-world, dynamic environments (Li et al., 22 Oct 2025).
- Resource and Memory Management: Analogous to DRAM swapping, methods for context window virtualization, memory sharing across agents, and disaggregated memory for distributed ACOS nodes are prominent research fronts (Ge et al., 2023).
- Communication Protocols and DSLs: Domain-specific languages for tool invocation and inter-agent plans allow for structured agent communication and enable static safety analysis.
- Security, Alignment, and Formal Verification: Multi-stage verification regimes—including symbolic sandboxes, static program analyses, adversarial simulation, and DSL plan validation—are critical for operational trustworthiness.
- Generalizability: The ACOS agent-control paradigm is applicable to systems beyond schedulers, including cache policy, DVFS governors, network queuing, and co-optimization of multi-resource environments (Zheng et al., 1 Sep 2025).
- Research Roadmap: Key developmental stages trace the transition from batch LLM inference to interactive CoT, from single-agent to multi-agent collaboration, the incorporation of DSL-driven control, and robust, verified agent execution (Ge et al., 2023).
7. Taxonomy of Agent Applications and Ecosystems
The ACOS approach enables a broad taxonomy of agent-driven applications, including:
- Single-Agent Systems: Autonomous agents interacting with external APIs or physical devices for code generation, web navigation, robotics, or task automation. Control architectures often deploy CoT or more advanced graph-based planning.
- Multi-Agent Systems (MAS): Specialized agents coordinate via collaborative or adversarial protocols (e.g., MetaGPT, ChatDev, Multi-Agent Debate), utilizing global blackboard architectures or debate-style feedback loops.
- Human-Agent Interaction Modes: These range from fully automated operation, through mixed-initiative workflows (user and agent interleave control), to conversational, dialog-driven interfaces. The ability to dynamically allocate control and context enables flexible adaptation to diverse OS environments (Ge et al., 2023).
The AIOS-Agent ecosystem model provides an end-to-end realization: the LLM kernel instantiates, schedules, and manages agents as primary OS abstractions, offering a distinct shift from the traditional OS–APP model.
References
- "LLM as OS, Agents as Apps: Envisioning AIOS, Agents and the AIOS-Agent Ecosystem" (Ge et al., 2023)
- "ColorAgent: Building A Robust, Personalized, and Interactive OS Agent" (Li et al., 22 Oct 2025)
- "Towards Agentic OS: An LLM Agent Framework for Linux Schedulers" (Zheng et al., 1 Sep 2025)