KUBEDIRECT: Efficient FaaS Control for Kubernetes
- KUBEDIRECT is a lightweight retrofit that redefines the narrow waist of the Kubernetes control plane for FaaS, directly connecting controllers to reduce API Server overhead.
- It employs a five-stage direct message-passing mechanism between Autoscaler, Deployment, ReplicaSet, Scheduler, and Kubelet to achieve sub-millisecond provisioning.
- The system uses a hierarchical write-back cache with a two-round handshake to ensure cache coherence, rapid state recovery, and improved overall latency.
KUBEDIRECT is a lightweight retrofit of the Kubernetes control-plane designed to overcome the API Server bottleneck in Function-as-a-Service (FaaS) deployments on Kubernetes clusters. It achieves this by reorganizing the critical path of state transitions—termed the "narrow waist"—into a sequence of direct message-passing channels between controllers, thereby eliminating serialization and persistence overheads associated with the traditional API Server-based design. KUBEDIRECT preserves compatibility with the broader Kubernetes ecosystem, requiring only minor patches per controller, and demonstrates substantial improvements in end-to-end function serving latency, closely matching the performance of purpose-built platforms while maintaining standard API and extension compatibility (Qi et al., 27 Jan 2026).
1. Motivation: API Server Bottleneck in FaaS Workloads
Kubernetes-based FaaS platforms such as Knative, OpenFaaS, and Fission deploy a chain of core controllers—Autoscaler, Deployment, ReplicaSet, Scheduler, and Kubelet—to orchestrate Pod lifecycle events. Each controller stage traditionally communicates with the others by writing resource state objects (often ~17 KB each) to the API Server (backed by etcd) and awaiting notifications. Although internal controller logic executes in milliseconds, the need to exchange vast numbers of large serialized state objects incurs persistence, serialization, and rate-limiting overheads. These bottlenecks are exacerbated under bursty FaaS conditions: Azure Functions traces report workloads reaching 50,000 cold starts per minute, where each cold start stalls incoming user requests.
KUBEDIRECT’s analysis identifies two key properties at the narrow waist of this pipeline:
- Persistence is unnecessary, as partially provisioned Pods are fungible and failures can be recovered from without strict consistency.
- Conflict resolution via consensus is redundant, since each object has exactly one writer and one reader, arranged in direct sequence.
This insight enables bypassing the API Server exclusively for the narrow waist, while retaining persistent and externally visible state for other API objects (Qi et al., 27 Jan 2026).
2. System Architecture: Direct Pairwise Controller Channels
KUBEDIRECT’s architecture decomposes the narrow-waist control path into five sequential stages:
- Autoscaler updates Deployment.replicas.
- Deployment controller adjusts ReplicaSet.replicas.
- ReplicaSet creates new Pod specifications.
- Scheduler assigns Pod.nodeName.
- Kubelet instantiates containers and marks Pods as Ready.
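The five-stage relay above can be sketched as a chain of channel-connected stages. This is an illustrative sketch only: the `event` fields and stage bodies are simplified stand-ins, and a real controller would run its reconcile logic (e.g., setting `Pod.nodeName`) before forwarding.

```go
package main

import "fmt"

// event is a simplified stand-in for a narrow-waist state transition.
type event struct{ pod, field, value string }

// stage forwards every event from in to out; a real controller would
// reconcile before forwarding, then close out when its input drains.
func stage(name string, in <-chan event, out chan<- event) {
	for e := range in {
		out <- e
	}
	close(out)
}

// runPipeline wires Autoscaler→Deployment→ReplicaSet→Scheduler→Kubelet
// as a chain of direct channels and pushes one event through it.
func runPipeline(e event) event {
	names := []string{"autoscaler", "deployment", "replicaset", "scheduler", "kubelet"}
	first := make(chan event, 1)
	in := first
	for _, n := range names {
		out := make(chan event, 1)
		go stage(n, in, out)
		in = out
	}
	first <- e
	close(first)
	return <-in
}

func main() {
	out := runPipeline(event{pod: "podX", field: "status", value: "Ready"})
	fmt.Println(out.pod, out.field, out.value) // podX status Ready
}
```

The point of the sketch is structural: each hop is a direct, typed channel between adjacent controllers, with no shared store on the path.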
Between the first four stages, the platform replaces the API Server-mediated write-notify pattern with dedicated, bidirectional TCP channels. Communication is reduced to transmitting minimal, typed messages containing only the dynamic delta (e.g., { key: "podX.status", value: Ready }) alongside references to cached static fields, instead of full API objects.
Each controller implements a "dynamic materialization" module at ingress, reconstructing the requisite API object in situ by merging received deltas with local state. At egress, outgoing updates are intercepted and encoded as deltas before being relayed. This approach eliminates centralized truth in etcd for this control path segment and supports immediate local consistency at each controller.
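A minimal sketch of delta encoding plus dynamic materialization, under assumed names: `Delta` and `materialize` are illustrative, not the paper's wire format, and the cache is modeled as nested string maps for brevity.

```go
package main

import "fmt"

// Delta carries only the dynamic field that changed, plus a reference
// to the cached static object it applies to (field names hypothetical).
type Delta struct {
	ObjectRef string // e.g. "podX"
	Key       string // e.g. "status"
	Value     string // e.g. "Ready"
}

// materialize merges a received delta into the locally cached copy of
// the object, reconstructing the full API object in situ.
func materialize(cache map[string]map[string]string, d Delta) map[string]string {
	obj, ok := cache[d.ObjectRef]
	if !ok {
		obj = map[string]string{}
		cache[d.ObjectRef] = obj
	}
	obj[d.Key] = d.Value
	return obj
}

func main() {
	cache := map[string]map[string]string{
		"podX": {"spec.image": "fn:latest"}, // static fields cached earlier
	}
	obj := materialize(cache, Delta{ObjectRef: "podX", Key: "status", Value: "Ready"})
	fmt.Println(obj["status"], obj["spec.image"]) // Ready fn:latest
}
```

The contrast with the stock design is that only the few bytes of `Delta` cross the wire, while the ~17 KB of static object state stays cached at the receiver.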
3. Consistency via Hierarchical Write-Back Cache
With the API Server bypassed at the narrow waist, controller state becomes ephemeral and distributed, raising challenges for correctness and eventual consistency. KUBEDIRECT implements the narrow waist as a hierarchical write-back cache with the following properties:
- Forward Path (Fast-forward): Upstream controllers opportunistically send state deltas downstream.
- Soft Invalidation: Downstream controllers notify upstream of local state changes, such as Pod cancellation or eviction.
- Hard Invalidation (Two-Round Handshake): Upon controller restart or reconnection, the downstream controller acts as source of truth, streaming its entire local state upstream. Upstream nodes either recover or reset their caches accordingly, ensuring initial state consistency.
The two-round handshake is implemented as follows:
```
upon Connect() to downstream:
    send HELLO
    receive state_downstream
    if local_cache is empty:            // recover mode
        local_cache ← state_downstream
    else:                               // reset mode
        for obj ∈ local_cache:
            if obj ∈ state_downstream:
                overwrite and mark dirty
            else:
                mark invalid (hide from control loop)
    send ACK to downstream
```
- Safety (Cache Coherence): Any predicate a downstream controller observes over its local state (e.g., Pod X assigned to node Y) eventually holds at all upstream controllers.
- Liveness: Provided the narrow waist becomes fully connected infinitely often, each time for long enough to exchange at least one round of state messages, the global desired and actual Pod counts are guaranteed to converge (Qi et al., 27 Jan 2026).
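The upstream side of the hard-invalidation step above can be sketched as a cache reconciliation pass. The `Entry` type and `reconcile` function are hypothetical names for illustration; they follow the recover/reset branches of the handshake pseudocode.

```go
package main

import "fmt"

// Entry is a cached object as seen by an upstream controller.
type Entry struct {
	Value string
	Dirty bool // needs re-propagation downstream
	Valid bool // visible to the control loop
}

// reconcile applies the downstream's streamed state to the local cache.
// An empty cache recovers wholesale; otherwise entries present downstream
// are overwritten and marked dirty, and entries absent downstream are
// invalidated (hidden from the control loop).
func reconcile(local map[string]*Entry, downstream map[string]string) {
	if len(local) == 0 { // recover mode
		for k, v := range downstream {
			local[k] = &Entry{Value: v, Valid: true}
		}
		return
	}
	for k, e := range local { // reset mode
		if v, ok := downstream[k]; ok {
			e.Value, e.Dirty, e.Valid = v, true, true
		} else {
			e.Valid = false
		}
	}
}

func main() {
	local := map[string]*Entry{"podA": {Value: "Pending", Valid: true}}
	reconcile(local, map[string]string{"podA": "Running"})
	fmt.Println(local["podA"].Value, local["podA"].Dirty) // Running true
}
```

Treating the downstream as the source of truth here is what makes reconnection cheap: no etcd read is needed to rebuild upstream state.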
4. Kubernetes Integration and Ecosystem Compatibility
KUBEDIRECT is implemented as both a library and an admission webhook, requiring 150 lines of Go code per narrow-waist controller—Autoscaler, Deployment, ReplicaSet, Scheduler, and Kubelet. Each is augmented with ingress/egress modules for direct message handling. Admission control restricts core .replicas field updates to the KUBEDIRECT path, safeguarding against out-of-band mutations.
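The admission restriction can be reduced to a single predicate, sketched here outside the real webhook machinery. The request shape and the `FromKubedirect` marker are assumptions for illustration, not the actual `AdmissionReview` plumbing.

```go
package main

import "fmt"

// AdmissionRequest is a simplified stand-in for a mutation request
// (field names hypothetical; a real webhook decodes an AdmissionReview).
type AdmissionRequest struct {
	FieldPath      string // dotted path of the mutated field
	FromKubedirect bool   // marker set by KUBEDIRECT's egress module
}

// admit rejects out-of-band writes to core .replicas fields, so that
// only the direct message-passing path may scale workloads.
func admit(req AdmissionRequest) error {
	if req.FieldPath == "spec.replicas" && !req.FromKubedirect {
		return fmt.Errorf("denied: %s may only be set via KUBEDIRECT", req.FieldPath)
	}
	return nil
}

func main() {
	fmt.Println(admit(AdmissionRequest{FieldPath: "spec.replicas"}))
	fmt.Println(admit(AdmissionRequest{FieldPath: "spec.replicas", FromKubedirect: true}))
}
```

All other field paths pass through unchanged, which is how the webhook avoids interfering with unrelated clients of the API Server.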
All standard Kubernetes APIs (Pods, ReplicaSets, Deployments, Services, Endpoints) remain unmodified. Integrations with external extensions such as Istio, Prometheus, and CNI plugins continue to operate without alteration. High availability is achieved by running the two-round handshake upon controller leader election, supporting failover within the existing controller framework.
5. Performance Evaluation and Results
Experiments conducted on an 80-node CloudLab cluster (10 cores, 64 GB RAM, 25 Gbps networking per node) compare KUBEDIRECT against two baselines: stock Kubernetes (v1.32) with Knative v1.15, and the clean-slate FaaS platform Dirigent.
Key performance metrics and findings include:
- Microbenchmarks: Scaling 100–800 Pods in a single Deployment (over 80 nodes), KUBEDIRECT’s control plane achieves 3.7–16.9× speed-up versus K8s. With Dirigent’s sandbox (Kd+), sub-second latency is achieved, matching Dirigent’s raw performance.
- Function Diversity Scaling: For 100–800 distinct functions (one Pod each) on 80 nodes, speed-ups range from 7.4–32.8× over baseline Kubernetes.
- Large-Scale Node Expansion: Scaling to 20,000 Pods across 500–4,000 nodes requires 30 seconds end-to-end; scheduling step latency remains a limiting factor as it inspects all nodes.
- End-to-End FaaS Workload: Processing a 30-minute Azure Functions trace (500 functions, 168,000 invocations), KUBEDIRECT integrated into Knative reduces median user request slowdown by 3.5× and p99 by 19.4×; median scheduling latency drops by 26.7× (p99 10.3×).
- Cross-Platform Analysis: Substituting KUBEDIRECT for K8s+ (Dirigent's Kubelet) results in a 2.0× improvement in median slowdown (10.4× at p99) and a 6.6× speed-up in median scheduling latency (134× at p99), retaining parity with Dirigent’s performance.
Naive direct passing of full API objects (as opposed to delta-encoding) is shown to be 20–35% slower. The two-round handshake’s overhead on recovery/reconnection is sub-linear, proportional only to the number of batched messages.
| Comparison | Latency Reduction over Baseline | Notable Impact |
|---|---|---|
| KUBEDIRECT vs. Knative | Up to 26.7× | Median/p99 scheduling latency |
| KUBEDIRECT vs. Dirigent | Parity | Sub-second Pod provisioning |
| Direct full object passing | 20–35% slower | Inefficient vs. delta-encoding |
6. Limitations, Implications, and Future Directions
KUBEDIRECT demonstrates that it is feasible for legacy, state-centric cluster managers to regain millisecond-scale control-plane performance in serverless settings without sacrificing API compatibility or requiring fundamental rewrites of core components such as schedulers, network plugins, or service meshes. The total codebase modification is approximately 3.8k lines of Go, anchored by a stable narrow waist (Autoscaler→ReplicaSet→Scheduler→Kubelet) across Kubernetes releases, suggesting long-term maintainability (Qi et al., 27 Jan 2026).
Planned extensions and research directions include:
- Insertion of custom middle controllers (e.g., for node-specific sidecar injection).
- Application of the same strategy to other state-centric cluster managers (e.g., Omega, Twine).
- Horizontal partitioning of the narrow waist in extremely large clusters to improve scalability.
- Integration of lightweight observability hooks for enhanced introspection.
By recasting the FaaS control path as a hierarchical write-back cache and transmitting only dynamic deltas, KUBEDIRECT provides sub-millisecond provisioning at cloud scale while retaining full compatibility with the existing Kubernetes ecosystem and its extensions.