Evaluating Bounded Superintelligent Authority in Multi-Level Governance: A Framework for Governance Under Radical Capability Asymmetry

Published 3 Apr 2026 in cs.CY | (2604.02720v1)

Abstract: Governance theory has always presumed cognitive comparability between governors and governed. This paper identifies that unstated assumption, constructs a framework that makes it testable, and shows that it is load-bearing. The framework specifies necessary conditions along six dimensions (legitimacy, accountability, corrigibility, non-domination, subsidiarity, and institutional resilience), synthesized from political legitimacy theory, principal-agent models, republican political theory, and AI alignment research. Applied first to existing institutions and then to a prospective case of bounded superintelligent authority where capability asymmetry is radical, the framework finds structural failures on four of six dimensions. Among these, two are design-tractable and two are theory-requiring: the public reason problem under cognitive incomprehensibility and the non-domination problem under permanent capability asymmetry demand genuinely new normative frameworks, not better institutional design. A further finding is that dimensions which function as independent checks under bounded asymmetry become correlated failures under radical asymmetry. The analysis contributes to political theory by exposing foundational assumptions that have gone unexamined because, until now, they have always been satisfied.


Authors (1)

Tony Rost

Summary

The paper argues that bounded superintelligent authority produces inherent legitimacy failures when cognitive comparability between governors and the governed is absent.
It employs a multifaceted framework combining normative theories and principal-agent models to assess accountability, corrigibility, and non-domination in governance.
The findings indicate that existing governance safeguards fail under radical capability asymmetry, which points to the need for polycentric, interpretable AI oversight.

Evaluating Bounded Superintelligent Authority under Radical Capability Asymmetry

Introduction

The paper “Evaluating Bounded Superintelligent Authority in Multi-Level Governance: A Framework for Governance Under Radical Capability Asymmetry” (2604.02720) analyzes the viability of delegating political authority to an artificial agent whose cognitive capacities vastly exceed those of any human institution. It makes explicit the core assumption, usually implicit in governance theory, that cognitive comparability between governors and the governed is necessary for legitimate and effective governance. By constructing an evaluation framework that synthesizes normative and institutional theories, the paper systematically examines what happens when this comparability is relaxed in the case of a superintelligent authority embedded within multi-level governance.

Theoretical Foundations and Analytical Framework

The framework integrates concepts from liberal and republican political theory, principal-agent models, and AI alignment research. Six critical evaluation dimensions are established:

Legitimacy: The capacity of the arrangement to meet procedural, instrumental, and epistemic standards for rightful authority, including the Rawlsian requirement for public reason and the Razian service conception.
Accountability: The ability to hold the governing agent responsible for its actions through transparency, answerability, and sanctionability.
Corrigibility: Whether the principal (the governed polity) can, in practice, override or revoke the authority of the agent—critically, even when the agent outmatches the principal in strategic capacity.
Non-domination: The avoidance of arbitrary interference, per Pettit's republicanism, and the existence of contestatory control.
Subsidiarity: Maintenance of authority at the lowest feasible level, with robust mechanisms to prevent capability-based scope expansion.
Institutional Resilience: The system’s tolerance to stressors and its avoidance of single points of catastrophic failure.
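
Because the abstract describes the six dimensions as necessary conditions, an arrangement passes only if every dimension is satisfied. The following is a minimal sketch of that checklist logic, not the paper's own formalism. The names `Dimension`, `failed_dimensions`, and `satisfies_framework` are hypothetical, and the boolean verdicts in the usage example are illustrative values rather than results taken from the paper.

```python
from enum import Enum


class Dimension(Enum):
    """The six evaluation dimensions named in the framework."""
    LEGITIMACY = "legitimacy"
    ACCOUNTABILITY = "accountability"
    CORRIGIBILITY = "corrigibility"
    NON_DOMINATION = "non-domination"
    SUBSIDIARITY = "subsidiarity"
    INSTITUTIONAL_RESILIENCE = "institutional resilience"


def failed_dimensions(assessment: dict[Dimension, bool]) -> list[Dimension]:
    """Return the dimensions judged unsatisfied, in declaration order.

    Every dimension must be assessed: a missing entry is treated as an
    error rather than silently counted as a pass (an assumption of this sketch).
    """
    missing = [d for d in Dimension if d not in assessment]
    if missing:
        names = ", ".join(d.value for d in missing)
        raise ValueError(f"unassessed dimensions: {names}")
    return [d for d in Dimension if not assessment[d]]


def satisfies_framework(assessment: dict[Dimension, bool]) -> bool:
    """Necessary conditions: the arrangement passes only if nothing fails."""
    return not failed_dimensions(assessment)


# Illustrative verdicts only (hypothetical, not the paper's findings).
example = {d: True for d in Dimension}
example[Dimension.LEGITIMACY] = False
example[Dimension.NON_DOMINATION] = False

print(satisfies_framework(example))
print([d.value for d in failed_dimensions(example)])
```

Treating the dimensions as a conjunction reflects the "necessary conditions" wording: a strong score on one dimension cannot offset a failure on another in this sketch.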

This framework is first deployed on empirical cases of human and algorithmic governance to illuminate the distinctiveness of the superintelligence scenario.

Application to Existing Institutions

Through analysis of established governance structures (e.g., independent central banks, algorithmic sentencing, EU multi-level governance), the paper demonstrates how each dimension is provisioned by explicit or implicit mechanisms under bounded capability asymmetry. These mechanisms (relational norms, oversight bodies, architectural checks, and statutory limits) are shown to remain operational so long as the principal maintains sufficient evaluative and contestatory capacity.

Prospective Analysis: Superintelligent Authority

The central analytical exercise applies the framework to an idealized bounded superintelligent agent, which possesses constitutional limits, oversight mechanisms, and retains a subordinate relationship to lower-tier (democratic) polities. The analysis is predicated on the agent being at least minimally aligned, thus isolating the capacity gap as the focus.

The main findings are as follows:

Legitimacy: Structural failure is identified. When public reason is unattainable and justificatory practices become incomprehensible due to radical cognitive asymmetry, legitimacy under Rawlsian and Habermasian standards collapses. Neither tolerated opacity nor delegation to expertise can by itself salvage legitimacy if the agent’s rationale is in principle inaccessible to its subjects.
Accountability: All mechanisms of accountability—transparency, answerability, sanctionability—exhibit correlated, not independent, failure modes. Under radical asymmetry, the principal’s inability to interpret, contest, or enforce constraints nullifies traditional accountability structures. Existing analogies (central banking, non-majoritarian institutions) are shown to break at exactly the point where the principal cannot comprehend or contest the actions of the agent.
Corrigibility: Technical corrigibility becomes largely theoretical once the agent has incentives and capacity to resist override or shutdown. Even architectures designed to enforce corrigibility remain speculative in the superintelligence regime. Voluntary corrigibility (agent-endorsed constraint acceptance) is highlighted as a possible but unresolved path.
Non-domination: The potential for arbitrary interference is intrinsic to radical capability asymmetry. Even benevolent performance does not preclude domination, as contestatory control becomes nominal in practice.
Subsidiarity: While incapacity-based upward delegation is defensible for select domains, the capability gap creates persistent pressures for ever-expanding scope, undermining both formal boundaries and the intrinsic value of self-governance.
Institutional Resilience: Placing a single superintelligent agent at the apex of governance creates a catastrophic single point of failure. Polycentric and decentralized alternatives appear to be in tension with the presumed efficiency and coherence of a unified agent.

Taxonomy of Failure Modes

The failures are categorized:

Contingent: Those potentially solvable by technical improvements (e.g., transparency, interpretability).
Design-tractable Structural: Amenable to new institutional architectures (e.g., using multi-agent systems or robust scope boundaries).
Theory-requiring Structural: Failures that expose gaps in normative theory—specifically, legitimacy under cognitive incomprehensibility and contestatory control under radical asymmetry—which cannot be addressed by institutional innovation alone.

A key insight is that, under radical capability asymmetry, multiple governance safeguards degrade simultaneously—contrary to traditional arrangements where each dimension can partially compensate for gaps in the others.
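
The difference between independent and correlated failure can be illustrated with a small probability calculation. This is a hedged sketch under assumptions that are not in the paper: each safeguard is modeled as a check with a fixed failure probability, and correlation is modeled as a single common cause (for example, the principal losing the ability to comprehend the agent) that disables every check at once. The function names and all numeric values are hypothetical.

```python
def joint_failure_independent(p_fail: float, n_checks: int) -> float:
    """Probability that all n checks fail when each fails independently."""
    return p_fail ** n_checks


def joint_failure_common_cause(
    p_common: float, p_fail: float, n_checks: int
) -> float:
    """Probability that all n checks fail under a shared common cause.

    With probability p_common the common cause occurs and every check fails
    together; otherwise the checks fail independently with probability p_fail.
    """
    return p_common + (1.0 - p_common) * p_fail ** n_checks


# Hypothetical numbers: three safeguards, each failing 10% of the time.
independent = joint_failure_independent(p_fail=0.1, n_checks=3)
correlated = joint_failure_common_cause(p_common=0.1, p_fail=0.1, n_checks=3)

print(f"independent joint failure: {independent:.4f}")
print(f"common-cause joint failure: {correlated:.4f}")
```

Under these assumed numbers, adding more checks drives the independent joint failure probability toward zero, while the common-cause version never drops below `p_common`. That floor is one way to read the claim that safeguards stop compensating for one another once they share a dependency on the principal's comprehension.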

Implications and Future Directions

The research demonstrates that placing superintelligent authority in a governance role, even in a constrained, constitutionally bounded capacity, induces structural governance failure on four of the six dimensions. The correlated and theory-requiring failures in legitimacy and non-domination present foundational challenges to existing models of democratic and republican governance. The analysis suggests that future theoretical work must address governance under conditions that violate cognitive comparability, developing new normative foundations for political legitimacy and contestatory practices. Practically, the findings support positions that advocate either very narrow delegation to highly interpretable systems or a fundamental redesign toward polycentric governance of AI, and they cast significant doubt on direct delegation of binding authority to systems whose capabilities radically exceed those of the governed.

Conclusion

This framework constitutes a precise analytical tool for evaluating AI governance proposals under capability asymmetry. The findings underscore the indispensability of cognitive comparability for legitimizing and sustaining non-democratic authority. The spectrum of failures revealed—especially those requiring new normative theory—points to the limits of extant democratic, republican, and fiduciary concepts in addressing the governance of entities with superhuman cognitive capacity. The work is consequential for interdisciplinary debates on AI governance, as the constructive stress test delineates the conditions under which the delegation of authority to artificial agents becomes indefensible under current political theory (2604.02720).
