Application Readiness Levels (ARLs)

Updated 15 January 2026
  • Application Readiness Levels (ARLs) are structured frameworks that assess technology maturity by defining progressive, domain-specific milestones.
  • They integrate multidimensional criteria—such as operational performance, regulatory compliance, and risk metrics—to ensure readiness from lab testing to full deployment.
  • ARLs enable transparent, evidence-based gap analysis and decision support in fields like automated driving, AI systems, cybersecurity, and quantum computing.

Application Readiness Levels (ARLs) are structured maturity frameworks designed to allow systematic, evidence-based assessment and comparison of complex technology applications as they progress from conceptualization through deployment and operation. Originating as an extension or specialization of NASA's Technology Readiness Levels (TRLs), ARLs have been tailored to address the unique operational, validation, and regulatory demands of fields such as automated driving, AI-based systems, software/cybersecurity, and quantum computing. ARLs serve as a common language, mapping nuanced domain-specific criteria onto a standardized progression of developmental stages to enable transparent, auditable, and actionable maturity evaluations across research, development, and deployment pipelines.

1. Conceptual Foundation and Motivations

Application Readiness Levels were introduced to resolve critical deficiencies in one-dimensional readiness frameworks (principally TRLs), especially their inability to capture application-specific validation stages, non-functional quality attributes, or regulatory compliance requirements. In domains such as automated driving, ARLs (or ADRLs) instantiate a nine-level, domain-adapted maturity model that replaces aviation- or general engineering-centric terminology with lifecycle stages transparently aligned with automotive engineering (software-in-the-loop [SiL], hardware-in-the-loop [HiL], vehicle-in-the-loop [ViL], on-road demonstration, certification, etc.) (Betz et al., 2024).

In AI-intensive defense and dual-use systems, ARLs are framed not merely as developmental milestones but as a multidimensional construct wherein an application is assigned a level only if it satisfies thresholds across five critical axes: Alignment, Justified Confidence, Governance, Data Readiness, and Human Readiness (Browne et al., 15 Apr 2025). In the cybersecurity/software sector, ARLs are derived from methodologies such as SMART (Software Maturity Assessment and Readiness Technique), resolving the TRL shortcoming of omitting operational quality, security, or risk metrics by demanding evidence-backed binary assessments per dimension (Kumari et al., 2022). In quantum information science, ARLs encapsulate the end-to-end feasibility, scalability, and practical utility of quantum algorithms against classical baselines, embedding system integration and architectural constraints into maturity labeling (Herrmann et al., 2023).

The central motivation across all implementations is to enable apples-to-apples maturity comparisons, transparent gap analysis ("white spot" identification), and regulatory or engineering decision support that is both fine-grained and directly actionable.

2. Canonical ARL Structures and Level Criteria

While all ARL frameworks maintain a staged character, the specific instantiations differ per field in count, granularity, and gating criteria. Several prominent ARL models are summarized in the following table, with additional explanatory detail after the table.

Domain | Levels | Key ARL Criteria and Examples
Automated Driving | 9 | SiL → HiL → ViL → real vehicle with safety driver → validation in ODD → certification → commercial use without safety driver (Betz et al., 2024)
Military/AI | 9 | Alignment spec → data readiness → lab validation → governance → human interface → stress-testing → operational pilot → certification → proven full deployment across all five readiness axes (Browne et al., 15 Apr 2025)
Software (SMART) | 6 | Idea → Research → Prototype → Pilot → Product → Outdated; with binary assessments in the dimensions Security, Risk, Operations, Enhancement (Kumari et al., 2022)
Quantum Utility | 5 | Concept → proof-of-concept → resource estimation/proof-of-scalability → full-stack simulation → demonstrated live utility vs. classical baseline (Herrmann et al., 2023)

In all cases, progression to higher ARLs is contingent on meeting both functional and non-functional criteria, substantiated by artifacts (test logs, audits, user feedback, regulatory certifications) or, in the case of the AI framework, multidimensional quantitative thresholds.

Automated Driving: Application Readiness Levels

For automated driving systems, ARLs 1–9 directly parallel TRLs 1–9, but each replaces flight/space development stages with automotive-relevant ones:

  • ARL 3: Software-in-the-loop testing in simulated ODD.
  • ARL 4: HiL—component validation in laboratory with real sensors/actuators, simulated environment.
  • ARL 6: Full-vehicle demonstration with safety driver in real-world ODD.
  • ARL 8: System certified for commercial operation in defined ODD (e.g., EU type approval).
  • ARL 9: Routine commercial deployment absent safety driver (Betz et al., 2024).
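The milestones spelled out above can be collected into a simple lookup; the sketch below covers only the levels the text describes (the remaining levels are omitted rather than guessed):

```python
# Partial map of automated-driving ARLs to their gating milestones,
# restricted to the levels explicitly described in this summary.
ADRL_MILESTONES = {
    3: "Software-in-the-loop testing in simulated ODD",
    4: "HiL: component validation with real sensors/actuators, simulated environment",
    6: "Full-vehicle demonstration with safety driver in real-world ODD",
    8: "System certified for commercial operation in defined ODD",
    9: "Routine commercial deployment without safety driver",
}

def milestone(level):
    """Return the gating milestone for a level, if described here."""
    return ADRL_MILESTONES.get(level, "not described in this summary")

print(milestone(6))
```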

AI and Military Systems

The ARL schema enforces coverage of five AI-centric axes. Each level (1–9) requires satisfying all five axes to the degree appropriate for the stage, with ARL 9 demanding proof from operational data and continuous lifecycle compliance (Browne et al., 15 Apr 2025).

Software/Cybersecurity (SMART-derived ARLs)

The ladder is shorter (six levels), with advancement predicated on positive binary evidence on assessment checklists for each major quality dimension. Levels are not traversed unless all relevant dimensions meet the required states (e.g., no “–” for core categories after the prototype stage) (Kumari et al., 2022).

Quantum Utility

Assessment proceeds from an initial hypothesis (ARL 1) through proof-of-concept, asymptotic scalability prediction, realistic simulation, and eventual empirical demonstration of practical advantage over a comparably resourced classical system, with clear reference to size, weight, power, and cost (SWaP-C) constraints (Herrmann et al., 2023).

3. Mapping ARLs to TRLs and Other Taxonomies

In all models, ARLs map closely to, but extend, TRLs by overlaying domain-contextualized validation requirements and maturity gates. Notably:

  • In automated driving, ARL $n$ ≡ TRL $n$, but automotive test environments (SiL/HiL/ViL/real vehicle) replace generic subsystem/stage labels. This ARL structure can be orthogonally mapped to SAE levels (driver responsibility) and operational design domain (ODD) complexity for richer multi-factor assessment (Betz et al., 2024).
  • In AI applications, ARLs add explicit requirements for dataset quality, explainability, continuous alignment auditing, governance, and operator readiness, thereby subsuming but also transcending the hardware/software system maturity focus of TRLs (Browne et al., 15 Apr 2025).
  • In software/cybersecurity, the SMART/ARL system introduces a two-dimensional assessment—separately scoring readiness progression and operational quality—mitigating the one-dimensionality of classic TRLs and providing a criterion-based audit chain suitable for regulatory compliance scenarios (Kumari et al., 2022).
  • Quantum ARLs elaborate on the gap between algorithmic theoretical promise and practical utility, introducing resource-scaling, hardware-mapping, and utility measurement steps missing from standard TRL ladders (Herrmann et al., 2023).

4. Quantitative Assessment and Scoring Methodologies

ARL systems increasingly depend on quantitative, evidence-based, and multi-attribute decision gates.

  • In military AI, per-axis normalized subscores $c_i \in [0,1]$ (Alignment, Confidence, Governance, Data, Human) are aggregated by

$$R = \sum_{i=1}^{5} w_i \, c_i, \qquad \sum_{i=1}^{5} w_i = 1,$$

and an overall readiness level is assigned only if both $R \geq R_n$ and each $c_i \geq c_{\text{min}}$ at that ARL stage. Example metrics: adversarial alignment-violation rates, binomial error confidence, feature-space data coverage, operator–AI task success rates, and governance checklist completion (Browne et al., 15 Apr 2025).
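A minimal sketch of this dual-gate rule, with hypothetical weights and thresholds (the source does not fix particular values):

```python
# Dual-gate ARL scoring for AI systems: an overall level is awarded only
# if the weighted sum R meets the stage threshold R_n AND every per-axis
# subscore clears c_min. All numeric values here are illustrative.

AXES = ["alignment", "confidence", "governance", "data", "human"]

def arl_gate(scores, weights, r_n, c_min):
    """Return True only if both the aggregate and per-axis gates pass.

    scores, weights: dicts keyed by axis name; subscores in [0, 1],
    weights summing to 1.
    """
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    r = sum(weights[a] * scores[a] for a in AXES)
    per_axis_ok = all(scores[a] >= c_min for a in AXES)
    return r >= r_n and per_axis_ok

# Example: a strong aggregate score, but one weak axis blocks the level.
w = {a: 0.2 for a in AXES}
s = {"alignment": 0.9, "confidence": 0.9, "governance": 0.9,
     "data": 0.9, "human": 0.4}
print(arl_gate(s, w, r_n=0.7, c_min=0.5))  # False: human axis below c_min
```

The per-axis floor is what distinguishes this from a plain weighted score: excellence on four axes cannot compensate for a deficient fifth.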

  • In SMART/ARL assessments, readiness at level $i$ is determined by the ratio $p_i = n_i^+ / N_i$ (the number of “yes” answers over all required checklist items), with thresholds $0 < T^0 < T^+ \leq 1$ distinguishing full ($+$), partial ($0$), and failed ($-$) category passage. Quality dimensions are each scored via weakest-link aggregation, and progression is blocked if any essential category is not satisfied after the prototype stage (Kumari et al., 2022).
  • Quantum ARLs rely on direct resource and performance measurements, e.g., execution time, energy use, and fidelity, against a classical baseline, often reporting composite utility metrics such as performance per watt per volume (GFLOP/Watt/Volume), but they do not specify a unique formula (Herrmann et al., 2023).
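The SMART pass-ratio rule above can be sketched as follows; the thresholds T0 = 0.5 and T+ = 0.9 are illustrative assumptions, not values fixed by the source:

```python
# SMART-style checklist scoring: p_i is the fraction of "yes" answers;
# a category passes fully (+) at p >= T_plus, partially (0) at
# p >= T_zero, and fails (-) below T_zero. Thresholds are illustrative.

def category_grade(yes_count, total, t_zero=0.5, t_plus=0.9):
    """Map a checklist pass ratio to '+', '0', or '-'."""
    p = yes_count / total
    if p >= t_plus:
        return "+"
    if p >= t_zero:
        return "0"
    return "-"

def may_advance(category_grades, past_prototype):
    """Weakest-link rule: after the prototype stage, no essential
    category may be graded '-'."""
    if past_prototype:
        return all(g != "-" for g in category_grades.values())
    return True

grades = {
    "security": category_grade(9, 10),     # '+'
    "risk": category_grade(6, 10),         # '0'
    "operations": category_grade(3, 10),   # '-'
    "enhancement": category_grade(8, 10),  # '0'
}
print(may_advance(grades, past_prototype=True))  # False: operations fails
```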

5. Application-Specific Illustrative Case Studies

Concrete examples clarify ARL instantiation:

  • Automated Driving: A truck highway pilot at ARL 6 corresponds to public highway operation with a safety driver; the Waymo robotaxi in San Francisco qualifies as ARL 9, denoting full commercial operation in its geofence, 24/7, across all approved weather/traffic conditions (Betz et al., 2024).
  • Defense AI: For an autonomous drone vision model, progression from ARL 3 (lab accuracy ≈ 95%) to ARL 7 (operational field trial, zero alignment violations, governance board sign-off) demonstrates the cumulative AI readiness protocol, including data coverage, adversarial stress-testing, and human-factor integration (Browne et al., 15 Apr 2025).
  • SMART/Software: ARL 5 is marked by verified production use with incident response and independently auditable quality data (e.g., SLA compliance, vulnerability disclosure); progression from ARL 3 (lab-validated prototype) to ARL 4 (pilot in a real environment) requires all quality axes to achieve neutral ($0$) or positive scores (Kumari et al., 2022).
  • Quantum Utility: As of the referenced analysis, no quantum application has reached ARL 5 (demonstrated live, superior to classical at comparable SWaP-C); the Variational Quantum Eigensolver stands at ARL 3 based on credible resource estimation predicting potential advantage at realistic scales, contingent on future device improvements (Herrmann et al., 2023).
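A SWaP-C-normalized comparison of the kind implied by ARL 5 can be sketched as below; the metric helper and all numbers are hypothetical, since the source names the metric family but fixes no formula:

```python
# Illustrative SWaP-normalized utility comparison (throughput per watt
# per litre). All figures are invented for illustration only.

def swap_utility(gflops, watts, litres):
    """Composite utility: throughput normalized by power and volume."""
    return gflops / (watts * litres)

quantum = swap_utility(gflops=50.0, watts=25_000.0, litres=2_000.0)
classical = swap_utility(gflops=1_000.0, watts=500.0, litres=50.0)
# An ARL 5 claim would require the quantum figure to exceed the
# classical baseline at comparable SWaP-C.
print(quantum > classical)
```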

6. Implications for Design, Regulation, and Comparative Analytics

Layering ARLs atop systems taxonomies (e.g., SAE level × ODD × ARL in automated driving) enables systematic identification of under-addressed technological “white spots” (e.g., no commercial urban German SAE 4 v2 systems), targeted research prioritization, and harmonized certification/regulatory strategies, by providing both a semantic and an evidentiary bridge between R&D, deployment/application, and ongoing monitoring (Betz et al., 2024). In AI, ARLs ensure that cross-cutting concerns such as alignment, governance, and human factors are not relegated to post-hoc audits but are incorporated as explicit readouts at each maturity stage (Browne et al., 15 Apr 2025).
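The white-spot scan described above amounts to a search over the taxonomy grid for cells with no known system; the catalog entries below are hypothetical placeholders, not systems drawn from the source:

```python
# Illustrative "white spot" scan over a SAE-level x ODD x ARL grid:
# grid cells with no cataloged system are candidate research gaps.
from itertools import product

sae_levels = [3, 4]
odds = ["highway", "urban"]
arls = range(6, 10)

known_systems = {
    (4, "urban", 9),    # e.g. a geofenced robotaxi service (hypothetical)
    (3, "highway", 8),  # e.g. a certified highway pilot (hypothetical)
}

white_spots = [cell for cell in product(sae_levels, odds, arls)
               if cell not in known_systems]
print(len(white_spots))  # 14 uncovered combinations in this toy grid
```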

Within cybersecurity, the ARL/SMART method provides the unambiguous, auditable evidence trail necessary for compliance with mandates such as GDPR or the NIS Directive, with progression controlled by independent verification rather than self-attestation (Kumari et al., 2022). In quantum technologies, adoption of ARLs promises to discipline claims of “quantum advantage” by demanding empirical, application-relevant demonstration of utility.

A plausible implication is that ARLs, when enforced rigorously, can shift industrial and academic focus from speculative maturity claims towards verifiable, competitive metrics that drive both trustworthy deployment and effective innovation targeting.
