
SOTIF: Safety of Intended Functionality

Updated 7 February 2026
  • SOTIF is a framework that ensures automated systems operate safely by addressing hazards from functional insufficiencies and incorrect operational assumptions.
  • It employs scenario-based analysis and probabilistic risk quantification to identify triggering conditions and validate system performance in complex, real-world contexts.
  • Integrating simulation, Bayesian networks, and hybrid architectures, SOTIF enhances detection of unknown hazards and improves safety in automated driving.

Safety of Intended Functionality (SOTIF) is a formalized concept introduced by ISO 21448 targeting the mitigation of hazards emerging not from system malfunctions—addressed by classical functional safety standards such as ISO 26262—but from functional insufficiencies or incorrect operational assumptions in the absence of hardware/software faults. SOTIF’s primary focus is on ensuring that automated driving systems (ADS) present no unreasonable risk arising from performance limitations, incomplete specifications, or reasonably foreseeable misuse, even when all components are operating as intended (Patel et al., 4 Mar 2025). The framework mandates scenario-based identification, analysis, and validation of these hazards, particularly in complex, open-world contexts encountered by advanced driver assistance and fully automated systems.

1. Formal Framework and Core Definitions

ISO 21448 formally defines SOTIF as “the absence of unreasonable risk due to hazards resulting from functional insufficiencies of the intended functionality or its implementation” (Putze et al., 2023, Patel et al., 4 Mar 2025). Risk under SOTIF is decomposed into the probability of harm $P(H)$, which is modeled as the joint probability of triggering conditions, hazardous behaviors, and exposure scenarios:

$$P(H) \leq \sum_{T,B,E} P(T)\,P(B \mid T)\,P(E \mid B,T)\,P(H \mid E,B,T)$$

where $T$ are triggering conditions (environmental or operational factors that activate a performance insufficiency), $B$ are hazardous system behaviors, $E$ are hazardous events, and $H$ denotes harm (Putze et al., 2023). Acceptance criteria are quantitative upper bounds (e.g., $P(H) \le R_H^{\max}$), with validation targets converted to equivalent operational test hours or kilometers under Poisson statistical assumptions.
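Under the Poisson assumption just described, a zero-failure validation target can be converted into a required test exposure. A minimal sketch (the function name and the 95% confidence default are illustrative, not prescribed by ISO 21448):

```python
import math

def required_exposure(rate_max: float, confidence: float = 0.95) -> float:
    """Exposure (hours or km) needed to claim, at the given confidence,
    that the hazard rate does not exceed rate_max, assuming a homogeneous
    Poisson process and zero observed hazardous events during the test."""
    # P(0 events in T) = exp(-rate_max * T); require this <= 1 - confidence.
    return -math.log(1.0 - confidence) / rate_max

# e.g., acceptance criterion of 1e-7 hazardous events per hour at 95% confidence
hours = required_exposure(1e-7, 0.95)  # roughly 3.0e7 test hours
```

The inverse dependence on the acceptance rate is what makes statistical validation of very small residual risks so expensive in practice.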

The SOTIF operational scenario space is partitioned into four areas:

  • Area 1: Known, Non-Hazardous
  • Area 2: Known, Hazardous
  • Area 3: Unknown, Hazardous
  • Area 4: Unknown, Non-Hazardous

SOTIF prescribes workflows to reduce risks in both Areas 2 and 3, systematically closing safety gaps and discovering unknown unsafe scenarios (Patel et al., 4 Mar 2025).
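The four-area partition amounts to a lookup over two binary attributes; a trivial sketch (the helper name is hypothetical):

```python
def sotif_area(known: bool, hazardous: bool) -> int:
    """Map a scenario's (known, hazardous) status to its SOTIF area (1-4)."""
    if known:
        return 2 if hazardous else 1
    return 3 if hazardous else 4

assert sotif_area(known=True, hazardous=False) == 1
```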

2. Hazard Categories: Specification and Performance Insufficiencies

SOTIF hazards emerge from two main sources (Patel et al., 4 Mar 2025, Fu et al., 2024):

  • Specification insufficiencies: Missing, ambiguous, or incomplete definition of intended functionality or operational design domain (ODD).
  • Performance insufficiencies: Inherent limitations in sensors, perception, prediction, actuation, or algorithmic modules, leading the system to unsafe behavior in certain scenarios even when no fault occurs.

Fu et al. provide a refined taxonomy of output insufficiencies in ADS, dividing them into:

  • World model insufficiencies (e.g., missed objects, incorrect localization)
  • Motion plan insufficiencies (e.g., unsafe or indeterminate trajectories)
  • Traffic rule insufficiencies (e.g., missed or misinterpreted traffic signals; rule violations)
  • ODD attribute insufficiencies (e.g., failed weather or road condition classification) (Fu et al., 2024)

The formal indicator for a functional insufficiency (FI) at a scenario $x$ is:

$$\text{FI}(x) = \begin{cases} 1 & \text{if } d(f_{\text{act}}(x), f_{\text{spec}}(x)) > \epsilon \\ 0 & \text{otherwise} \end{cases}$$

with $d(\cdot, \cdot)$ a domain-appropriate distance or error metric, $f_{\text{act}}$ the system's output, $f_{\text{spec}}$ the specified/intended function, and $\epsilon$ the allowable bound (Fu et al., 2024).
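The indicator translates directly into code. A sketch with scalar outputs and absolute error as the metric $d$; all concrete functions and the bound are illustrative:

```python
def functional_insufficiency(f_act, f_spec, d, epsilon, x) -> int:
    """Indicator FI(x): 1 if the actual output deviates from the specified
    (intended) output by more than the allowable bound epsilon, else 0."""
    return 1 if d(f_act(x), f_spec(x)) > epsilon else 0

# Illustrative only: a system with a 10% multiplicative error.
fi = functional_insufficiency(
    f_act=lambda x: x * 1.1,      # hypothetical system output
    f_spec=lambda x: x,           # hypothetical intended function
    d=lambda a, b: abs(a - b),    # absolute error as the metric d
    epsilon=0.5,
    x=10.0,
)
# at x=10: |11.0 - 10.0| = 1.0 > 0.5, so fi == 1
```

In a real ADS, $d$ would be a perception- or planning-specific metric (e.g., IoU deficit, lateral deviation) rather than scalar error.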

3. Triggering Conditions: Identification and Modeling

A central concept in SOTIF is the "triggering condition" (TC): a scenario attribute or environmental factor that can activate a performance insufficiency, possibly leading to a hazard (Jiménez et al., 2023, Adee et al., 2023). TCs are identified systematically from the function definition and the ODD specification, then instantiated as sets of parameterized constraints, with each $TC_i$ defined as a conjunction $\{c_j\}$ of triples

$$c_j = (\text{param}, \text{op}, \text{value})$$

(e.g., $(\text{visibility}, \leq, 500\ \text{m})$).
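The constraint-triple representation can be evaluated mechanically against a concrete scenario. A sketch, where the operator table and the added speed constraint are assumptions for illustration:

```python
import operator

# Supported comparison operators for (param, op, value) constraints.
OPS = {"<=": operator.le, ">=": operator.ge, "<": operator.lt,
       ">": operator.gt, "==": operator.eq}

def tc_active(constraints, scenario) -> bool:
    """True if the scenario satisfies every constraint in the conjunction,
    i.e., the triggering condition is present."""
    return all(OPS[op](scenario[param], value)
               for param, op, value in constraints)

# Hypothetical TC: low visibility combined with highway speed.
fog_tc = [("visibility_m", "<=", 500), ("speed_kph", ">=", 80)]
assert tc_active(fog_tc, {"visibility_m": 300, "speed_kph": 100})
```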

A systematic procedure iterates over function definition, ODD specification, TC listing, scenario construction, and hazard analysis, with qualitative and quantitative risk assessment at each step (Jiménez et al., 2023).

4. Methodologies for Risk Quantification and Validation

SOTIF risk quantification leverages scenario-based testing (simulation and real-world), probabilistic modeling, and uncertainty analysis to establish that residual risk remains below defined acceptance criteria (Putze et al., 2023, Patel et al., 5 Mar 2025):

  • Quantitative Decomposition: Hierarchical modeling of P(H)P(H) enables risk sub-targets for triggering conditions and system behaviors (Putze et al., 2023).
  • Simulation-Based Evaluation: Virtual test environments (e.g., CARLA) are populated by parameterized TCs to exhaustively assess detection system robustness under challenging environmental variants (e.g., heavy rain, fog, night) (Patel et al., 5 Mar 2025, Li et al., 2024).
  • Uncertainty Characterization: Composite frameworks (e.g., Dempster-Shafer Theory, Deep Ensembles entropy) quantify the impact and propagation of epistemic and aleatoric uncertainty in perception and decision pipelines (Patel et al., 3 Mar 2025, Peng et al., 2022).
  • Scenario Statistics and Coverage: Real driving data is abstracted into scenario parameter sets; empirical scenario coverage and risk are calculated from occurrence rates $\hat p(S_i) = n_i / N_{\text{total}}$ (Reichenbächer et al., 2024).
  • Hybrid and Redundant Architectures: Multi-channel safety architectures (e.g., Daruma pattern) dynamically select the channel least likely to incur a FI at run-time, using cross-channel risk minimization and arbitration (Fu et al., 2024).
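The occurrence-rate estimate $\hat p(S_i) = n_i / N_{\text{total}}$ from the coverage bullet above is a simple frequency count over abstracted scenario labels. A sketch (the scenario labels are invented):

```python
from collections import Counter

def scenario_rates(observed_scenarios):
    """Empirical occurrence rate p(S_i) = n_i / N_total for each abstract
    scenario class observed in the driving data."""
    counts = Counter(observed_scenarios)
    total = sum(counts.values())
    return {scenario: n / total for scenario, n in counts.items()}

rates = scenario_rates(["cut_in", "cut_in", "fog_follow", "cut_in"])
# rates["cut_in"] == 0.75, rates["fog_follow"] == 0.25
```

In practice these rates weight per-scenario risk estimates and expose under-covered regions of the ODD.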

5. SOTIF in Perception and AI-Enabled Systems

Complexity and inherent non-determinism in AI-enabled perception and planning require advanced SOTIF error and risk models. Recent approaches include:

  • Temporal Error Modeling: The STEAM model extends the SOTIF cause-effect chain to include hazardous error sequences and temporal error patterns (e.g., consecutive false negatives in detection), and connects them to hazardous behavior patterns at the vehicle level (Czarnecki et al., 2023). Weakest precondition reasoning and temporal fault trees formalize the mapping from perception-level errors to high-level safety outcomes.
  • Risk-Aware Control: Real-time quantification of module uncertainty (e.g., via SOTIF entropy) is propagated to decision making and planning (e.g., uncertainty-inflated artificial potential fields and risk-aware MPC) to trigger graceful degradation or emergency maneuvers (Peng et al., 2022).
  • Large Vision-Language Models (LVLMs): These models demonstrate elevated semantic recall for long-tail, degraded scenarios, surpassing classical detectors in recall under SOTIF conditions but trading off geometric precision. Hybrid architectures fuse YOLO-like geometric detectors with LVLM-based semantic validators for high SOTIF resilience (Zhou et al., 30 Jan 2026).
  • Multimodal LLMs for Perception SOTIF: Fine-tuned multimodal LLMs using SOTIF-focused datasets (e.g., DriveSOTIF) capture rare trigger conditions and produce improved risk reasoning and recommended reactions, narrowing SOTIF perception gaps (Huang et al., 11 May 2025).
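The risk-aware control idea above can be illustrated with plain Shannon entropy of a classifier's confidence vector standing in for the SOTIF entropy of Peng et al.; the 0.8 nat threshold and the two-mode gate are arbitrary assumptions, not values from the paper:

```python
import math

def shannon_entropy(probs) -> float:
    """Shannon entropy (in nats) of a categorical confidence distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def control_mode(class_probs, threshold: float = 0.8) -> str:
    """Hypothetical run-time gate: fall back to a degraded (cautious) mode
    when perception uncertainty exceeds a calibrated threshold."""
    return "degraded" if shannon_entropy(class_probs) > threshold else "nominal"

assert control_mode([0.98, 0.01, 0.01]) == "nominal"   # confident detection
assert control_mode([0.40, 0.35, 0.25]) == "degraded"  # ambiguous detection
```

A deployed system would propagate such an uncertainty signal into the planner (e.g., inflated potential fields or tightened MPC constraints) rather than a binary mode switch.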

6. SOTIF Validation Campaigns and Toolchain Integration

Toolchain integration adapts scenario-based V&V suites to systematically enumerate and test TCs, with ontology-based scenario constraint packages extending OpenSCENARIO/ASAM ontologies (Jiménez et al., 2023, Reichenbächer et al., 2024). Workflows typically include:

  • Base scenario design and parametrization across ODD dimensions
  • Centralized scenario data management with ontological traceability
  • Automated test-case generation spanning pathological environmental and behavioral combinations
  • Execution and key performance indicator (KPI) monitoring (e.g., minimum safe distance, time-to-collision)
  • Systematic refinement of ODD and hazard thresholds in response to test outcomes

Adoption of these tools has resulted in measurable improvements, e.g., up to 35% increased hazardous-envelope test coverage and ∼20% additional unknown hazard discovery in simulation (Jiménez et al., 2023).

7. Open Challenges and Research Directions

Persisting challenges in SOTIF include:

  • Lack of unified evaluation metrics for risk and acceptance (no universal risk-norm or entropy measure) and difficulty in cross-lab comparison (Patel et al., 4 Mar 2025).
  • Scalability in TC enumeration—manual parametrization is inadequate for infinite scenario variety.
  • Integration of human factors, ethics, and driver-automation interaction into SOTIF arguments, particularly in handover and misuse scenarios (Patel et al., 5 Mar 2025, Patel et al., 4 Mar 2025).

Emerging research focuses on:

  • Machine learning–assisted scenario generation and TC discovery,
  • Uncertainty management and online risk estimation in operational and fielded ADS,
  • Continuous ODD boundary refinement and feedback from operational monitoring into the hazard database,
  • Template libraries and formal tools for temporal fault-tree construction,
  • ML-perception robustness to open-set and distributionally shifted conditions (Czarnecki et al., 2023, Huang et al., 11 May 2025).

SOTIF thus demands an interdisciplinary approach—integrating system engineering, probabilistic safety modeling, AI runtime assurance, and statistical scenario population modeling—to guard automated driving systems against the full spectrum of non-fault hazards across their operational life cycle.

References (16)
