- The paper introduces a Bayesian GEV framework that uses a non-linear link function to capture the complex dynamics of pedestrian crash risk.
- It employs advanced video analytics and block maxima EVT to process trajectory data and compute cycle-specific conflict metrics.
- The approach normalizes risk using a Modified Crash Risk (MRC) metric, significantly improving prediction accuracy in heterogeneous traffic conditions.
Trajectory-Based Real-Time Pedestrian Crash Prediction Under Heterogeneous Traffic: A Non-Linear Bayesian GEV Approach
Introduction
This study addresses a critical challenge in urban traffic safety: real-time estimation of pedestrian crash risk at signalized intersections under heterogeneous, non-lane-based traffic. Prior work has often assumed a linear relationship between model covariates and crash probability. That assumption systematically mischaracterizes the non-monotonic, context-dependent risk dynamics of dense urban intersections, especially in developing contexts with mixed motorized and non-motorized traffic and routine pedestrian risk-taking. This work proposes a Bayesian hierarchical framework based on Extreme Value Theory (EVT) with Block Maxima (BM) for real-time crash risk estimation. Its main contribution to model expressivity is a non-linear link function in the Generalized Extreme Value (GEV) formulation.
Methodological Innovations
Advanced Sensing and Trajectory Processing
The real-time analytics pipeline employs video-based detection and tracking via the convolutional-neural-network-based DEEGITS framework, which applies transfer learning with YOLOv8 and DeepSORT for robust multi-class tracking in high-occlusion, heterogeneous scenes. The pipeline relies on Dhaka-specific training data and extensive augmentation to adapt to the local environment. Camera calibration with a second-degree polynomial correction addresses perspective distortion, allowing accurate extraction of spatio-temporal trajectories for all road users.
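The calibration step is not described in further detail here, so the following is only a minimal sketch of what a second-degree polynomial correction could look like: a least-squares fit from pixel coordinates (u, v) to one ground-plane coordinate using the six monomials up to degree two. The function names, the control points, and the use of normal equations are all assumptions made for illustration, not the authors' procedure.

```python
from typing import List, Sequence, Tuple


def quad_features(u: float, v: float) -> List[float]:
    """Monomials of (u, v) up to total degree two."""
    return [1.0, u, v, u * u, u * v, v * v]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: control points are degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_quadratic_map(pixels: Sequence[Tuple[float, float]],
                      targets: Sequence[float]) -> List[float]:
    """Least-squares coefficients mapping pixels to one world coordinate.

    Hypothetical helper: needs at least six non-degenerate control points.
    """
    rows = [quad_features(u, v) for u, v in pixels]
    k = 6
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear(ata, atb)


def apply_map(coeffs: Sequence[float], u: float, v: float) -> float:
    """Evaluate the fitted polynomial at pixel (u, v)."""
    return sum(c * f for c, f in zip(coeffs, quad_features(u, v)))
```

In practice the fit would be run twice, once per ground-plane axis, on surveyed control points. Rescaling pixel coordinates to a small range first keeps the normal equations well conditioned.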
Conflict Detection and Surrogate Safety Metrics
Conflict points are identified algorithmically as intersections between the polylines of pedestrian and vehicular trajectories. Post-Encroachment Time (PET), an established surrogate for crash risk, is the time gap between the first road user leaving the conflict point and the second arriving at it. PET is computed cycle-wise, allowing aggregation at traffic signal phase granularity. The study adopts a PET < 5 s threshold for conflict event selection, balancing strictness against data sufficiency for extreme value analysis.
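The conflict-detection step can be sketched as follows. This is a simplified illustration, assuming each trajectory is a time-stamped polyline of (t, x, y) points and that road users are treated as points, so PET reduces to the absolute difference of the two interpolated arrival times at the crossing. The names `segment_crossing`, `post_encroachment_time`, and `critical_pet_per_cycle` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Sample = Tuple[float, float, float]  # (t in seconds, x in metres, y in metres)


def segment_crossing(p1: Sample, p2: Sample,
                     q1: Sample, q2: Sample) -> Optional[Tuple[float, float]]:
    """Return fractions (s, u) along p1->p2 and q1->q2 where they cross."""
    rx, ry = p2[1] - p1[1], p2[2] - p1[2]
    sx, sy = q2[1] - q1[1], q2[2] - q1[2]
    denom = rx * sy - ry * sx
    if denom == 0:  # parallel or collinear segments: no single crossing point
        return None
    dx, dy = q1[1] - p1[1], q1[2] - p1[2]
    s = (dx * sy - dy * sx) / denom
    u = (dx * ry - dy * rx) / denom
    if 0.0 <= s <= 1.0 and 0.0 <= u <= 1.0:
        return s, u
    return None


def post_encroachment_time(ped: Sequence[Sample],
                           veh: Sequence[Sample]) -> Optional[float]:
    """Smallest PET over all crossings of two trajectories, or None."""
    best: Optional[float] = None
    for a1, a2 in zip(ped, ped[1:]):
        for b1, b2 in zip(veh, veh[1:]):
            hit = segment_crossing(a1, a2, b1, b2)
            if hit is None:
                continue
            s, u = hit
            t_ped = a1[0] + s * (a2[0] - a1[0])  # time pedestrian is at the point
            t_veh = b1[0] + u * (b2[0] - b1[0])  # time vehicle is at the point
            pet = abs(t_veh - t_ped)
            if best is None or pet < best:
                best = pet
    return best


def critical_pet_per_cycle(pets_by_cycle: Dict[int, List[float]],
                           threshold: float = 5.0) -> Dict[int, float]:
    """Keep conflicts with PET < threshold; return the smallest per cycle."""
    out: Dict[int, float] = {}
    for cycle, pets in pets_by_cycle.items():
        kept = [p for p in pets if p < threshold]
        if kept:
            out[cycle] = min(kept)
    return out
```

Cycles with no conflict under the threshold are simply absent from the result. How such empty blocks are handled in the actual analysis is not stated here.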
Within each cycle (block), the most critical (smallest) PET value is extracted; in block-maxima EVT, PET is conventionally negated so that the most severe conflict in a cycle becomes the block maximum. These maxima, across cycles and intersections, are modeled as GEV-distributed, supporting statistical inference on the distributional tail (i.e., the most dangerous events). Model parameters (location, scale, shape) are related to traffic and behavioral covariates through either linear or non-linear link functions. The non-linear link, parameterized by exponents θ, lets the effect of a covariate (e.g., flow, speed) on risk vary with the covariate's magnitude or with interactions among covariates, capturing threshold effects and diminishing or increasing marginal effects.
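To make the modeling step concrete, the sketch below evaluates the standard GEV distribution function G(x) = exp(-[1 + ξ(x - μ)/σ]^(-1/ξ)) and derives a per-cycle crash risk from it. Two parts are assumptions rather than details taken from the study: the power-form link μ = β0 + Σ βk·xk^θk is one plausible reading of "parameterized by exponents θ", and the crash-risk definition 1 - G(0) follows the common convention of fitting negated PET, where a crash corresponds to a block maximum at or above zero. All coefficient values in any usage are illustrative.

```python
import math
from typing import Sequence


def gev_cdf(x: float, mu: float, sigma: float, xi: float) -> float:
    """GEV distribution function; uses the Gumbel limit when xi is ~0."""
    if sigma <= 0.0:
        raise ValueError("scale must be positive")
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # x lies outside the support: below the lower endpoint when xi > 0,
        # above the upper endpoint when xi < 0.
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def nonlinear_location(beta0: float, betas: Sequence[float],
                       thetas: Sequence[float],
                       covariates: Sequence[float]) -> float:
    """Assumed power-form link: mu = beta0 + sum(beta_k * x_k ** theta_k).

    Covariates must be positive for fractional exponents to be real-valued.
    """
    return beta0 + sum(b * x ** th
                       for b, th, x in zip(betas, thetas, covariates))


def crash_risk(mu: float, sigma: float, xi: float) -> float:
    """P(negated PET >= 0) under the fitted GEV, i.e. 1 - G(0)."""
    return 1.0 - gev_cdf(0.0, mu, sigma, xi)
```

Setting every θk to 1 recovers the linear link, which is what makes the linear specification a special case of the non-linear one under this assumed form.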
Bayesian inference, with uninformative priors, is conducted using MCMC with Gibbs sampling, providing both uncertainty quantification and regularization in the high-dimensional, hierarchically structured parameter space. Models are compared using the Deviance Information Criterion (DIC), which balances goodness of fit against effective model complexity; lower values indicate a better trade-off.
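As a rough illustration of the model-comparison step, DIC can be computed from posterior draws as DIC = D̄ + p_D, where D(θ) = -2 log L(θ), D̄ is the posterior mean deviance, and p_D = D̄ - D(θ̄) is the effective number of parameters. The sketch assumes draws of (μ, σ, ξ) for a stationary GEV are already available from some sampler; the sampler itself, the function names, and the data are hypothetical and not taken from the study.

```python
import math
from statistics import fmean
from typing import List, Sequence, Tuple

Params = Tuple[float, float, float]  # (mu, sigma, xi)


def gev_logpdf(x: float, mu: float, sigma: float, xi: float) -> float:
    """Log-density of the GEV distribution (-inf outside the support)."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:  # Gumbel limit
        return -math.log(sigma) - z - math.exp(-z)
    t = 1.0 + xi * z
    if t <= 0.0:
        return -math.inf
    return -math.log(sigma) - (1.0 + 1.0 / xi) * math.log(t) - t ** (-1.0 / xi)


def deviance(params: Params, data: Sequence[float]) -> float:
    """D(theta) = -2 * log-likelihood of the block maxima."""
    mu, sigma, xi = params
    return -2.0 * sum(gev_logpdf(x, mu, sigma, xi) for x in data)


def dic(draws: List[Params], data: Sequence[float]) -> Tuple[float, float]:
    """Return (DIC, p_D) from posterior draws."""
    d_bar = fmean([deviance(p, data) for p in draws])
    post_mean = (
        fmean([p[0] for p in draws]),
        fmean([p[1] for p in draws]),
        fmean([p[2] for p in draws]),
    )
    p_d = d_bar - deviance(post_mean, data)
    return d_bar + p_d, p_d
```

Candidate models would each get their own draws and deviance function, and the one with the smallest DIC would be preferred.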
Behavioral Adjustment: Modified Crash Risk (MRC)
A key conceptual advance is the Modified Crash Risk (MRC) metric. Classical exceedance probabilities implicitly treat every "dangerous" event as equally abnormal. MRC instead normalizes crash risk at a given signal cycle by the background level of habitual risk (mean plus a tolerance, e.g., z_cr = 1.45, corresponding to 93% confidence). This filters out context-specific, routine risk-taking (per Risk Homeostasis Theory) and highlights only signal cycles where pedestrian-vehicle interaction risk is abnormally above the local norm. Such filtering is essential in environments (e.g., Dhaka) where crossing behaviors and infrastructural constraints produce high baseline exposure that would otherwise confound generic methods.
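The exact MRC formula is not reproduced here, so the following sketch covers only the thresholding idea: a cycle is treated as abnormal when its estimated risk exceeds the site's mean risk plus z_cr standard deviations. Using the population standard deviation, and the function names themselves, are assumptions made for illustration. For reference, the standard normal CDF at 1.45 is about 0.9265, which matches the stated 93% when read as a one-sided level.

```python
from statistics import NormalDist, fmean, pstdev
from typing import List, Sequence

Z_CR = 1.45  # tolerance multiplier quoted in the text


def habitual_threshold(risks: Sequence[float], z_cr: float = Z_CR) -> float:
    """Assumed baseline: mean per-cycle risk plus z_cr standard deviations."""
    return fmean(risks) + z_cr * pstdev(risks)


def flag_abnormal_cycles(risks: Sequence[float],
                         z_cr: float = Z_CR) -> List[int]:
    """Indices of cycles whose risk exceeds the habitual baseline."""
    threshold = habitual_threshold(risks, z_cr)
    return [i for i, r in enumerate(risks) if r > threshold]


# One-sided coverage implied by z_cr under a standard normal distribution.
one_sided_level = NormalDist().cdf(Z_CR)
```

Only the flagged cycles would then contribute to the behaviorally adjusted risk, with routine cycles treated as background.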
Numerical Results and Empirical Insights
Model Comparison and Selection
Seven model variants are estimated: stationary, linear (location, scale, both), and non-linear (including location only and both parameters). DIC selects Model 3(a), which uses a non-linear link for the location parameter only, as the best fit; it markedly reduces the empirical mean error in five-year crash prediction relative to the linear and stationary alternatives. Using non-linear links for both parameters (Model 4) resulted in worse convergence and inferior fit, possibly due to parameter identifiability issues.
Notably, the MRC method, coupled with the non-linear GEV specification, brings predicted crash counts close to observed ones: 40.57 predicted versus 36 observed over five years at nine sites, an overestimate of about 13%. Generic crash risk estimation without normalization, by contrast, overestimates risk by more than an order of magnitude. This is a clear improvement over established approaches in analogous contexts.
Covariate Effects and Structural Interpretations
Model 3(a) identifies pedestrian speed as the most influential negative predictor: higher speeds reduce per-cycle risk, likely because pedestrians spend less time in conflict zones, and a significant exponent indicates a diminishing marginal protective effect. In contrast, motorized vehicle flow, pedestrian flow, motorized vehicle speed, and non-motorized vehicle conflicting speed are all positive predictors, but with non-linear sensitivity: their contribution to risk is substantial at low-to-moderate flow or speed and tapers at high density, reflecting congestion-induced caution or saturation effects.
Conflicting flow and speed of MVs can reduce risk, consistent with the safety-in-numbers and forced caution hypotheses, while high NMV conflicting speed contributes positively, more so at lower speeds/flows, indicating possible misjudgment of less-regulated road users.
Real-Time Assessment and Practical Use
The framework can generate GEV-based risk estimates cycle by cycle in real time, flagging specific cycles for intervention (e.g., adaptive pedestrian phasing, preemptive motorist signaling). By quantifying only those events that are statistically anomalous given local behavior patterns, the model supports targeted, efficient allocation of safety resources, rather than blanket interventions triggered by habitual baseline risk levels.
Implications and Future Directions
This research makes an empirical and theoretical case for abandoning simplistic linearity assumptions in EVT-based pedestrian risk modeling for mixed-traffic environments in favor of flexible, non-linear link structures. The findings challenge previous results that rely on linear GEVs, demonstrating systematic underfitting in the presence of traffic heterogeneity and high baseline behavioral risk. The behavioral normalization via MRC is a critical innovation for translating conflict-based models to developing-world cities with dense, mixed, informally regulated traffic.
From a methodological perspective, the combined use of hierarchical Bayesian inference, non-linear GEVs, and real-time sensing sets a new standard for urban surrogate safety analytics. Practically, the framework is directly applicable for dynamic signal control, pedestrian warning systems, and infrastructure prioritization.
Future work should seek to integrate socio-demographic stratification in behavioral adjustment (e.g., age, gender disaggregation), advance multi-modal conflict modeling (e.g., including bicyclists and micro-mobility), and explore deep learning representations for surrogate indicator extraction. Further, expanding the approach to unsignalized or partially controlled intersections would generalize its utility.
Conclusion
This study establishes a robust, flexible platform for real-time pedestrian crash risk estimation in heterogeneous, mixed-mode traffic environments, substantiating the critical value of non-linear link functions in GEV modeling and the necessity of behaviorally normalized risk metrics. The work has immediate implications for proactive traffic safety management, especially in high-risk, rapidly urbanizing contexts where traditional models are known to fail. The approach represents a methodological advance for surrogate-based traffic safety analysis.