Extreme Value Theory in PET Analysis
- Extreme Value Theory for PET is a framework that analyzes near-miss conflicts by modeling the tail behavior of negated PET data.
- It employs methods like the peaks-over-threshold approach with GPD, declustering, and Markov chain modeling to address temporal dependence.
- Empirical findings, including LPI intervention studies, show that EVT models can quantify safety improvements by increasing PET values and reducing near-miss severity.
Post-encroachment time (PET) is a quantitative surrogate for pedestrian–vehicle conflict, defined as the time difference between a vehicle and a pedestrian successively traversing the same spatial point. Low PET values indicate near-miss scenarios and, in the limit, actual collisions. Because collision events are rare and PET near-misses provide a richer sample of hazardous interactions, statistical models that accurately capture PET extremes enable rapid evaluation of traffic-safety interventions. Extreme value theory (EVT) offers a rigorous framework for analyzing the distributional tail behavior of PET and for predicting future extreme conflict events, supporting both inference and proactive decision-making in road safety applications.
1. Mathematical Foundations of PET and EVT
PET is formally defined as $\mathrm{PET} = |t_v - t_p|$, where $t_v$ and $t_p$ are the times at which a vehicle and a pedestrian cross a common spatial location. Analytical focus centers on small PET values, which encode near-miss severity. To align with the conventions of EVT, which handles upper-tail behavior, the PET series is negated: $Y = -\mathrm{PET}$. Thus, large values of $Y$ correspond to hazardous events.
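A minimal sketch of the definition and the negation step follows. The function names and the crossing times are illustrative assumptions, not values from the cited studies.

```python
def post_encroachment_time(t_vehicle: float, t_pedestrian: float) -> float:
    """Absolute time gap (s) between the two road users crossing the same point."""
    return abs(t_vehicle - t_pedestrian)


def negate_pet(pets: list[float]) -> list[float]:
    """Map PET to Y = -PET so that dangerous (small) PETs become large values."""
    return [-p for p in pets]


# Hypothetical crossing times in seconds: (vehicle, pedestrian).
crossings = [(12.0, 10.5), (30.2, 33.0), (47.9, 47.5)]
pets = [post_encroachment_time(tv, tp) for tv, tp in crossings]
negated = negate_pet(pets)
most_severe = max(negated)  # upper tail of Y corresponds to the smallest PET
```

After negation, the most hazardous interaction is simply the maximum of the series, which is the quantity the upper-tail EVT machinery is built for.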
EVT provides two primary schemes: the block maxima approach, fitting Generalized Extreme Value (GEV) distributions to maxima within blocks (e.g., daily or weekly), and the peaks-over-threshold (POT) method, fitting the Generalized Pareto Distribution (GPD) to exceedances above a high threshold. In traffic safety, POT is preferred due to denser tails and greater statistical efficiency—block maxima often wastes observations given the rarity of actual collisions (Hewett et al., 2023).
2. Statistical Modeling of PET Extremes
Given negated PET data $Y_1, \dots, Y_n$ and a threshold $u$, POT considers the exceedances $Z_i = Y_i - u$ for $Y_i > u$. EVT assures that, for sufficiently high $u$, the conditional distribution of $Z = Y - u$ given $Y > u$ is approximated by the GPD, $H(z; \sigma, \xi) = 1 - (1 + \xi z/\sigma)^{-1/\xi}$, where $\xi$ is the shape (tail index) and $\sigma > 0$ the scale. If $\xi < 0$, the tail is light and bounded, reflecting finite maximal near-miss severity; $\xi > 0$ signals heavy, Pareto-type tails; $\xi \to 0$ recovers the exponential case (Padoan et al., 6 Apr 2025).
Threshold selection is guided by mean residual life (MRL) plots, seeking approximately linear mean excesses above a candidate $u$, and by parameter stability plots for the fitted GPD parameters. Diagnostic tools include probability-probability (P–P) and quantile-quantile (Q–Q) plots, and goodness-of-fit tests (e.g., Anderson–Darling) on residuals.
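The exceedance extraction and the GPD distribution function can be sketched as below. This assumes the standard GPD parameterization with scale `sigma` and shape `xi`; the function names and the sample values are hypothetical.

```python
import math


def exceedances(y: list[float], u: float) -> list[float]:
    """Excesses z = y - u for the observations strictly above the threshold u."""
    return [v - u for v in y if v > u]


def gpd_cdf(z: float, sigma: float, xi: float) -> float:
    """GPD distribution function H(z; sigma, xi) for an excess z."""
    if z <= 0:
        return 0.0
    if abs(xi) < 1e-12:          # exponential limit as xi -> 0
        return 1.0 - math.exp(-z / sigma)
    t = 1.0 + xi * z / sigma
    if t <= 0:                   # past the finite endpoint -sigma/xi (only when xi < 0)
        return 1.0
    return 1.0 - t ** (-1.0 / xi)


# Hypothetical negated PETs (s) and threshold u = -1.0, i.e. PET below 1 s.
y = [-3.0, -1.2, -0.4, -2.5, -0.9]
z = exceedances(y, u=-1.0)
```

With a negative shape the excess distribution has a finite endpoint at `-sigma/xi`, which is the "bounded worst-case severity" reading of the tail.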
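The quantity behind an MRL plot is the empirical mean excess over each candidate threshold. The sketch below computes it without plotting; the function names and the synthetic data are assumptions for illustration.

```python
def mean_excess(y: list[float], u: float) -> float:
    """Average of (y - u) over the observations above u; NaN if there are none."""
    excess = [v - u for v in y if v > u]
    return sum(excess) / len(excess) if excess else float("nan")


def mrl_curve(y: list[float], thresholds: list[float]) -> list[tuple[float, float]]:
    """(threshold, mean excess) pairs, i.e. the points drawn in an MRL plot."""
    return [(u, mean_excess(y, u)) for u in thresholds]


# Synthetic check: evenly spaced values on [0, 1] behave like a bounded tail,
# so the mean excess should fall roughly linearly as the threshold rises.
y = [i / 1000 for i in range(1001)]
curve = mrl_curve(y, [0.5, 0.6, 0.7, 0.8])
```

Under a GPD tail the mean excess is linear in the threshold with slope $\xi/(1-\xi)$, so one looks for the lowest threshold above which the curve is approximately straight.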
3. Incorporating Temporal Dependence and Covariate Effects
PET time series frequently exhibit temporal clustering, especially during rush hours, and correlation persists between successive extremes. Standard POT assumes independence, an assumption that is indefensible for raw PET time series.
Two remedies are employed:
- Declustering uses an auxiliary run parameter $r$ (e.g., $r = 10$ consecutive non-exceedances) to isolate clusters presumed independent, retaining only peak exceedances. This approach is highly sensitive to $r$ and sacrifices substantial data.
- First-order extreme-value Markov chain modeling retains all exceedances and employs bivariate copulas (e.g., logistic family) with unit-Fréchet margins to encode extremal dependence, parameterized by $\alpha$. The estimated extremal index $\theta$ quantifies cluster structure, with $1/\theta$ representing average cluster size (Hewett et al., 2023).
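The first remedy, runs declustering, can be sketched in a few lines. The ratio of clusters to exceedances used here is the simple runs estimator of the extremal index, swapped in for illustration; it is not the Markov chain estimate described in the second bullet. Function names and the toy series are assumptions.

```python
def runs_decluster(y: list[float], u: float, r: int) -> list[float]:
    """Cluster peaks: a cluster ends after r consecutive values at or below u."""
    peaks = []
    current_peak = None   # running maximum of the open cluster, if any
    gap = 0               # consecutive non-exceedances since the last exceedance
    for value in y:
        if value > u:
            if current_peak is None or value > current_peak:
                current_peak = value
            gap = 0
        elif current_peak is not None:
            gap += 1
            if gap >= r:
                peaks.append(current_peak)
                current_peak, gap = None, 0
    if current_peak is not None:   # close a cluster still open at the end
        peaks.append(current_peak)
    return peaks


def runs_extremal_index(y: list[float], u: float, r: int) -> float:
    """Runs estimator: number of clusters divided by number of exceedances."""
    n_exceed = sum(1 for v in y if v > u)
    if n_exceed == 0:
        raise ValueError("no exceedances above u")
    return len(runs_decluster(y, u, r)) / n_exceed


series = [0, 5, 6, 0, 7, 0, 0, 0, 4, 0]
peaks_r2 = runs_decluster(series, u=3, r=2)
peaks_r1 = runs_decluster(series, u=3, r=1)
theta_r2 = runs_extremal_index(series, u=3, r=2)
```

The two calls show the sensitivity to the run parameter: the same series yields two clusters with `r=2` and three with `r=1`.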
Covariate effects, including interventions (e.g., the binary before/after indicator $x$ for LPI), enter by allowing the scale parameter to vary: $\log \sigma(x) = \beta_0 + \beta_1 x$. Negative $\beta_1$ values (on the negated-PET scale) directly evidence increased PET, signifying fewer near-misses.
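One way to let an intervention indicator act on the scale is a log link, $\log \sigma_i = \beta_0 + \beta_1 x_i$. The log link, the coefficient names, and the toy data below are assumptions made for this sketch; the text does not confirm the exact form used in the cited work.

```python
import math


def gpd_nll_with_covariate(beta0: float, beta1: float, xi: float,
                           z: list[float], x: list[int]) -> float:
    """Negative GPD log-likelihood with log(sigma_i) = beta0 + beta1 * x_i."""
    nll = 0.0
    for z_i, x_i in zip(z, x):
        sigma = math.exp(beta0 + beta1 * x_i)   # positive by construction
        if abs(xi) < 1e-12:                     # exponential limit
            nll += math.log(sigma) + z_i / sigma
            continue
        t = 1.0 + xi * z_i / sigma
        if t <= 0:                              # z_i outside the support
            return math.inf
        nll += math.log(sigma) + (1.0 + 1.0 / xi) * math.log(t)
    return nll


def scale_ratio(beta1: float) -> float:
    """Multiplicative change in the tail scale when x goes from 0 to 1."""
    return math.exp(beta1)


# Toy excesses with a before (0) / after (1) indicator.
z = [1.0, 2.0]
x = [0, 1]
nll_no_effect = gpd_nll_with_covariate(0.0, 0.0, 0.0, z, x)
```

Under this parameterization a negative `beta1` gives `scale_ratio(beta1) < 1`, i.e. a shrunken tail of the negated PET after treatment.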
4. Parameter Estimation and Predictive Inference
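Parameter estimation proceeds via frequentist maximum likelihood or Bayesian MCMC. For $k$ exceedances $z_1, \dots, z_k$, the GPD log-likelihood,
$$\ell(\sigma, \xi) = -k \log \sigma - \left(1 + \frac{1}{\xi}\right) \sum_{i=1}^{k} \log\left(1 + \frac{\xi z_i}{\sigma}\right),$$
is maximized numerically for MLEs $(\hat{\sigma}, \hat{\xi})$. Bayesian inference employs priors $\pi(\sigma, \xi)$ and samples from the posterior predictive density
$$f(z^{*} \mid z_{1:k}) = \int f(z^{*} \mid \sigma, \xi)\, \pi(\sigma, \xi \mid z_{1:k})\, d\sigma\, d\xi,$$
typically via Monte Carlo.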
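The frequentist route can be illustrated with the standard GPD log-likelihood and a deliberately crude grid search. The grid stands in for a proper numerical optimizer purely to keep the sketch dependency-free, and the simulated data, seed, and grid ranges are assumptions.

```python
import math
import random


def gpd_loglik(z: list[float], sigma: float, xi: float) -> float:
    """GPD log-likelihood of the excesses z; -inf outside the parameter support."""
    if sigma <= 0:
        return -math.inf
    if abs(xi) < 1e-9:                      # exponential limit
        return -len(z) * math.log(sigma) - sum(z) / sigma
    total = 0.0
    for z_i in z:
        t = 1.0 + xi * z_i / sigma
        if t <= 0:
            return -math.inf
        total += math.log(t)
    return -len(z) * math.log(sigma) - (1.0 + 1.0 / xi) * total


def sample_gpd(n: int, sigma: float, xi: float, rng: random.Random) -> list[float]:
    """Inverse-CDF sampling of GPD excesses."""
    draws = []
    for _ in range(n):
        p = rng.random()
        if abs(xi) < 1e-9:
            draws.append(-sigma * math.log(1.0 - p))
        else:
            draws.append(sigma / xi * ((1.0 - p) ** (-xi) - 1.0))
    return draws


def grid_mle(z: list[float], sigmas: list[float], xis: list[float]) -> tuple[float, float]:
    """(sigma, xi) pair on the grid with the highest log-likelihood."""
    best = max((gpd_loglik(z, s, x), s, x) for s in sigmas for x in xis)
    return best[1], best[2]


rng = random.Random(0)
z = sample_gpd(500, sigma=1.0, xi=-0.2, rng=rng)
sigma_grid = [0.6 + 0.05 * i for i in range(17)]    # 0.6 .. 1.4
xi_grid = [-0.5 + 0.05 * j for j in range(13)]      # -0.5 .. 0.1
sigma_hat, xi_hat = grid_mle(z, sigma_grid, xi_grid)
```

In practice the same log-likelihood would be handed to a gradient-based or simplex optimizer, or used inside an MCMC sampler with priors on the two parameters.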
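Return levels (e.g., the “once-in-100-years” PET) satisfy
$$\Pr(Y > y_T) = \frac{1}{T}, \qquad y_T = u + \frac{\sigma}{\xi}\left[(T \zeta_u)^{\xi} - 1\right],$$
where $T$ is the return period measured in observations and $\zeta_u = \Pr(Y > u)$ is the threshold exceedance rate, with asymptotic validity guaranteed by standard second-order POT regularity conditions (Padoan et al., 6 Apr 2025). Both frequentist and Bayesian interval estimators attain correct coverage rates for future extreme PETs in large samples.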
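A return level under the POT model follows from solving $\zeta_u\,(1 + \xi (y - u)/\sigma)^{-1/\xi} = 1/m$ for $y$, where $\zeta_u$ is the exceedance rate of the threshold and $m$ the return period in observations. The sketch below uses that textbook closed form; the parameter values are made up for illustration.

```python
import math


def return_level(u: float, sigma: float, xi: float, zeta_u: float, m: float) -> float:
    """Level of Y = -PET exceeded on average once every m observations."""
    if m * zeta_u < 1.0:
        raise ValueError("return period too short: the level would fall below u")
    if abs(xi) < 1e-12:                       # exponential limit
        return u + sigma * math.log(m * zeta_u)
    return u + sigma / xi * ((m * zeta_u) ** xi - 1.0)


# Hypothetical fit: threshold PET = 2 s (u = -2), 10% of observations exceed it.
y_m = return_level(u=-2.0, sigma=0.5, xi=-0.5, zeta_u=0.1, m=1000)
pet_m = -y_m                                   # back on the PET scale, in seconds
upper_endpoint = -2.0 - 0.5 / -0.5             # u - sigma/xi, finite because xi < 0
```

With a negative shape the return level approaches, but never passes, the finite upper endpoint as the return period grows.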
5. Practical Implementation in Road Safety Studies
In applied PET analysis, the workflow incorporates:
- Automated video tracking of pedestrian–vehicle trajectories to harvest minute-resolved PET time series.
- Extraction of minimum PET values within fixed time windows (e.g., 10-minute bins), yielding dense conflict event sampling.
- Threshold selection via MRL and parameter-stability diagnostics.
- Model fitting: GPD (marginal-only), declustered POT, or first-order Markov chain POT under a Bayesian or frequentist framework.
- Goodness-of-fit assessment via posterior predictive checks, QQ/PP plots, and sensitivity analysis to the threshold $u$ or prior scales.
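The windowing step of this workflow, taking the minimum PET in each fixed time bin, can be sketched as follows. The event format `(timestamp_s, pet_s)` and the sample values are assumptions; real tracking output will have its own schema.

```python
def window_minima(events: list[tuple[float, float]],
                  window_s: float = 600.0) -> dict[int, float]:
    """Minimum PET per fixed window, keyed by window index (default 10 minutes)."""
    minima: dict[int, float] = {}
    for timestamp, pet in events:
        k = int(timestamp // window_s)        # index of the window containing the event
        if k not in minima or pet < minima[k]:
            minima[k] = pet
    return dict(sorted(minima.items()))


# Hypothetical interactions: (seconds since start of recording, PET in seconds).
events = [(30.0, 4.2), (410.0, 1.8), (650.0, 3.1), (1199.0, 2.7), (1300.0, 5.0)]
per_window = window_minima(events)
negated_series = [-pet for pet in per_window.values()]   # input to threshold selection
```

Windows with no recorded interaction are simply absent from the result, so any downstream time-series step needs to decide how to treat those gaps.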
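When PET exhibits AR(1)-type dependence, $Y_t = \phi Y_{t-1} + \varepsilon_t$, the residuals after AR(1) filtering are submitted to POT fitting, and predictive inference is recast as
$$\Pr(Y_{n+1} > y \mid Y_n) = \Pr(\varepsilon_{n+1} > y - \phi Y_n),$$
where $\varepsilon_t = Y_t - \phi Y_{t-1}$ are the filtered innovations (Padoan et al., 6 Apr 2025).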
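The filtering step can be sketched with an ordinary least-squares AR(1) fit whose residuals are then passed to the POT machinery. Including an intercept and using least squares are choices made for this sketch, not details confirmed by the text, and the function names are hypothetical.

```python
def fit_ar1(y: list[float]) -> tuple[float, float]:
    """Least-squares fit of y[t] = c + phi * y[t-1] + eps[t]; returns (c, phi)."""
    lagged, current = y[:-1], y[1:]
    n = len(lagged)
    mean_lag = sum(lagged) / n
    mean_cur = sum(current) / n
    sxx = sum((a - mean_lag) ** 2 for a in lagged)
    sxy = sum((a - mean_lag) * (b - mean_cur) for a, b in zip(lagged, current))
    phi = sxy / sxx
    return mean_cur - phi * mean_lag, phi


def ar1_residuals(y: list[float], c: float, phi: float) -> list[float]:
    """Innovations eps[t] = y[t] - c - phi * y[t-1], for t = 1 .. len(y) - 1."""
    return [y[t] - c - phi * y[t - 1] for t in range(1, len(y))]


# Noise-free geometric decay, so the fit should recover phi = 0.5 and c = 0.
y = [8.0, 4.0, 2.0, 1.0, 0.5]
c_hat, phi_hat = fit_ar1(y)
eps = ar1_residuals(y, c_hat, phi_hat)
```

A one-step-ahead exceedance probability for the original series is then the probability that the next innovation exceeds `level - c_hat - phi_hat * y[-1]`, evaluated under the GPD fitted to the residual tail.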
6. Inference on LPI Safety Intervention: Empirical Findings
The LPI (“leading pedestrian interval”) treatment, granting 5 s exclusive pedestrian right-of-way, was evaluated at 15 intersections using before/after PET conflict data. Markov chain EVT modeling showed:
- At 12/15 treated sites, posterior 95% credible intervals for the treatment coefficient $\beta_1$ (change in extreme scale post-treatment) lay wholly below zero, indicating statistically significant reductions in the tail scale of $-\mathrm{PET}$ (i.e., increased PET, fewer near-misses).
- Untreated control sites showed credible intervals for $\beta_1$ containing zero, evidencing no systematic shift.
- Representative results: Site 1 (treated) (CI []), (CI []), (); Site 2 (control) (CI []).
- The LPI intervention increased PET by approximately $0.5$ s at the extreme tail, i.e., half a second greater separation in severe near-miss events (Hewett et al., 2023).
A negative $\xi$ confirms a bounded worst-case severity for PET near-misses. Estimated extremal indices $\theta$ reflect moderate clustering, with clusters of roughly 1.1 extremes on average.
7. Extensions, Diagnostic Tools, and Future Research
Future directions include dynamic threshold selection based on traffic volume or time-of-day, spatial dependence modeling across intersections via max-stable processes, and refined estimation of extremal indices for nonstationary PET series. Threshold stability allows extrapolation: GPD parameters estimated at an intermediate threshold $u$ apply to a higher threshold $u' > u$, with the shape $\xi$ unchanged and the scale updating as $\sigma_{u'} = \sigma_u + \xi (u' - u)$.
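The threshold-stability property of the GPD says that raising the threshold from $u$ to $u'$ leaves the shape unchanged and shifts the scale to $\sigma_{u'} = \sigma_u + \xi (u' - u)$. A small sketch of that update follows, with illustrative parameter values and a hypothetical function name.

```python
def rescale_gpd(sigma_u: float, xi: float, u: float, u_new: float) -> float:
    """Scale at the higher threshold u_new, given the fit (sigma_u, xi) at u."""
    if u_new < u:
        raise ValueError("u_new must not be below the fitted threshold u")
    sigma_new = sigma_u + xi * (u_new - u)
    if sigma_new <= 0:
        raise ValueError("u_new lies at or beyond the upper endpoint of the tail")
    return sigma_new


# Hypothetical fit at PET = 2 s (u = -2), moved up to PET = 1 s (u' = -1).
sigma_at_1s = rescale_gpd(sigma_u=0.5, xi=-0.25, u=-2.0, u_new=-1.0)
```

The guard on a non-positive scale matters for bounded tails: with a negative shape, thresholds past the endpoint $u - \sigma_u/\xi$ have no exceedances to model.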
Practical recommendations mandate MRL-guided thresholding, robust fitting (MLE, Bayesian), reporting of return levels/periods for risk assessment, and filtering for temporal dependence before extreme modeling. Diagnostic plots and tests remain essential for model validation.
These EVT methodologies enable the proactive, evidence-based targeting of traffic-safety interventions using near-miss data rather than waiting for collision thresholds, providing rapid and statistically sound metrics for policy makers and researchers (Hewett et al., 2023, Padoan et al., 6 Apr 2025).