Eyelid Angle (ELA): 3D Biometric Analysis
- Eyelid Angle (ELA) is a quantitative 3D biometric metric characterizing eye openness by extracting geometric features from eyelid surfaces.
- ELA leverages fitted 3D plane normals to provide viewpoint invariance and lower variance compared to traditional 2D metrics like EAR.
- ELA supports robust blink detection and synthetic data augmentation via Blender, enhancing driver state monitoring and ADAS research.
The Eyelid Angle (ELA) is a quantitative biometric metric characterizing eye openness, defined via the 3D geometry of eyelid surfaces as extracted from facial landmarks. The ELA metric enables robust, viewpoint-invariant quantification of eyelid motion, supporting precise blink detection and drowsiness monitoring in driver state analysis. Unlike the widely used Eye Aspect Ratio (EAR), ELA leverages fitted planes from 3D landmark constellations on the upper and lower eyelids, providing lower variance under head pose changes. The ELA framework further facilitates synthetic data generation by animating avatars in Blender to follow prescribed ELA signals, augmenting empirical datasets for advanced driver assistance system (ADAS) research and development (Wolter et al., 24 Nov 2025).
1. Geometric Formulation and Computation
ELA computation begins by extracting 3D facial landmarks using MediaPipe Face Mesh V2, which provides 468 keypoints, including seven ordered along each eyelid. Coordinates are normalized per frame: image coordinates by width and height, with the depth coordinate rescaled to a comparable range.
To represent the upper and lower eyelids, the seven 3D points for each eyelid are collected into matrices P_u and P_l (7 × 3 each). Each set is zero-centered by subtracting its centroid. Singular value decomposition (SVD) of each centered matrix produces an orthonormal basis; the right-singular vector associated with the smallest singular value (the third column of V) corresponds to the normal of the best-fit eyelid plane. Normal orientation is regularized by a sign convention to ensure consistency across frames.
The raw eyelid angle for an eye is then computed as:

ELA_raw = arccos(n_u · n_l),

where n_u and n_l are the unit normals of the upper and lower eyelid planes.
When aggregating information from both eyes, the yaw angle ψ from the face’s 3D pose is used to weight the per-eye ELA values via a sigmoid visibility function w(ψ), yielding a combination of the form:

ELA = w(ψ) · ELA_left + (1 − w(ψ)) · ELA_right.

This ensures the eye more directly facing the camera contributes more to the aggregate measure.
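The plane-fitting and angle computation above can be sketched in NumPy. This is a minimal illustration, not the authors' released code; the sign convention (non-negative z-component) and the sigmoid gain `k` are assumptions for the sketch.

```python
import numpy as np

def eyelid_plane_normal(points):
    """Fit a plane to a (7, 3) array of eyelid landmarks; return its unit normal.

    The normal is the right-singular vector associated with the smallest
    singular value of the zero-centered point set.
    """
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[-1]  # direction of least variance = plane normal
    # Assumed sign convention: non-negative z-component for frame-to-frame consistency.
    if normal[2] < 0:
        normal = -normal
    return normal

def raw_ela_degrees(upper_pts, lower_pts):
    """Angle between the upper- and lower-eyelid plane normals, in degrees."""
    n_u = eyelid_plane_normal(upper_pts)
    n_l = eyelid_plane_normal(lower_pts)
    cos_angle = np.clip(np.dot(n_u, n_l), -1.0, 1.0)
    return np.degrees(np.arccos(cos_angle))

def aggregate_ela(ela_left, ela_right, yaw_rad, k=5.0):
    """Yaw-weighted combination of per-eye ELA values (k is illustrative)."""
    w = 1.0 / (1.0 + np.exp(-k * yaw_rad))  # sigmoid visibility weight
    return w * ela_left + (1.0 - w) * ela_right
```

For a frontal face (yaw ≈ 0), the weight is 0.5 and both eyes contribute equally.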
2. ELA versus Eye Aspect Ratio (EAR)
The Eye Aspect Ratio (EAR) defines openness as a 2D distance ratio:

EAR = (||p2 − p6|| + ||p3 − p5||) / (2 ||p1 − p4||),

where p1, …, p6 are landmarks localized around the eye (p1 and p4 at the corners, the remaining pairs along the upper and lower lids). The EAR is sensitive to perspective and diminishes in reliability under head rotation due to foreshortening.
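The standard EAR computation is short enough to show directly; this sketch assumes the conventional six-landmark ordering (p1/p4 at the eye corners, (p2, p6) and (p3, p5) as vertical pairs).

```python
import numpy as np

def eye_aspect_ratio(p):
    """2D EAR from six eye landmarks, given as an array of shape (6, 2).

    Ordering assumption: p[0]/p[3] are the horizontal corners; (p[1], p[5])
    and (p[2], p[4]) are the upper/lower vertical landmark pairs.
    """
    vert1 = np.linalg.norm(p[1] - p[5])  # ||p2 - p6||
    vert2 = np.linalg.norm(p[2] - p[4])  # ||p3 - p5||
    horiz = np.linalg.norm(p[0] - p[3])  # ||p1 - p4||
    return (vert1 + vert2) / (2.0 * horiz)
```

Because all distances are measured in the image plane, foreshortening under head rotation changes the ratio even when the true eye openness is constant.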
ELA, in contrast, is derived from the angular relationship of fitted 3D planes, rendering it invariant to rigid facial rotations. In a synthetic evaluation with the eyelid held at a fixed ELA and the camera swept through large vertical or horizontal angles, raw ELA showed a mean absolute error (MAE) of 2.8° (vertical) and 3.3° (horizontal), while EAR varied by more than 10–15% under the same transformations. Visualization depicts ELA as having a near-flat response across viewpoint shifts, while EAR fluctuates substantially (see Fig. “ELAvsEAR” in (Wolter et al., 24 Nov 2025)).
3. Blink Detection and Temporal Analysis
The ELA-driven blink detection framework comprises several signal processing and statistical stages:
- Post-processing: Raw ELA time series are smoothed with a 1D Gaussian kernel.
- Edge Detection: The temporal derivative of the smoothed signal is computed, and k-means clustering of its local extrema separates “falling” (negative slope, eyelid closing) from “rising” (positive slope, eyelid reopening) transitions.
- Blink Windowing: The relevant blink interval is defined using local maxima/minima before and after the identified extrema; tangents at these points intersect with the minima to delimit the closing, closed, and reopening durations.
- Feature Extraction: Table 1 in (Wolter et al., 24 Nov 2025) details the computed blink features:

| Temporal Feature | Description |
|---|---|
| Closing duration | Time to close eyelid |
| Closed duration | Time eyelid remains closed |
| Reopening duration | Time to reopen eyelid |
| Amplitude | Openness delta |
| A/V ratio | Amplitude-to-velocity ratio |
| Normalized area | Area below reopening curve |
| PERCLOS | Percent ELA between blinks |
| Inter-blink interval | Time between consecutive blinks |
Blink detection employs rules to avoid merged/masked events, running analyses in 90 s windows updated every 60 s.
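The smoothing, differentiation, and k-means edge-separation stages can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation; the smoothing sigma, the magnitude filter on extrema, and the clustering setup are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import argrelextrema
from sklearn.cluster import KMeans

def blink_transitions(ela, sigma=2.0):
    """Split derivative extrema of a smoothed ELA signal into falling
    (eyelid closing) and rising (eyelid reopening) transitions via k-means.
    """
    smoothed = gaussian_filter1d(np.asarray(ela, dtype=float), sigma=sigma)
    deriv = np.gradient(smoothed)
    # Local extrema of the derivative mark the steepest transitions.
    minima = argrelextrema(deriv, np.less)[0]     # candidate falling edges
    maxima = argrelextrema(deriv, np.greater)[0]  # candidate rising edges
    idx = np.concatenate([minima, maxima])
    slopes = deriv[idx]
    # Assumed prominence filter: drop numerical-noise extrema near zero slope.
    keep = np.abs(slopes) > 0.1 * np.abs(deriv).max()
    idx, slopes = idx[keep], slopes[keep]
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
        slopes.reshape(-1, 1))
    # The cluster with the lower mean slope is the "falling" group.
    falling_label = np.argmin([slopes[labels == c].mean() for c in (0, 1)])
    falling = np.sort(idx[labels == falling_label])
    rising = np.sort(idx[labels != falling_label])
    return falling, rising
```

On a synthetic trace with two dips in ELA, each blink yields one falling and one rising transition, ordered falling-before-rising.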
For drowsiness inference, a 10-NN classifier is trained on the means and standard deviations of blink features, with PCA (5 components) applied as a preprocessing step for predicting alert versus drowsy states. ELA-derived features replicate classic findings that drowsiness is marked by increased closing and closed durations and slower reopening [(Wolter et al., 24 Nov 2025), Fig. 5].
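A PCA-plus-10-NN pipeline of this shape is straightforward to express in scikit-learn. The feature matrix below is synthetic toy data (the feature count, class separation, and scaling step are assumptions for illustration, not the paper's dataset).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical feature matrix: one row per analysis window, columns are the
# means and standard deviations of blink features (durations, amplitude, ...).
rng = np.random.default_rng(0)
X_alert = rng.normal(loc=0.0, scale=1.0, size=(40, 16))
X_drowsy = rng.normal(loc=1.5, scale=1.0, size=(40, 16))  # shifted cluster
X = np.vstack([X_alert, X_drowsy])
y = np.array([0] * 40 + [1] * 40)  # 0 = alert, 1 = drowsy

# 5-component PCA feeding a 10-nearest-neighbour classifier, as described.
clf = make_pipeline(StandardScaler(), PCA(n_components=5),
                    KNeighborsClassifier(n_neighbors=10))
clf.fit(X, y)
print(clf.score(X, y))  # training accuracy on the toy data
```

In practice the classifier would be evaluated with held-out windows or subjects rather than training accuracy.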
4. Synthetic Data Augmentation via Blender
ELA’s geometric definition enables automated eye animation for data augmentation. The methodology consists of:
- Blink Signal Synthesis: Blink durations and inter-blink intervals for the alert state are drawn from empirical distributions (Caffier et al., 2003); drowsy blinks are longer.
- Avatar Animation: A rigged Blender avatar (controlled by shape keys for eyelids) follows the constructed ELA waveform, interpolated using splines for physically plausible motion.
- Camera and Lighting Randomization: The virtual camera’s yaw and pitch are jittered according to normal distributions, and field of view and lighting are varied as well.
- Noise Augmentation: Gaussian noise is added to the ELA trajectory to emulate landmark jitter.
- Benchmarking: Constant ELA ground truths (0–70°) are used to sweep orientation and evaluate geometric error, revealing absolute ELA mean errors of 4–7° for large opening angles and up to 18.3° at 0° (fully closed).
This pipeline provides scalable, controlled datasets for training and benchmarking drowsiness classifiers under varied conditions.
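The signal-synthesis step (keyframes, spline interpolation, noise) can be sketched as follows. The duration defaults, noise level, and clipping are illustrative assumptions, not the paper's empirical distributions.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def synthesize_blink(t_close=0.08, t_closed=0.04, t_reopen=0.12,
                     open_angle=40.0, fps=50, noise_std=0.5, rng=None):
    """Sketch of one synthetic blink trajectory in ELA degrees.

    Keyframes (open -> closed -> open) are spline-interpolated for smooth,
    physically plausible motion; Gaussian noise emulates landmark jitter.
    """
    rng = rng or np.random.default_rng()
    key_t = np.array([0.0, t_close, t_close + t_closed,
                      t_close + t_closed + t_reopen])
    key_ela = np.array([open_angle, 0.0, 0.0, open_angle])
    spline = CubicSpline(key_t, key_ela)
    t = np.arange(0.0, key_t[-1], 1.0 / fps)
    # Clip spline overshoot to the physical range before adding jitter.
    ela = np.clip(spline(t), 0.0, open_angle)
    ela = ela + rng.normal(0.0, noise_std, size=t.shape)
    return t, ela
```

Sampling the same waveform at different frame rates (e.g. 10 vs. 50 Hz) reproduces the cross-rate conditions used in the benchmarking experiments.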
5. Experimental Evaluation in Driver Monitoring
Key empirical results on public datasets:
- ELA versus EAR Stability: Over view sweeps, EAR variance approaches 30%, while ELA’s mean absolute error remains within a few degrees.
- Accuracy by Angle: At set ELA ground truths (0–70°), the MAE is highest for closed eyes (18.3° at 0°) and falls to 4–7° for open eyes (50–70°).
- Blink Detection on DMD: Across 16 videos (5441 ± 183 frames each; 1578 labeled blinks), ELA-based detection achieved an accuracy (DA) of 89.4%.
- Drowsiness Classification: On UTA-RLDD, multiclass (alert/low/drowsy) video-level accuracy was 52.5% (baseline with all features: 65.2%), with binary (alert vs. drowsy) accuracy at 80.4%.
- Synthetic Data Impact: Training/testing on matched FPS (10, 30, 50 Hz) resulted in AC1 accuracies of 77%, 98%, and 92%, respectively; accuracy dropped to 69% or 46% on cross-rate data. Blink detection on synthetic data showed 51% DA at 10 Hz, improving to 95% at 30/50 Hz.
These results demonstrate ELA’s reproducibility, viewpoint invariance, and discriminatory power for eye openness and blink analytics. The metric also enables the creation and validation of large-scale synthetic datasets with precise parametric control for driver state monitoring.
6. Implementation Considerations and Availability
ELA code, Blender scenes, and dataset-generation scripts will be released as open-source resources contingent on paper acceptance, fostering reproducible research in driver monitoring and ADAS evaluation (Wolter et al., 24 Nov 2025). The inclusion of procedural camera and lighting diversifications, combined with physiologically plausible blink kinematics, supports generalization across real-world deployment contexts.
7. Significance and Research Directions
ELA addresses the limitations of 2D ocular metrics under variable camera viewpoints, supplying a stable, geometric foundation for both statistical learning and physically grounded simulation. Its integration with Blender facilitates targeted data augmentation, contributing to more robust and diverse datasets in driver monitoring research. A plausible implication is that ELA may underpin further advancements in explainable computer vision for human state analysis, especially where 3D information from monocular video can be reliably estimated or simulated. The framework also underscores the importance of aligning biometric signals and synthetic augmentation strategies with task-specific invariances, particularly for safety-critical ADAS deployments (Wolter et al., 24 Nov 2025).