
Gaze-Contingent Display Technologies

Updated 24 January 2026
  • Gaze-contingent displays are interactive visual systems that adapt content in real time based on the viewer’s eye position and retinal acuity.
  • They dynamically optimize rendering resolution and power consumption using predictive saccade detection and perceptually driven algorithms.
  • Applications range from VR/AR to aviation and holography, addressing challenges like geometric distortions and latency with advanced calibration techniques.

Gaze-contingent displays are interactive visual systems whose output adapts dynamically according to the real-time orientation and position of the viewer’s gaze. By leveraging human ocular physiology—specifically, the spatial and temporal non-uniformity in visual acuity and perception—these displays optimize graphical content, minimize computational load, and enable advanced user interaction. Recent advances in event-based eye tracking, gaze-aware rendering pipelines, and perceptually driven algorithms have broadened the scope and performance of gaze-contingent technologies across virtual reality, augmented reality, holography, transparent displays, and demanding avionics environments.

1. Principles of Gaze-Contingent Display Architectures

Gaze-contingent displays use measurements from eye-tracking hardware to modulate the spatial, chromatic, and temporal characteristics of the displayed imagery. Central to these systems is the notion of a “point of gaze,” typically mapped to a user’s foveal fixation, with peripheral regions rendered at reduced fidelity in accordance with the human visual system’s (HVS) acuity falloff and perceptual thresholds.

State-of-the-art gaze tracking employs high-speed sensors, including asynchronous event cameras (e.g., the DAVIS346), which deliver update rates beyond 10 kHz with microsecond-level latency and a dynamic range of ≈130 dB (Angelopoulos et al., 2020). Model-based 2D pupil fitting combined with polynomial regressors (e.g., 2nd–5th order bivariate polynomials) maps parametric pupil features directly to screen coordinates, yielding typical gaze accuracy of 0.45° at 45° FoV and 1.75° at 98° FoV.
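A minimal sketch of such a polynomial gaze mapping, fit by least squares on synthetic calibration data (the 2nd-order basis and the synthetic mapping here are illustrative, not taken from the cited system):

```python
import numpy as np

def poly2_features(px, py):
    """Second-order bivariate polynomial basis in pupil coordinates."""
    return np.stack([np.ones_like(px), px, py, px * py, px**2, py**2], axis=1)

def fit_gaze_mapping(pupil_xy, screen_xy):
    """Least-squares fit of polynomial coefficients (one column per screen axis)."""
    A = poly2_features(pupil_xy[:, 0], pupil_xy[:, 1])
    coeffs, *_ = np.linalg.lstsq(A, screen_xy, rcond=None)
    return coeffs

def predict_gaze(coeffs, pupil_xy):
    return poly2_features(pupil_xy[:, 0], pupil_xy[:, 1]) @ coeffs

# Synthetic calibration: a known quadratic pupil-to-screen mapping.
rng = np.random.default_rng(0)
pupil = rng.uniform(-1, 1, size=(50, 2))
screen = np.stack([0.5 + 2.0 * pupil[:, 0] + 0.3 * pupil[:, 0]**2,
                   -0.2 + 1.5 * pupil[:, 1] + 0.1 * pupil[:, 0] * pupil[:, 1]],
                  axis=1)
C = fit_gaze_mapping(pupil, screen)
err = np.abs(predict_gaze(C, pupil) - screen).max()
```

Since the synthetic mapping lies in the span of the 2nd-order basis, the fit recovers it to numerical precision; real calibration data would leave residual error.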

Key functional modules include:

  • Rapid fixation and saccade detection via velocity and dispersion-based algorithms (0708.3505, Arabadzhiyska et al., 2022).
  • Foveated and predictive rendering: real-time reallocation of spatial detail to the fovea, with temporal prediction of saccade landing points to counteract system latency (Arabadzhiyska et al., 2022).
  • Gaze-contingent color and intensity control, guided by psychophysically derived discrimination models and hardware-based power optimizations (Duinkharjav et al., 2022).
  • Ray-based neural transformations for distortion correction in wide-FoV near-eye displays (Hiroi et al., 2022).
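The velocity-based classification in the first bullet can be sketched as a minimal I-VT classifier (the 30 deg/s threshold and the synthetic trace are illustrative choices, not values from the cited papers):

```python
import numpy as np

def classify_ivt(gaze_deg, dt, velocity_threshold=30.0):
    """Label each sample 'fixation' or 'saccade' by angular velocity (I-VT).

    gaze_deg: (N, 2) gaze positions in degrees; dt: sample interval in seconds.
    velocity_threshold: deg/s; 30 deg/s is a commonly used illustrative value.
    """
    vel = np.linalg.norm(np.diff(gaze_deg, axis=0), axis=1) / dt
    vel = np.concatenate([[0.0], vel])  # pad so labels align with samples
    return np.where(vel > velocity_threshold, "saccade", "fixation")

# 1 kHz trace: steady fixation, one fast 10-degree jump, then fixation again.
t = np.arange(0.0, 0.1, 0.001)
x = np.where(t < 0.05, 0.0, 10.0)
gaze = np.stack([x, np.zeros_like(x)], axis=1)
labels = classify_ivt(gaze, dt=0.001)
```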

2. Perceptually-Driven Rendering and Power Optimization

Gaze-contingent methods exploit detailed models of foveal and peripheral sensitivity. Psychophysical measurements establish thresholds for color and spatial discrimination, which grow with retinal eccentricity—threshold ellipse area in DKL color space increases 4–5× from 10° to 35° eccentricity (Duinkharjav et al., 2022). Rendering algorithms use these JND boundaries to constrain per-pixel shifts, ensuring perceptual invisibility while minimizing display power.

The display power model for OLEDs is linear in sRGB color channels:

P(\mathbf{X}_s) = \mathbf{P}_s \cdot \mathbf{X}_s + P_0

where \mathbf{P}_s is a channel-wise slope vector derived by regression, and P_0 is the static (black-level) power (Duinkharjav et al., 2022). A constrained optimization is solved per pixel:

\text{minimize}\ \mathbf{P}_s \cdot \mathbf{X}_s(\mathbf{x}) \quad \text{s.t.}\ \epsilon(\mathbf{x};\mathbf{t},\mathbf{b},\mathbf{a}) = 0

where \epsilon defines the JND ellipse constraint and \mathbf{x} are the pixel's DKL coordinates.

Artifacts introduced by naive power saving (e.g., uniform luminance scaling) can be substantially mitigated: gaze-contingent chromaticity modulation achieved up to 24% power reduction with only 16.7% perceptible artifacts (vs. 63.5% for naive scaling) (Duinkharjav et al., 2022).
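The linear OLED power model can be sketched as follows; the per-channel slope values and the test image are illustrative placeholders, not measured display parameters:

```python
import numpy as np

# Illustrative per-channel power slopes (blue OLED subpixels typically cost
# the most energy) and static power; real values come from regression on
# display power measurements.
P_slope = np.array([0.10, 0.15, 0.35])  # R, G, B
P_static = 0.50

def display_power(image_srgb):
    """Linear OLED power model: P = P_s · (mean channel intensity) + P_0."""
    return float(P_slope @ image_srgb.reshape(-1, 3).mean(axis=0) + P_static)

img = np.full((4, 4, 3), [0.8, 0.6, 0.9])   # bright, blue-heavy test image
p_full = display_power(img)
p_scaled = display_power(0.8 * img)          # naive uniform luminance scaling
saving = 1.0 - (p_scaled - P_static) / (p_full - P_static)
```

With these illustrative slopes, uniform scaling saves 20% of the dynamic power; the cited gaze-contingent approach instead shifts chromaticity within the per-pixel JND ellipse so the change is imperceptible.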

Saccade-contingent rendering leverages the post-saccadic dip in foveal acuity (≈10 cycles/deg at landing, rising to ≈27 cpd over 500 ms), temporally adapting resolution and bandwidth for significant compute and power savings, up to 80% at 90 ppd in modern HMDs (Kwak et al., 2024).
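One way to turn this acuity-recovery curve into a render-resolution schedule is sketched below; only the ≈10 cpd and ≈27 cpd endpoints come from the cited measurements, while the exponential recovery shape and time constant are assumptions:

```python
import math

def post_saccadic_acuity_cpd(t_ms, dip=10.0, peak=27.0, tau_ms=150.0):
    """Foveal acuity (cycles/deg) recovering after saccade landing.

    Exponential recovery from ~10 cpd at landing toward ~27 cpd; the
    exponential form and tau are assumptions, only the endpoints follow
    the cited measurements.
    """
    return peak - (peak - dip) * math.exp(-t_ms / tau_ms)

def render_ppd(t_ms, nyquist_factor=2.0):
    """Pixels/deg needed to match current acuity (Nyquist: 2 px per cycle)."""
    return nyquist_factor * post_saccadic_acuity_cpd(t_ms)
```

At landing only ≈20 ppd is needed, rising toward ≈54 ppd as acuity recovers, which is the window a renderer can exploit for compute savings.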

3. Calibration and Correction of Display Distortions

Wide-FoV near-eye displays suffer from spatially and gaze-dependent geometric distortions. Explicit geometric models struggle with complex, nonlinear mapping under gaze variation. Neural Distortion Fields (NDF) use fully connected deep networks to learn a gaze-contingent mapping from spatial position and gaze direction to perceived pixel coordinates (Hiroi et al., 2022). Querying along 3D gaze rays and volumetric integration yields display coordinate corrections with median errors as low as ≈3.23 px (5.8 arcmin) using only 8–125 training viewpoints.

NDFs outperform conventional polynomial fitting, particularly near the center of the FoV and at off-center gaze positions, enabling distortion-free images that minimize VR sickness and maintain perceptual realism (Hiroi et al., 2022).
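Structurally, an NDF query reduces to evaluating a small MLP on a spatial position and a gaze direction; the toy, untrained sketch below shows only that input/output interface, with layer sizes that are illustrative rather than taken from the cited work:

```python
import numpy as np

rng = np.random.default_rng(1)

class TinyNDF:
    """Toy stand-in for a Neural Distortion Field: maps (screen position,
    gaze direction) to a 2D pixel displacement. Weights are random and
    untrained; a real NDF is fit to calibration viewpoints."""

    def __init__(self, hidden=64):
        self.W1 = rng.normal(0.0, 0.1, (5, hidden))  # inputs: x, y, gx, gy, gz
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, 2))  # outputs: dx, dy
        self.b2 = np.zeros(2)

    def __call__(self, pos_xy, gaze_dir):
        inp = np.concatenate([pos_xy, gaze_dir], axis=-1)
        h = np.maximum(0.0, inp @ self.W1 + self.b1)  # ReLU hidden layer
        return h @ self.W2 + self.b2

ndf = TinyNDF()
disp = ndf(np.array([[0.2, -0.1]]), np.array([[0.0, 0.0, 1.0]]))
```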

4. Depth, Motion Parallax, and Physical–Virtual Alignment

Gaze-contingent rendering improves perceptual realism and depth cues in VR and AR by correcting for “ocular parallax”—the depth-dependent retinal shift arising because the center of rotation and center of projection of the eye are not coincident (Konrad et al., 2019). For an object at distance D from the center of rotation C, with the eye rotated by \theta, the retinal shift is approximately:

\Delta x(\theta) \approx \frac{NC}{D}\,\theta

where NC is the nodal-point-to-rotation-center offset (\approx 7.7 mm) (Konrad et al., 2019).
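Plugging representative numbers into this small-angle formula shows why the effect matters mainly at near distances (the object distances chosen here are illustrative):

```python
NC_M = 0.0077  # nodal-point-to-rotation-center offset in meters (≈7.7 mm)

def parallax_shift_deg(distance_m, rotation_deg):
    """Angular retinal shift Δx ≈ (NC / D) · θ for an eye rotation θ."""
    return (NC_M / distance_m) * rotation_deg

near = parallax_shift_deg(0.25, 10.0)  # object at arm's length (25 cm)
far = parallax_shift_deg(10.0, 10.0)   # distant object (10 m)
```

A 10° eye rotation shifts a 25 cm object by ≈0.31° of visual angle but a 10 m object by only ≈0.008°, so the cue is strongest within arm's reach.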

Perceptual experiments demonstrate that correct ocular parallax rendering distinctly improves ordinal depth discrimination (chance-correct performance rises from ≈50% to up to ≈76% with parallax enabled) and perceptual realism (≈77% preference for parallax over conventional rendering) (Konrad et al., 2019).

Nonetheless, many static perspective distortions diminish as optical display distance increases (≥1 m in typical HMDs); the “gaze-contingent disparity” artifact persists in AR whenever virtual and real objects must align within arm’s reach, producing misregistration of up to ≈0.6 cm if uncompensated (Linton, 2019).

5. Gaze-Contingent Interaction, Selection, and Control

Gaze-contingent selection protocols reduce manual effort and cognitive load in environments where the hands are otherwise occupied, such as military aviation (Murthy et al., 2020). Event-triggered pointer mapping using neural networks can achieve selection times under 2 s for 2–3° targets on moving platforms, despite ±1–5 G accelerations and vibration. Adaptive nearest-neighbor selection halves movement time compared to non-adaptive dwell/click, and multimodal fusion (head plus eye gaze) attains ≈32% lower selection latency and ≈58% higher throughput than joystick-based systems (Murthy et al., 2020).

Failure modes include limited vertical FoV, high external illumination causing IR saturation, and loss of tracking under off-axis gaze. Mitigation encompasses hardware with ≥60° vertical FoV, dynamic illumination adaptation, and algorithmic fallback to visible-light webcams with ML-based gaze estimation (Murthy et al., 2020).

Fitts’ law throughput serves as a benchmark, with head-mounted gaze systems exceeding 0.60 bits/s in high-fidelity environments (Murthy et al., 2020).
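Throughput here is typically computed from the Shannon formulation of Fitts' index of difficulty; a minimal sketch with illustrative task parameters (not numbers from the cited study):

```python
import math

def fitts_throughput(distance_deg, width_deg, movement_time_s):
    """Throughput in bits/s via the Shannon form of Fitts' law:
    ID = log2(D/W + 1), throughput = ID / movement time."""
    index_of_difficulty = math.log2(distance_deg / width_deg + 1.0)
    return index_of_difficulty / movement_time_s

# Illustrative: a 2.5-degree target 15 degrees away, selected in 2 seconds.
tp = fitts_throughput(15.0, 2.5, 2.0)
```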

6. Event-Based and Predictive Saccade Processing

Event-based gaze tracking enables ultra-low-latency foveated and predictive rendering by asynchronously processing scene changes only where and when contrast shifts occur (Angelopoulos et al., 2020). Real-time per-event pupil model updates and incremental polynomial regressors yield end-to-end latencies <2–4 ms, order-of-magnitude lower power consumption than conventional sensors, and robust operation amid saccades and blinks.

Predictive algorithms use velocity symmetry, parametric modeling, or neural approaches to forecast saccade landing points 10–15 ms into a saccade, reducing “pop” artifacts under display latency budgets ≤50–70 ms (Arabadzhiyska et al., 2022). Efficient correction for saccade orientation in 3D and smooth-pursuit interactions is achieved through low-dimensional temporal shearing transforms rather than retraining large models (Arabadzhiyska et al., 2022).
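A simplified 1D illustration of the velocity-symmetry idea (the sinusoidal profile and the mirroring heuristic are a toy model, not the cited predictor):

```python
import numpy as np

def predict_landing(positions, velocities):
    """Predict the saccade landing point from velocity-profile symmetry:
    once peak velocity is reached, assume the deceleration half mirrors
    the acceleration half, so total amplitude ≈ 2 × displacement so far."""
    peak = int(np.argmax(velocities))
    return positions[0] + 2.0 * (positions[peak] - positions[0])

# Synthetic symmetric saccade: 10-degree amplitude, sinusoidal velocity profile.
t = np.linspace(0.0, 1.0, 101)
pos = 10.0 * (1.0 - np.cos(np.pi * t)) / 2.0
vel = np.gradient(pos, t)
predicted = predict_landing(pos, vel)
```

On this idealized symmetric profile the prediction recovers the 10° amplitude exactly; real saccades are only approximately symmetric, which is why the cited work adds parametric and learned corrections.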

7. Applications, Limitations, and Design Implications

Gaze-contingent displays have found application in:

  • Foveated, predictive, and power-optimized rendering for VR/AR head-mounted displays (Duinkharjav et al., 2022, Kwak et al., 2024).
  • Distortion correction for wide-FoV near-eye and holographic displays (Hiroi et al., 2022).
  • Hands-free selection and control in demanding avionics environments (Murthy et al., 2020).

Design guidelines established include:

  • Eye-tracking hardware with ≥60° vertical FoV and dynamic illumination adaptation for robustness (Murthy et al., 2020).
  • End-to-end latency budgets of ≤50–70 ms, with saccade-landing prediction used to mask residual delay (Arabadzhiyska et al., 2022).
  • Psychophysically derived (JND-based) constraints on any gaze-contingent modulation so that adaptation remains imperceptible (Duinkharjav et al., 2022).

Limitations persist in ground-truth accuracy measurement, real-world robustness under environmental variability, and computational resource constraints for high-resolution or embedded implementations (Angelopoulos et al., 2020, Murthy et al., 2020, Hiroi et al., 2022).


Gaze-contingent displays represent an integration of high-speed eye tracking, real-time perceptual modeling, predictive computation, and adaptive rendering. The field continues to evolve with advances in sensor hardware, neural calibration, perceptual optimization, and broader application in immersive interfaces, AR/VR, and mission-critical display systems.
