Eyes Can't Always Tell: Fusing Eye Tracking and User Priors for User Modeling under AI Advice Conditions

Published 2 Apr 2026 in cs.HC | (2604.01741v1)

Abstract: Modeling users' cognitive states (e.g., cognitive load and decision confidence) is essential for building adaptive AI in high-stakes decision-making. While eye tracking provides non-invasive behavioral signals correlated with cognitive effort, prior work has not systematically examined how AI assistance contexts, specifically varying advice reliability and user heterogeneity, can alter the mapping between gaze signals and cognitive states. We conducted a within-subject lab eye-tracking study (N=54) on factual verification tasks under three conditions: No-AI, Correct-AI advice, and Incorrect-AI advice. We analyze condition-dependent changes in self-reports and eye-tracking patterns and evaluate the robustness of eye-tracking-based user modeling. Results show that AI advice increases decision confidence compared to No-AI, while Correct-AI is associated with lower perceived cognitive load and more efficient gaze behavior. Crucially, predictive modeling is context-sensitive: the relationship between eye-tracking signals and cognitive states shifts across AI conditions. Finally, fusing eye-tracking features with user priors (demographics, AI literacy/experience, and propensity to trust technology) improves cross-participant generalization. These findings support condition-aware and personalized user modeling for cognitively aligned adaptive AI systems.


Authors (8)

Summary

The paper introduces a fusion approach that combines eye tracking with user priors to enhance user modeling in AI-assisted decision making.
It demonstrates that integrating dynamic gaze metrics with stable user traits significantly improves prediction robustness across different AI advice conditions.
Findings indicate that user models combining gaze behavior with user priors infer cognitive load and decision confidence more reliably, particularly when AI advice is unreliable.

Condition-Aware Fusion of Eye Tracking and User Priors in AI-Assisted Decision Making

Introduction

"Eyes Can't Always Tell: Fusing Eye Tracking and User Priors for User Modeling under AI Advice Conditions" (2604.01741) primarily addresses the limitations of gaze-based inference of user cognitive states within the context of AI-assisted factual verification tasks. The authors systematically investigate how both the presence and correctness of AI advice modulate the relationship between eye-tracking signals and self-reported cognitive states—namely, cognitive load and decision confidence—as well as objective decision accuracy. Crucially, the paper demonstrates the necessity of integrating stable user priors (demographics, AI literacy, and propensity to trust technology) with dynamic interaction signals to achieve robust cross-participant generalization, especially under variable AI reliability.

Figure 1: The three-step experimental workflow collects eye-tracking and self-report data, extracts trial-level behavioral/physiological features and participant priors, and evaluates prediction models conditioned on AI advice.

Empirical Study Design and Eye Tracking Analysis

The study employs a within-subject lab protocol on factual verification, manipulating AI assistance across three conditions: baseline (No-AI), Correct-AI advice, and Incorrect-AI advice. Each participant completes 12 trials, encountering counterbalanced sequences of true/false claims with or without AI assistance. Eye movements are recorded at 60 Hz, with areas of interest (AOIs) defined for the evidence/context, AI advice, and user rating panels. Self-reported cognitive load and confidence (Likert scale), together with manipulation checks, provide ground-truth labels for the subsequent modeling. Demographic information, AI literacy/experience, and trust propensity are collected via a pre-study survey.
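As a rough illustration of how AOI-level gaze features such as fixation count, dwell time, and time to first fixation can be derived from a fixation stream, the sketch below assigns fixations to rectangular AOIs. The `Fixation` record, the `AOIS` pixel layout, and the function names are hypothetical; the paper's actual feature pipeline is not described in this summary.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Fixation:
    onset_ms: float     # time since trial start
    duration_ms: float  # fixation duration
    x: float            # horizontal screen position in pixels
    y: float            # vertical screen position in pixels


# Hypothetical AOI layout: name -> (left, top, right, bottom) in pixels.
AOIS: Dict[str, Tuple[float, float, float, float]] = {
    "evidence": (0, 0, 800, 600),
    "advice": (800, 0, 1280, 300),
    "rating": (800, 300, 1280, 600),
}


def aoi_of(fix: Fixation) -> Optional[str]:
    """Return the name of the AOI containing the fixation, or None."""
    for name, (left, top, right, bottom) in AOIS.items():
        if left <= fix.x < right and top <= fix.y < bottom:
            return name
    return None


def aoi_metrics(fixations: List[Fixation]) -> Dict[str, dict]:
    """Per-AOI fixation count, total dwell time, and time to first fixation."""
    metrics = {
        name: {"fixation_count": 0, "dwell_ms": 0.0, "ttff_ms": None}
        for name in AOIS
    }
    for fix in sorted(fixations, key=lambda f: f.onset_ms):
        name = aoi_of(fix)
        if name is None:
            continue  # fixation landed outside every AOI
        entry = metrics[name]
        entry["fixation_count"] += 1
        entry["dwell_ms"] += fix.duration_ms
        if entry["ttff_ms"] is None:
            entry["ttff_ms"] = fix.onset_ms  # first fixation in this AOI
    return metrics


# Toy trial: two fixations on evidence, one on advice, one off-screen.
trial = [
    Fixation(0, 200, 100, 100),
    Fixation(250, 300, 900, 100),
    Fixation(600, 150, 120, 400),
    Fixation(800, 100, 2000, 2000),
]
result = aoi_metrics(trial)
```

An AOI that is never fixated keeps `ttff_ms` as `None`, so downstream code has to decide how to encode that case (for example, with the trial duration) before feeding it to a classifier.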

Figure 2: Gaze heatmaps show denser visual attention in No-AI and Incorrect-AI trials, with more distributed fixations across evidence and response panels.

Condition-Sensitive Effects on Cognitive States and Gaze Patterns

Mixed-effects modeling and repeated-measures ANOVA reveal statistically significant modulation of self-reported cognitive states by AI advice, but not of decision accuracy:

Cognitive Load: Correct-AI advice yields the lowest cognitive load (mean = 3.18) versus No-AI (3.56, $p=0.010$) and Incorrect-AI (3.38, $p=0.040$).
Decision Confidence: Both AI conditions increase confidence over No-AI, with Correct-AI showing the highest values (No-AI = 5.22, Correct-AI = 5.93, $p<0.001$).
Decision Accuracy: No significant main effect, indicating cognitive states can shift independently of objective correctness.

Gaze metrics (fixations, saccades, pupil diameter, and time to first fixation, TTFF) show that both the presence and the correctness of AI advice influence visual processing:

Figure 3: Mean fixation/saccade count, pupil diameter, and TTFF across AOIs, showing significant context- and advice-dependent differences.

No-AI: Longer fixations, more saccades, and larger pupil diameter on evidence/context reflect elevated cognitive effort and uncertainty.
Correct-AI: Reduced fixation metrics and faster orientation suggest facilitation; participants process the advice efficiently.
Incorrect-AI: Increased fixations on context and delayed focus on the rating region indicate compensatory verification effort under misleading advice.

Robustness of Eye-Tracking-Based Predictive Modeling

Prediction tasks are framed as trial-level classification (high vs. low cognitive load/confidence/accuracy), using leave-one-subject-out cross-validation. Various ML models (LR, SVM, RF, ET, AdaBoost, XGB, MLP) are evaluated with feature sets: gaze-only, priors-only, and multimodal fusion.
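The evaluation protocol above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the nearest-centroid classifier is a small stand-in for the models the paper evaluates, the median split used to form high/low labels is an assumption, and fusion is assumed to be simple feature concatenation. All function and variable names are hypothetical.

```python
import numpy as np


def loso_splits(subject_ids):
    """Yield (train_idx, test_idx) with one participant held out per fold."""
    subject_ids = np.asarray(subject_ids)
    for subject in np.unique(subject_ids):
        test_mask = subject_ids == subject
        yield np.where(~test_mask)[0], np.where(test_mask)[0]


def nearest_centroid_predict(X_train, y_train, X_test):
    """Stand-in classifier: assign each test row to the closest class mean."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]


def loso_accuracy(X, y, subject_ids):
    """Mean held-out accuracy across leave-one-subject-out folds."""
    scores = []
    for train_idx, test_idx in loso_splits(subject_ids):
        # Standardize with training-fold statistics only, to avoid leakage.
        mu = X[train_idx].mean(axis=0)
        sd = X[train_idx].std(axis=0) + 1e-9
        pred = nearest_centroid_predict(
            (X[train_idx] - mu) / sd, y[train_idx], (X[test_idx] - mu) / sd
        )
        scores.append((pred == y[test_idx]).mean())
    return float(np.mean(scores))


# Synthetic stand-in data: 6 participants x 12 trials.
rng = np.random.default_rng(0)
n_subjects, n_trials = 6, 12
subject_ids = np.repeat(np.arange(n_subjects), n_trials)
gaze = rng.normal(size=(n_subjects * n_trials, 4))     # trial-level gaze features
priors_per_subject = rng.normal(size=(n_subjects, 3))  # one row per participant
priors = priors_per_subject[subject_ids]               # repeated for each trial
load = rng.integers(1, 8, size=n_subjects * n_trials)  # Likert-style rating, 1..7
y = (load > np.median(load)).astype(int)               # high vs. low (assumed median split)

feature_sets = {
    "gaze_only": gaze,
    "priors_only": priors,
    "fusion": np.hstack([gaze, priors]),  # assumed early fusion by concatenation
}
results = {name: loso_accuracy(X, y, subject_ids) for name, X in feature_sets.items()}
```

Because priors are constant within a participant, a priors-only model necessarily gives every trial of the held-out participant the same prediction; that is one reason to compare it against the gaze-only and fused feature sets rather than read it in isolation.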

Key numerical results:

Eye-tracking signals alone robustly predict decision accuracy (all models except LR: accuracy $\sim 0.79$), though optimal performance is often achieved in condition-specific rather than pooled models.
Self-reported cognitive load and confidence are less reliably decoded from gaze alone (mean accuracy $\sim 0.66$), showing marked instability across AI conditions.
Multimodal fusion (gaze+priors) consistently outperforms both uni-modal approaches, delivering peak performance especially where gaze-signal mappings are disrupted by misleading AI.

Feature Importance and Mechanistic Interpretation

SHAP analysis identifies the dominant predictors for each target variable under Correct-AI and Incorrect-AI:

Figure 4: SHAP attribution scores for the top-10 features by predictive value under each AI reliability condition. Gaze metrics dominate under Correct-AI, while user priors (AI experience, demographics) gain importance under Incorrect-AI.

Cognitive Load: Under Correct-AI, gaze-derived features predominate; Incorrect-AI elevates user priors, notably AI experience.
Confidence: AOI-Advice features and rating region attention are more impactful; under Incorrect-AI, user priors increase explanatory power.
Accuracy: Context AOI features gain prominence with misleading advice, supporting the strategy-shifting hypothesis.

These findings corroborate the theoretical premise that AI reliability fundamentally alters both cognitive processes and their behavioral correlates.
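To make the idea of per-feature attribution concrete, the sketch below uses permutation importance, which is a simpler technique swapped in for illustration and is not the SHAP analysis used in the paper. It measures how much accuracy drops when one feature column is shuffled. The toy model, feature names, and data are all hypothetical.

```python
import numpy as np


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean accuracy drop when each feature column is shuffled in turn."""
    rng = np.random.default_rng(seed)
    baseline = (predict(X) == y).mean()
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])  # break feature-label link
            drops.append(baseline - (predict(X_perm) == y).mean())
        importances[j] = np.mean(drops)
    return importances


def toy_predict(X):
    """Hypothetical model that relies only on the first feature."""
    return (X[:, 0] > 0).astype(int)


# Hypothetical feature names; the data are synthetic.
feature_names = ["fixation_count_context", "pupil_diameter", "ai_experience"]
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)

importances = permutation_importance(toy_predict, X, y)
ranking = [feature_names[j] for j in np.argsort(-importances)]
```

Running the same procedure separately on Correct-AI and Incorrect-AI trials would give a condition-wise comparison of feature rankings, similar in spirit to the comparison summarized above, although the numbers would not be comparable to SHAP values.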

Practical and Theoretical Implications

Adaptive User Modeling: The demonstrated conditional heterogeneity implies that robust inference architectures must integrate explicit AI condition features, rather than relying on pooled models or fixed gaze-to-cognition mappings. Mixture-of-experts or condition-specific heads can mitigate the degradation in prediction performance under shifting advice reliability.
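One way to picture condition-specific heads is a small router that dispatches each trial to a model fitted for its AI condition and falls back to a pooled model otherwise. The sketch below is a hypothetical illustration: the class name, the condition labels, and the threshold-based heads are assumptions chosen to keep the example short, not the paper's architecture.

```python
from typing import Callable, Dict, Sequence

Predictor = Callable[[Sequence[float]], int]


class ConditionRouter:
    """Dispatch a trial to the head trained for its AI condition."""

    def __init__(self, heads: Dict[str, Predictor], fallback: Predictor):
        self.heads = heads
        self.fallback = fallback  # pooled model for unseen conditions

    def predict(self, condition: str, features: Sequence[float]) -> int:
        head = self.heads.get(condition, self.fallback)
        return head(features)


# Toy heads: each maps a fixation count to high (1) or low (0) cognitive load.
# The thresholds are made up to show that one gaze value can map to different
# labels depending on the condition.
def no_ai_head(features: Sequence[float]) -> int:
    return int(features[0] > 30)


def correct_ai_head(features: Sequence[float]) -> int:
    return int(features[0] > 20)


def incorrect_ai_head(features: Sequence[float]) -> int:
    return int(features[0] > 35)


def pooled_head(features: Sequence[float]) -> int:
    return int(features[0] > 28)


router = ConditionRouter(
    heads={
        "no_ai": no_ai_head,
        "correct_ai": correct_ai_head,
        "incorrect_ai": incorrect_ai_head,
    },
    fallback=pooled_head,
)

sample = [25.0]  # same gaze feature vector in every call below
labels = {c: router.predict(c, sample) for c in ("no_ai", "correct_ai", "incorrect_ai")}
unknown = router.predict("explained_ai", sample)  # routed to the pooled head
```

A hard router like this assumes the condition is known at inference time; a mixture-of-experts variant would instead weight the heads by an estimated probability of each condition.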

Personalization and Cold-Start Robustness: Incorporating user priors addresses the cold-start problem by accounting for inter-individual variability before any behavioral data from a new user is available, which facilitates transfer and generalization across participants. Such priors can be obtained in practice through lightweight surveys or behavioral proxies.

Human-AI Interaction Design: The results motivate cognitively aligned AI experiences: adaptive systems should jointly sense dynamic behavioral signals (eye tracking), encode context (AI advice, correctness), and personalize responses via stable traits (literacy, trust propensity). This is consistent with recent calls for responsible and explainable AI that mitigates overreliance and supports appropriate trust [explanation_ai_overreliance, AI_assisted_decision_making].

Future developments may extend to richer advice formats, variable explanation strategies, and multi-modal physiological sensing, offering further granularity in moment-to-moment adaptation.

Conclusion

This study demonstrates that gaze-based sensing for cognitive user modeling in AI-assisted decision scenarios is strongly condition-sensitive; the same eye-tracking patterns signal different cognitive states depending on AI reliability. Fusing gaze features with stable user priors substantially improves predictive robustness and cross-participant generalization. Practical implication: adaptive systems should treat AI condition and user characteristics as integral to cognitive-state inference, enabling more effective, personalized, and trustworthy AI-assisted decision making.
