User-Click Modeling

Updated 2 February 2026

User-click modeling is a framework that quantifies user interactions using click data and combines probabilistic, neural, and graph-based techniques for ranking and personalization.
It categorizes models based on key-design choices—global dependencies, sequentiality, and factorization—to address biases, context, and UI-specific behaviors.
Practical applications include debiasing ranking systems, optimizing mobile UIs, and simulating clicks, with performance validated by metrics like CTR, NDCG, and perplexity.

User-click modeling is the quantitative and algorithmic framework for capturing and predicting how users interact with items, typically via clicks or taps, across a wide variety of web and application interfaces. It encompasses classical probabilistic graphical models (PGMs), advanced neural architectures, and large-scale statistical systems that use click data as implicit feedback for tasks such as ranking, personalization, and UI optimization. Research in this area establishes the mathematical dependencies governing click generation, accounts for position, bias, context, and user intent, and provides the statistical machinery to fit, evaluate, and deploy such models in realistic settings.

1. Fundamental Principles and Model Taxonomy

Click modeling's central object is the conditional probability distribution

$P(C | \text{observables})$

where $C$ denotes the array of click variables over available items and "observables" may include item features, positions, topics (carousels or blocks), shown context, and preceding interactions. "Rethinking Click Models in Light of Carousel Interfaces" formalizes that every click model can be unambiguously described through three Key-Design Choices (KDCs) (Kang et al., 23 Jun 2025):

KDC 1 (Global Dependencies): Which observable variables $X \subseteq \{\text{topics } T,\, \text{items } Y,\, \text{other clicks } C'\}$ are allowed to influence each $C_{i,j}$?

E.g., $X = \{Y\}$ gives item-only models; $X = \{Y, C'\}$ (as in classic cascade models) allows for click dependencies.

KDC 2 (Sequentiality): For each $C_{i,j}$, which subset of $X$ (e.g., previous positions, clicks) is actually conditioned upon? This choice determines whether the model encodes sequential scan, grid, or unrestricted dependencies.

KDC 3 (Factorization): How does the model decompose $P(C_{i,j}|X)$: as a (multiplicative) product of subfunctions, a neural network, or a direct table? Multiplicative factorizations (e.g. "examination × attractiveness") underpin the interpretability of position-based and related models.

Based on these choices, click models can be categorized (see table below):

Model Class	Global Dependency $X$	Interface(s)
Random	$\varnothing$	All
Items-Only	$\{Y\}$	All
Clicks-Only	$\{C'\}$	All
Items-Clicks	$\{Y, C'\}$	All
Topics-Only	$\{T\}$	Carousels
Topics-Items	$\{T, Y\}$	Carousels
Topics-Clicks	$\{T, C'\}$	Carousels
Fully-Dependent	$\{T,Y,C'\}$	Carousels

This taxonomy is exclusive and stable: every prior single-list, grid, and carousel model, whether PGM or NN-based, falls into exactly one cell (Kang et al., 23 Jun 2025).
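As a rough illustration of how KDC 1 indexes the table above, the following sketch maps a set of global dependencies to its model class. The function and constant names (`model_class`, `applicable_interfaces`, `TAXONOMY`) are hypothetical and chosen only for this example; they do not come from the cited paper.

```python
# Illustrative lookup for the KDC 1 taxonomy table.
# "T" = topics, "Y" = items, "C'" = other clicks.
# All names here are hypothetical, not from any published library.
TAXONOMY = {
    frozenset(): "Random",
    frozenset({"Y"}): "Items-Only",
    frozenset({"C'"}): "Clicks-Only",
    frozenset({"Y", "C'"}): "Items-Clicks",
    frozenset({"T"}): "Topics-Only",
    frozenset({"T", "Y"}): "Topics-Items",
    frozenset({"T", "C'"}): "Topics-Clicks",
    frozenset({"T", "Y", "C'"}): "Fully-Dependent",
}


def model_class(dependencies):
    """Return the taxonomy class for a set of global dependencies X."""
    key = frozenset(dependencies)
    if key not in TAXONOMY:
        raise ValueError(f"unknown dependency set: {sorted(key)}")
    return TAXONOMY[key]


def applicable_interfaces(dependencies):
    """Classes that condition on topics apply to carousels; the rest to all interfaces."""
    return "Carousels" if "T" in dependencies else "All"


print(model_class({"Y", "C'"}))        # Items-Clicks
print(applicable_interfaces({"T", "Y"}))  # Carousels
```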

2. Classical Probabilistic Click Models

Early user-click models in web search (e.g., Position-Based Model (PBM), Cascade Model (CM), Dynamic Bayesian Network (DBN)) are all instantiations of the general framework above:

PBM: $P(C_i=1|i, Y_i) = v_i \alpha_{Y_i}$. The likelihood of a click at rank $i$ is the product of examination propensity $v_i$ and item attractiveness $\alpha_{Y_i}$.
Cascade Model: Introduces a sequential scan that stops at the first click, by conditioning examination $E_i$ on the absence of clicks at positions $1, \dots, i-1$:

$P(E_i=1|C_{1:i-1}) = \prod_{j=1}^{i-1} (1 - C_j)$

$P(C_i=1|C_{1:i-1}) = P(E_i=1|C_{1:i-1}) \cdot \alpha_{Y_i}$.

User Browsing Model (UBM): Examination depends on position and the last clicked position, enabling skip and revisit patterns.
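The PBM and cascade formulas above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, and the cascade marginal $P(C_i=1) = \alpha_{Y_i} \prod_{j<i} (1 - \alpha_{Y_j})$ is obtained by taking the expectation of the conditional formula over earlier clicks.

```python
import random


def pbm_click_probs(exam_propensity, attractiveness):
    """PBM: P(C_i = 1) = v_i * alpha_i for each rank i."""
    return [v * a for v, a in zip(exam_propensity, attractiveness)]


def cascade_click_probs(attractiveness):
    """Cascade model marginals: alpha_i times the probability of no earlier click."""
    probs = []
    reach = 1.0  # probability that no item above rank i was clicked
    for a in attractiveness:
        probs.append(reach * a)
        reach *= 1.0 - a
    return probs


def simulate_cascade(attractiveness, rng):
    """Sample one session: scan top-down and stop at the first click."""
    clicks = [0] * len(attractiveness)
    for i, a in enumerate(attractiveness):
        if rng.random() < a:
            clicks[i] = 1
            break
    return clicks


print(pbm_click_probs([1.0, 0.5], [0.4, 0.4]))   # [0.4, 0.2]
print(cascade_click_probs([0.5, 0.5, 0.5]))      # [0.5, 0.25, 0.125]
print(simulate_cascade([0.3, 0.3, 0.3], random.Random(0)))
```

Under the cascade model a simulated session contains at most one click, which is the behavior the product $\prod_{j<i}(1 - C_j)$ encodes.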

Unified estimation for these models is provided by generalized frameworks such as the Generalized Cascade Model (GCM) (Ruijt et al., 2021), which shows that all such models are input–output Hidden Markov Models (IO-HMMs) and can be fit via EM using standard forward–backward algorithms, as implemented in the gecasmo library.
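To make the EM idea concrete, here is a minimal sketch of EM for the PBM alone, which is a much simpler special case than the IO-HMM forward-backward procedure of the GCM. It is not the gecasmo API; the function name `fit_pbm_em` and the log format (tuples of item, rank, click) are assumptions made for this example.

```python
from collections import defaultdict


def fit_pbm_em(log, n_iter=100):
    """Fit PBM parameters by EM.

    log: iterable of (item_id, rank, click) tuples, click in {0, 1}.
    Returns (v, alpha): dicts of examination propensity per rank and
    attractiveness per item.
    """
    log = list(log)
    ranks = {r for _, r, _ in log}
    items = {d for d, _, _ in log}
    v = {r: 0.5 for r in ranks}
    alpha = {d: 0.5 for d in items}
    for _ in range(n_iter):
        v_sum, a_sum = defaultdict(float), defaultdict(float)
        v_cnt, a_cnt = defaultdict(int), defaultdict(int)
        for d, r, c in log:
            if c:
                # A click implies the item was examined and attractive.
                post_exam = post_attr = 1.0
            else:
                # E-step: posteriors of examination and attractiveness given no click.
                denom = max(1.0 - v[r] * alpha[d], 1e-12)
                post_exam = v[r] * (1.0 - alpha[d]) / denom
                post_attr = alpha[d] * (1.0 - v[r]) / denom
            v_sum[r] += post_exam
            v_cnt[r] += 1
            a_sum[d] += post_attr
            a_cnt[d] += 1
        # M-step: parameters become averaged posteriors.
        v = {r: v_sum[r] / v_cnt[r] for r in ranks}
        alpha = {d: a_sum[d] / a_cnt[d] for d in items}
    return v, alpha


toy_log = [("a", 1, 1)] * 3 + [("a", 1, 0)] * 7
v, alpha = fit_pbm_em(toy_log)
print(v[1] * alpha["a"])  # close to the empirical CTR of 0.3
```

With a single item at a single rank, only the product $v_i \alpha_{Y_i}$ is identifiable, so the sketch recovers the click-through rate rather than the two factors separately. Separating them requires the same item to appear at several ranks in the log.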

Grid and two-dimensional layouts (e.g., image search) require further generalization. The Grid-Based User Browsing Model (GUBM) (Xie et al., 2018) linearizes 2D scan paths between observed hovers/clicks, modeling examination and attractiveness for each grid cell, and learning path-biases directly from logs via unsupervised EM.

3. Neural and Graph-Augmented Click Models

Classical PGMs impose hard-wired dependencies and therefore cannot express higher-order, non-local, or context-dependent behaviors. Neural click models extend this expressiveness:

Neural Click Model (NCM): Processes slates or grids with RNNs or Transformers, either autoregressively (e.g., session/position encoding with GRU (Shirokikh et al., 2024)) or in fully contextualized (self-attention) form. This allows skip, cascade, and cross-item dependencies to be learned directly from data.
Adversarial and attention-based extensions: E.g., adversarial training between generator and discriminator click models for higher-fidelity simulation (Shirokikh et al., 2024).
Graph-enhanced models: Models such as GraphCM (Lin et al., 2022) and GLSM (Sun et al., 2023) construct explicit graphs over users, queries, documents, or items—integrating GNN layers to propagate both intra- and inter-session statistical signals, alleviating sparsity and cold-start. For example, GraphCM combines query and document graphs, with GNN+GRU encoders, and fuses attractiveness and examination in a weighted ExamHyp combination, achieving state-of-the-art log-likelihood and NDCG.
Long-term/short-term interest fusion: GLSM (Sun et al., 2023) retrieves multi-hop neighbors from user-specific and global interest graphs for long-term preferences, combines with scenario-aware RNNs for short-term interests, and adaptively fuses both.
UI/element-level models: SHA-SCP (Chen et al., 2023) employs multi-level (element/group) attention over clusters of UI elements in mobile settings, yielding gains of 10–11 percentage points in Top-k accuracy over standard Transformers.

4. Multi-List, Carousel, and Complex UI Click Models

As platform interfaces advanced beyond ranked lists, new model classes emerged:

Carousel/multilist interfaces: Models factor the examination and click probability across rows (carousels) and columns (within-rows). A position-based carousel model, as in (Leon-Martinez, 2023), posits

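$P(E_{i,j}=1) = v_i w_j,\quad P(C_{i,j}=1|E_{i,j}=1) = \alpha_{r_{i,j}}$

where $r_{i,j}$ denotes the item shown at row $i$, column $j$, and the row and column examination propensities $v_i$ and $w_j$ are learnt (e.g., from eye-tracking data).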

Generalization to arbitrary block layouts: The F-shape Click Model (FSCM) (Fu et al., 2022) for multi-block mobile UIs uses a DAG over items reflecting all potential examination paths (vertical/horizontal/block skips), with separate GRU parameters per block type and indegree. Non-sequential comparison modules capture item toggling/sidewise comparisons observed in user studies.
Taxonomy and design for complex UIs: The first comprehensive taxonomy of grid and carousel click models, spanning all dependencies (topics, items, clicks), is introduced in (Kang et al., 23 Jun 2025).

Category	KDC1 (Global)	KDC2 (Sequentiality)	KDC3 (Factorization)	Example
Items-Clicks	$\{Y, C'\}$	prev. items/clicks	Mul (examination×relevance)	Cascade Model, GRU-based NCM
Topics-Items	$\{T, Y\}$	block/row+col	Mul (topic×item)	Carousel PBM (Leon-Martinez, 2023), new two-dim design (Kang et al., 23 Jun 2025)
Fully-Dep.	$\{T, Y, C'\}$	all items/clicks/topics	Transformer/self-att.	FE-TCM, multi-block self-attn

Guidance from behavioral data: Eye-tracking studies (Leon-Martinez, 2023, Fu et al., 2022) establish that users rarely follow pure top-to-bottom or left-to-right scans, instead showing block skips, genre/topic-first focus, and repeated comparisons ("toggling"). These patterns are reflected in model structures such as block-skip edges and non-sequential GRU updates.
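The position-based carousel model described earlier in this section factorizes examination into a row term and a column term. A minimal sketch of the resulting click-probability grid follows; the function name and the nested-list layout are assumptions made for illustration, and the propensity values are invented rather than taken from any eye-tracking study.

```python
def carousel_click_probs(row_exam, col_exam, attractiveness):
    """Marginal click probabilities for a carousel page.

    row_exam[i]  : v_i, propensity to examine carousel (row) i
    col_exam[j]  : w_j, propensity to examine column j within a row
    attractiveness[i][j] : alpha of the item shown at row i, column j
    Returns a grid with P(C_ij = 1) = v_i * w_j * alpha_ij.
    """
    return [
        [v * w * a for w, a in zip(col_exam, row)]
        for v, row in zip(row_exam, attractiveness)
    ]


# Made-up propensities: attention decays down the page and along each row.
grid = carousel_click_probs(
    row_exam=[1.0, 0.5],
    col_exam=[1.0, 0.5],
    attractiveness=[[0.4, 0.4], [0.4, 0.4]],
)
for row in grid:
    print(row)
```

Because examination is a rank-one product of row and column terms, this factorization cannot represent the block skips or toggling patterns reported in the eye-tracking studies; those need the richer structures discussed above.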

5. Inference, Evaluation, and Practical Gains

Model fitting, evaluation, and deployment rely on:

Efficient inference and training: Classical models support one-pass, closed-form estimation or IO-HMM-based EM (Govindaraj et al., 2014, Ruijt et al., 2021). Neural and graph models are trained with SGD or Adam on a regularized cross-entropy loss.
Offline metrics: Click log-likelihood, perplexity (lower is better), NDCG@k for ranking, and AUC for relevance discrimination. For example, a unified model (Govindaraj et al., 2014) improves perplexity over DBN and raises AUC by more than 8 pp.
Online effects: Industrial deployments such as GLSM (Sun et al., 2023) and HCCM (Chen et al., 2022) report CTR lifts of 4–5% and GMV lifts of 2–4%; mobile UI systems report absolute Top-k accuracy improvements of several percentage points (Chen et al., 2023, Zhou et al., 2021).
Ablations and robustness: Systematic ablations reveal text and position features, graph propagation, and fusion mechanisms as critical to model gains.
Realistic click simulation and benchmarking: For interactive scenarios, RClicks (Antonov et al., 2024) demonstrates that human click patterns are not well-captured by simple heuristics (e.g., "center of error"). Instead, learned probabilistic clickability maps improve robustness and stability assessment of segmentation and annotation systems.
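Two of the offline metrics listed above, click perplexity and NDCG@k, can be sketched as follows. The function names are hypothetical. The NDCG variant shown uses the common exponential gain $2^{rel} - 1$ with a $\log_2$ position discount, which is one of several conventions and may differ from the one used in a given paper.

```python
import math


def click_perplexity(click_probs, clicks):
    """Perplexity = 2 ** (-(1/N) * sum of log2 P(observed click outcome)).

    A perfect model scores 1.0; a coin-flip model scores 2.0.
    """
    eps = 1e-12  # guards against log2(0) for over-confident predictions
    total = 0.0
    for p, c in zip(click_probs, clicks):
        p_observed = p if c else 1.0 - p
        total += math.log2(max(p_observed, eps))
    return 2.0 ** (-total / len(clicks))


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k positions."""
    return sum(
        (2 ** rel - 1) / math.log2(pos + 2)
        for pos, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


print(click_perplexity([0.5, 0.5], [1, 0]))  # 2.0
print(ndcg_at_k([3, 2, 1], k=3))             # 1.0
print(ndcg_at_k([0, 1], k=2))                # below 1.0: the relevant item is ranked second
```

In the click-model literature perplexity is often reported per rank and then averaged, which amounts to calling `click_perplexity` separately on the predictions and clicks for each position.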

6. Applications, Impact, and Emerging Directions

User-click models are foundational across information retrieval, recommender systems, e-commerce personalization, mobile UX, and more:

Ranking and learning-to-rank: Click models inform debiasing in empirical risk minimization via inverse propensity scoring (IPS) (Leon-Martinez, 2023), enabling unbiased learning from click data despite severe position or context bias.
CTR and engagement prediction: Advanced models fuse visual, textual, long-term/short-term, and graph-based signals for industrial-scale CTR systems (Chen et al., 2022, Sun et al., 2023).
Mobile and UI adaptation: Modeling spatial, type, and layout features achieves large user-experience improvements, e.g., in accessibility or tap prediction for smartphone interfaces (Zhou et al., 2021, Chen et al., 2023).
Grid and image search: GUBM provides position-bias correction for dense or sparse image search result pages, generalizing to hover and 2D paths (Xie et al., 2018).
Privacy and on-device modeling: Models like APM (Ou et al., 2021) run client-side, preserving privacy while capturing complex tab/branch behaviors.
Theory and model design: The Key-Design Choice taxonomy (Kang et al., 23 Jun 2025) enables formal comparison and principled extension to novel interfaces (e.g., topic–item–click models for carousels, or inclusion of dwell/scroll signals).
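The IPS debiasing mentioned in the first item of this list can be illustrated with a small estimator. Assuming a PBM-style click process with known examination propensities $v_i$, weighting each click by $1 / v_i$ and averaging over impressions gives an unbiased estimate of item attractiveness, since $\mathbb{E}[C_i / v_i] = \alpha_{Y_i}$. The function name and log format below are assumptions for this sketch, not an API from the cited work.

```python
from collections import defaultdict


def ips_attractiveness(log, propensity):
    """Inverse-propensity-scored attractiveness estimates.

    log: iterable of (item_id, rank, click) tuples, click in {0, 1}.
    propensity: dict mapping rank -> examination probability (must be > 0).
    Returns a dict mapping item_id -> mean of click / propensity[rank].
    """
    weighted_clicks = defaultdict(float)
    impressions = defaultdict(int)
    for item, rank, click in log:
        weighted_clicks[item] += click / propensity[rank]
        impressions[item] += 1
    return {item: weighted_clicks[item] / impressions[item] for item in impressions}


# Item "b" is only ever shown at rank 2, where half of the users look.
toy_log = [("b", 2, 1)] * 2 + [("b", 2, 0)] * 8
estimates = ips_attractiveness(toy_log, propensity={1: 1.0, 2: 0.5})
print(estimates["b"])  # 0.4, versus a raw CTR of 0.2
```

The raw click-through rate understates the item's attractiveness because of its low position. The IPS weight corrects for that, at the cost of higher variance when propensities are small.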

7. Limitations, Challenges, and Future Research

Despite their breadth, user-click models face several ongoing challenges:

Generalization to new UI paradigms: Mobile, block, and carousel UIs require multi-dimensional, possibly personalized, bias parameters and structures.
Robustness to interaction noise: E.g., segmentation models are sensitive to realistic, non-ideal user click patterns (Antonov et al., 2024); future benchmarks must embrace this variability.
Cold-start and sparsity: Graph-enhanced methods and clustering help, but truly zero-shot modeling remains an open problem (Lin et al., 2022, Sun et al., 2023).
Explainability vs. predictive power: Increasingly non-factorized neural architectures raise interpretability challenges; integrating explicit factorization remains a promising direction.
Unified benchmarks and theory: The formalization of design axes provides a guide for integrating context, time, modality, and evolving behavior as click modeling targets novel environments (Kang et al., 23 Jun 2025).

User-click modeling thus continues to be a dynamic research domain, central to both theoretical advances in user modeling and large-scale, high-value applications in commercial systems.