PopSteer: Steering Popularity Bias in Recommenders
- PopSteer is a post-hoc framework that leverages a Sparse Autoencoder to steer latent user embeddings, effectively addressing popularity bias in recommender systems.
- It integrates a pretrained SASRec backbone with controlled neuron adjustments using synthetic user profiles, enabling fine-grained modulation between popular and unpopular content.
- Experimental results demonstrate high reconstruction fidelity with minimal NDCG loss, while significantly improving fairness metrics such as item coverage and the Gini index.
PopSteer is a post-hoc framework designed to interpret and mitigate popularity bias in recommender systems by leveraging a Sparse Autoencoder (SAE) as an interpretable interface to steer the latent representations of user preferences. Operating atop a frozen, pretrained recommender (specifically SASRec in reported experiments), PopSteer enables fine-grained modulation of recommendation exposure between head (popular) and tail (unpopular) content while maintaining transparency at the neuron level. The method relies on synthetic user profile generation, effect size-based neuron identification, and controlled adjustment of critical hidden activations to systematically shift recommendation distributions without retraining the underlying model (Ahmadov et al., 21 Jan 2026).
1. Model Architecture and Training Protocol
PopSteer uses a two-model pipeline: a pretrained sequential recommender (SASRec) and a post-hoc Sparse Autoencoder.
SASRec Backbone
- User embedding dimension: $d$ (shared with the SAE input)
- Architecture: 2-layer Transformer, 2 attention heads, feed-forward 256, dropout 0.5, GELU activations, layer norm
- Training: Adam optimizer, batch size 2048, cross-entropy loss, early stopping (patience = 10) on validation NDCG@10
Sparse Autoencoder (SAE)
- Input dimension: $d$ (user embeddings from SASRec)
- Hidden layer size: overcomplete, $h > d$
- Output dimension: $d$
- Encoder: linear map, $z = W_{\text{enc}}^{\top}(x - b_{\text{pre}})$, $W_{\text{enc}} \in \mathbb{R}^{d \times h}$
- Sparsification: $a = \mathrm{TopAct}_K(z)$, retaining only the top-$K$ activations
- Decoder: linear reconstruction, $\hat{x} = W_{\text{dec}}\, a + b_{\text{pre}}$, $W_{\text{dec}} \in \mathbb{R}^{d \times h}$
Training Objective:
$$\mathcal{L} = \lVert x - \hat{x} \rVert_2^2 + \lambda\, \mathcal{L}_{\text{aux}}$$
where $x$ is the SASRec user embedding, $\hat{x} = W_{\text{dec}}\,\mathrm{TopAct}_K(z) + b_{\text{pre}}$ its reconstruction, and $\lambda$ (typically 0.1) weights an auxiliary penalty for reviving dead neurons. No nonlinearities aside from TopAct are used for sparsification.
Reconstruction fidelity: cosine similarity between original and reconstructed embeddings remains near unity, with a negligible NDCG@10 drop, confirming that the SAE preserves decision-critical information.
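Concretely, the SAE forward pass above can be sketched in a few lines of NumPy. The dimensions $d$, $h$, and $K$ and the random weights below are illustrative placeholders, not the paper's values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative placeholder sizes (not the paper's values):
d, h, K = 64, 512, 32          # input dim, overcomplete hidden dim, TopAct budget

W_enc = rng.normal(0.0, 0.02, size=(d, h))
W_dec = rng.normal(0.0, 0.02, size=(d, h))
b_pre = np.zeros(d)

def top_act(z, k):
    """Keep the k largest pre-activations, zero out everything else."""
    a = np.zeros_like(z)
    keep = np.argsort(z)[-k:]
    a[keep] = z[keep]
    return a

def sae_forward(x):
    z = W_enc.T @ (x - b_pre)      # linear encoder
    a = top_act(z, K)              # TopAct sparsification (the only nonlinearity)
    x_hat = W_dec @ a + b_pre      # linear decoder / reconstruction
    return z, a, x_hat

x = rng.normal(size=d)             # stand-in for a SASRec user embedding
z, a, x_hat = sae_forward(x)
recon_loss = np.sum((x - x_hat) ** 2)   # main term of the training objective
```

The auxiliary dead-neuron penalty is omitted here; only the reconstruction term is shown.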
2. Identification of Popularity-Sensitive Neurons
PopSteer introduces a synthetic-neuron analysis phase to disambiguate popularity encoding within the SAE.
Synthetic Profile Generation
- Popular set ($I_{\text{Pop}}$): top 10% most popular items in the interaction matrix
- Unpopular set ($I_{\text{Unpop}}$): bottom 10%
- Synthetic datasets: $R_{\text{Pop}}$ and $R_{\text{Unpop}}$, each containing fixed-length profiles sampled with replacement from the corresponding item set, fed through the frozen SASRec to produce synthetic user embeddings
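The sampling step can be sketched as follows; the item counts, number of profiles, and profile length are illustrative assumptions, and `make_profiles` is a hypothetical helper name:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical per-item interaction counts; in practice these come from the dataset.
item_counts = rng.integers(1, 1000, size=1000)
order = np.argsort(item_counts)          # item ids sorted by ascending popularity
n10 = len(order) // 10
I_unpop = order[:n10]                    # bottom 10% of items
I_pop = order[-n10:]                     # top 10% of items

def make_profiles(item_set, n_profiles=100, length=20):
    """Sample fixed-length synthetic interaction sequences with replacement."""
    return rng.choice(item_set, size=(n_profiles, length), replace=True)

R_pop = make_profiles(I_pop)
R_unpop = make_profiles(I_unpop)
# Each row is then fed through the frozen SASRec to obtain a synthetic user embedding.
```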
Effect Size Computation (Cohen’s d)
For each hidden neuron $j$, pre-activations $z_j$ are aggregated for both synthetic populations:
$$d_j = \frac{\mu_{j,\text{Pop}} - \mu_{j,\text{Unpop}}}{\sqrt{\left(\sigma_{j,\text{Pop}}^2 + \sigma_{j,\text{Unpop}}^2\right)/2}}$$
where $\mu_{j,\cdot}$ and $\sigma_{j,\cdot}$ are the mean and standard deviation across synthetic users. Neurons with large positive $d_j$ specialize in popularity, while large negative $d_j$ indicates tail-specialization. Distributional assumptions are supported by the observed near-Gaussianity of activations.
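A minimal NumPy implementation of this per-neuron effect size (pooled-variance form), demonstrated on toy activations where one neuron fires for popular profiles and another for unpopular ones:

```python
import numpy as np

def cohens_d(act_pop, act_unpop):
    """Per-neuron Cohen's d between the two synthetic populations.

    act_pop, act_unpop: arrays of shape (n_profiles, n_neurons) holding SAE
    pre-activations z for popular and unpopular synthetic users.
    """
    mu_p, mu_u = act_pop.mean(axis=0), act_unpop.mean(axis=0)
    sd_p, sd_u = act_pop.std(axis=0, ddof=1), act_unpop.std(axis=0, ddof=1)
    pooled = np.sqrt((sd_p ** 2 + sd_u ** 2) / 2.0)
    return (mu_p - mu_u) / pooled

rng = np.random.default_rng(0)
# Toy activations: neuron 0 fires more for popular profiles, neuron 1 for unpopular.
act_pop = rng.normal([2.0, 0.0], 1.0, size=(500, 2))
act_unpop = rng.normal([0.0, 2.0], 1.0, size=(500, 2))
d_eff = cohens_d(act_pop, act_unpop)
# d_eff[0] is strongly positive (head-specialized); d_eff[1] strongly negative.
```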
3. Neuron Steering and Inference Intervention
At prediction time, PopSteer perturbs the SAE hidden activations of real user embeddings to counteract or enhance popularity signals.
Inference-Time Steering Procedure
Given the effect sizes $d_j$ and synthetic-profile statistics from the analysis phase, for each user $u$:
- User embedding: $x_u = \mathrm{SASRec}(u)$
- SAE pre-activation: $z = W_{\text{enc}}^{\top}(x_u - b_{\text{pre}})$
- Targeted adjustment, for each neuron $j$ with $|d_j| \ge \beta$:
$$z_j \leftarrow \begin{cases} z_j - w_j\,\sigma_j, & d_j > \beta \\ z_j + w_j\,\sigma_j, & d_j < -\beta \end{cases}$$
  - $\sigma_j$ is the standard deviation of neuron $j$'s activation over synthetic profiles
  - $\beta$ (default: $1.0$) is the effect size threshold
- Steering weight: $w_j = \alpha \cdot \dfrac{|d_j|}{\max_i |d_i|}$, with $\alpha = \alpha_{\text{Pop}}$ for suppression and $\alpha = \alpha_{\text{Unpop}}$ for boosting
- Sparsification: $a = \mathrm{TopAct}_K(z)$
- Steered embedding: $x_{\text{steered}} = W_{\text{dec}}\, a + b_{\text{pre}}$
- Item scoring: $s_{u,i} = x_{\text{steered}}^{\top} v_i$, or alternative model-specific scoring
Suppressing neurons with $d_j > \beta$ and boosting those with $d_j < -\beta$ systematically redirects recommendations away from head items toward the long tail, providing fine-grained control over the exposure distribution.
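The adjustment rule can be sketched as a small function over one user's pre-activations; `steer` is a hypothetical helper name, and the $z$, $d_j$, $\sigma_j$ values below are toys:

```python
import numpy as np

def steer(z, d_eff, sigma, alpha_pop=1.0, alpha_unpop=1.0, beta=1.0):
    """Apply the targeted adjustment to one user's SAE pre-activations.

    z: pre-activations; d_eff: per-neuron Cohen's d; sigma: per-neuron std
    over synthetic profiles. alpha_* and beta follow the notation above.
    """
    z = z.copy()
    scale = np.max(np.abs(d_eff))
    for j in range(len(z)):
        w = np.abs(d_eff[j]) / scale               # normalized steering weight
        if d_eff[j] > beta:                        # popularity neuron: suppress
            z[j] -= alpha_pop * w * sigma[j]
        elif d_eff[j] < -beta:                     # tail neuron: boost
            z[j] += alpha_unpop * w * sigma[j]
    return z

z = np.array([3.0, 0.5, -1.0])
d_eff = np.array([2.5, 0.2, -2.0])
sigma = np.ones(3)
z_steered = steer(z, d_eff, sigma)
# Neuron 0 is suppressed, neuron 1 untouched (|d| < beta), neuron 2 boosted.
```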
4. Experimental Design and Benchmarking
Experiments are conducted on three canonical datasets:
| Dataset | Users | Items | Interactions | Core |
|---|---|---|---|---|
| MovieLens 1M | 6,040 | 3,417 | 999,611 | 5-core |
| BeerAdvocate | 10,464 | 13,907 | 1,395,865 | 5-core |
| Yelp | 20,799 | 16,253 | 983,530 | 20-core |
- Data split: Chronological, leave-one-out for test, penultimate for validation.
- SAE training: Adam, batch size 2048, typically converging in 10–20 epochs.
- Baselines: Provider Max-Min Fair Re-ranking (P-MMF), Personalized Calibration Targets (PCT), Inverse Popularity Ranking (IPR), FA*IR, Dynamic User-oriented Rerank (DUOR), and Random re-ranking.
- Tuning: Grid search on fairness/'knob' parameters to map full accuracy–fairness frontiers.
Metrics: NDCG@10 (overall and head/tail-decomposed), item coverage (unique items recommended ≥5 times), Gini index of exposure.
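The two fairness metrics can be computed directly from recommendation lists; `recs` below is a toy set of top-3 lists, and the Gini implementation uses the standard sorted cumulative-sum form:

```python
import numpy as np

def gini(exposures):
    """Gini index of item exposure counts (0 = perfectly equal exposure)."""
    x = np.sort(np.asarray(exposures, dtype=float))
    n = len(x)
    cum = np.cumsum(x)
    return (n + 1 - 2.0 * np.sum(cum) / cum[-1]) / n

def item_coverage(rec_lists, min_count=5):
    """Number of distinct items recommended at least `min_count` times."""
    _, counts = np.unique(np.concatenate(rec_lists), return_counts=True)
    return int(np.sum(counts >= min_count))

# Toy top-3 lists for five users: only items 0 and 1 reach the >=5 threshold.
recs = [[0, 1, 2], [0, 1, 3], [0, 1, 2], [0, 1, 2], [0, 1, 4]]
```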
5. Results, Interpretability, and Sensitivity
Reconstruction and Fidelity
- The SAE achieves high reconstruction cosine similarity while keeping the NDCG@10 drop small, demonstrating near-lossless user state transfer.
Fairness–Accuracy Trade-off
- PopSteer dominates "nDCG vs Item Coverage" and "nDCG vs Gini" frontiers; on MovieLens 1M and BeerAdvocate, it outperforms all baselines; on Yelp, it provides superior fairness at a modest NDCG cost (less than 5%), contrasting with more severe accuracy declines in competing methods.
Interpretability Analyses
- Activation profile: the top-10 users activating the most positive-$d$ neuron exhibit predominantly head-item interaction shares, while users corresponding to the most negative-$d$ neurons skew strongly toward tail items.
- Controlled perturbation: the Gini index decreases monotonically as more positive-$d$ neurons are suppressed; the opposite holds for negative-$d$ manipulations.
- Embedding visualization: UMAP projections reveal that, post-steering, real users migrate away from the popular cluster toward the tail-user cluster; Wasserstein distances confirm the shift.
Ablation and Sensitivity
- Random noise: Injecting Gaussian noise or randomly steering neurons yields no fairness improvement, confirming the necessity of Cohen's-$d$-informed selection and variance-adaptive scaling.
- Parameter effects: Increasing the steering strength $\alpha_{\text{Pop}}$ systematically reduces the Gini index but sharply compromises NDCG@10 beyond a threshold; $\alpha_{\text{Unpop}}$ exerts milder influence, consistent with the dominance of head items in the backbone's representations.
6. End-to-End PopSteer Pipeline and Reproducibility
High-level Pseudocode
```
# Phase 1: neuron statistics from synthetic profiles
for each synthetic profile in R_pop and R_unpop:
    x = SASRec(profile)
    z = W_enc.T @ (x - b_pre)
    # accumulate mu_{j,Pop}, sigma_{j,Pop} and mu_{j,Unpop}, sigma_{j,Unpop}

# Phase 2: effect sizes
for each neuron j:
    d_j = (mu_j_Pop - mu_j_Unpop) / sqrt((sigma_j_Pop**2 + sigma_j_Unpop**2) / 2)

# Phase 3: inference-time steering
for each real user u:
    x = SASRec(u)
    z = W_enc.T @ (x - b_pre)
    for each neuron j with |d_j| >= beta:
        if d_j > beta:
            w_j = alpha_Pop * |d_j| / max_i |d_i|
            z_j = z_j - w_j * sigma_j
        elif d_j < -beta:
            w_j = alpha_Unpop * |d_j| / max_i |d_i|
            z_j = z_j + w_j * sigma_j
    a = TopAct_K(z)
    x_steered = W_dec @ a + b_pre
    # score items via s_{u,i} = dot(x_steered, v_i)
```
Hyperparameter Recommendations
- Steering: effect-size threshold $\beta = 1.0$ (default); steering strengths $\alpha_{\text{Pop}}, \alpha_{\text{Unpop}}$ tuned by grid search along the accuracy–fairness frontier
- SAE: overcomplete hidden layer with TopAct-$K$ sparsification; trained with Adam, batch size 2048
- Synthetic profiles: fixed-length sequences sampled with replacement from the top/bottom 10% of items by popularity
PopSteer supports inference-time deployment: it requires only a single neuron-detection phase and no model retraining for subsequent steering. The method adapts to broader bias phenomena or other backbone architectures without altering the core pipeline.
For further methodology, datasets, and extended analyses, see (Ahmadov et al., 21 Jan 2026).