PopSteer: Steering Popularity Bias in Recommenders
- PopSteer is a post-hoc framework that leverages a Sparse Autoencoder to steer latent user embeddings, effectively addressing popularity bias in recommender systems.
- It integrates a pretrained SASRec backbone with controlled neuron adjustments using synthetic user profiles, enabling fine-grained modulation between popular and unpopular content.
- Experimental results demonstrate high reconstruction fidelity with minimal NDCG loss, while significantly improving fairness metrics such as item coverage and the Gini index.
PopSteer is a post-hoc framework designed to interpret and mitigate popularity bias in recommender systems by leveraging a Sparse Autoencoder (SAE) as an interpretable interface to steer the latent representations of user preferences. Operating atop a frozen, pretrained recommender (specifically SASRec in reported experiments), PopSteer enables fine-grained modulation of recommendation exposure between head (popular) and tail (unpopular) content while maintaining transparency at the neuron level. The method relies on synthetic user profile generation, effect size-based neuron identification, and controlled adjustment of critical hidden activations to systematically shift recommendation distributions without retraining the underlying model (Ahmadov et al., 21 Jan 2026).
1. Model Architecture and Training Protocol
PopSteer uses a two-model pipeline: a pretrained sequential recommender (SASRec) and a post-hoc Sparse Autoencoder.
SASRec Backbone
- User embedding dimension: $d$ (shared with the SAE input)
- Architecture: 2-layer Transformer, 2 attention heads, feed-forward 256, dropout 0.5, GELU activations, layer norm
- Training: Adam optimizer, batch size 2048, cross-entropy loss, early stopping (patience = 10) on validation NDCG@10
Sparse Autoencoder (SAE)
- Input dimension: $d$ (user embeddings from SASRec)
- Hidden layer size: overcomplete, $h > d$
- Output dimension: $d$
- Encoder: linear map, $z = W_{\text{enc}}^{\top}(x - b_{\text{pre}})$, $W_{\text{enc}} \in \mathbb{R}^{d \times h}$
- Sparsification: $a = \mathrm{TopAct}_K(z)$, retaining only the top-$K$ activations
- Decoder: linear reconstruction, $\hat{x} = W_{\text{dec}}\, a + b_{\text{pre}}$, $W_{\text{dec}} \in \mathbb{R}^{d \times h}$
Training Objective:
$$\mathcal{L} = \lVert x - \hat{x} \rVert_2^2 + \lambda\, \mathcal{L}_{\text{aux}}$$
where $x$ is the SASRec user embedding, $\hat{x} = W_{\text{dec}}\,\mathrm{TopAct}_K(z) + b_{\text{pre}}$ its reconstruction, and $\lambda$ (typically 0.1) weights an auxiliary penalty for reviving dead neurons. No nonlinearities aside from TopAct are used for sparsification.
Reconstruction fidelity: cosine similarity between original and reconstructed embeddings remains near unity, with a negligible NDCG@10 drop, confirming that the SAE preserves decision-critical information.
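Concretely, the SAE forward pass above can be sketched in a few lines of NumPy. The dimensions $d$, $h$, and $K$ and the random weights below are illustrative placeholders, not the paper's values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative placeholder sizes (not the paper's values):
d, h, K = 64, 512, 32          # input dim, overcomplete hidden dim, TopAct budget

W_enc = rng.normal(0.0, 0.02, size=(d, h))
W_dec = rng.normal(0.0, 0.02, size=(d, h))
b_pre = np.zeros(d)

def top_act(z, k):
    """Keep the k largest pre-activations, zero out everything else."""
    a = np.zeros_like(z)
    keep = np.argsort(z)[-k:]
    a[keep] = z[keep]
    return a

def sae_forward(x):
    z = W_enc.T @ (x - b_pre)      # linear encoder
    a = top_act(z, K)              # TopAct sparsification (the only nonlinearity)
    x_hat = W_dec @ a + b_pre      # linear decoder / reconstruction
    return z, a, x_hat

x = rng.normal(size=d)             # stand-in for a SASRec user embedding
z, a, x_hat = sae_forward(x)
recon_loss = np.sum((x - x_hat) ** 2)   # main term of the training objective
```

The auxiliary dead-neuron penalty is omitted here; only the reconstruction term is shown.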
2. Identification of Popularity-Sensitive Neurons
PopSteer introduces a synthetic-neuron analysis phase to disambiguate popularity encoding within the SAE.
Synthetic Profile Generation
- Popular set ($I_{\text{Pop}}$): top 10% most popular items in the interaction matrix
- Unpopular set ($I_{\text{Unpop}}$): bottom 10%
- Synthetic datasets: $R_{\text{Pop}}$ and $R_{\text{Unpop}}$, each containing fixed-length profiles sampled with replacement from the corresponding item set, fed through the frozen SASRec to produce synthetic user embeddings
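The sampling step can be sketched as follows; the item counts, number of profiles, and profile length are illustrative assumptions, and `make_profiles` is a hypothetical helper name:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical per-item interaction counts; in practice these come from the dataset.
item_counts = rng.integers(1, 1000, size=1000)
order = np.argsort(item_counts)          # item ids sorted by ascending popularity
n10 = len(order) // 10
I_unpop = order[:n10]                    # bottom 10% of items
I_pop = order[-n10:]                     # top 10% of items

def make_profiles(item_set, n_profiles=100, length=20):
    """Sample fixed-length synthetic interaction sequences with replacement."""
    return rng.choice(item_set, size=(n_profiles, length), replace=True)

R_pop = make_profiles(I_pop)
R_unpop = make_profiles(I_unpop)
# Each row is then fed through the frozen SASRec to obtain a synthetic user embedding.
```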
Effect Size Computation (Cohen’s d)
For each hidden neuron $j$, pre-activations $z_j$ are aggregated for both synthetic populations:
$$d_j = \frac{\mu_{j,\text{Pop}} - \mu_{j,\text{Unpop}}}{\sqrt{\left(\sigma_{j,\text{Pop}}^2 + \sigma_{j,\text{Unpop}}^2\right)/2}}$$
where $\mu_{j,\cdot}$ and $\sigma_{j,\cdot}$ are the mean and standard deviation across synthetic users. Neurons with large positive $d_j$ specialize in popularity, while large negative $d_j$ indicates tail-specialization. Distributional assumptions are supported by the observed near-Gaussianity of activations.
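A minimal NumPy implementation of this per-neuron effect size (pooled-variance form), demonstrated on toy activations where one neuron fires for popular profiles and another for unpopular ones:

```python
import numpy as np

def cohens_d(act_pop, act_unpop):
    """Per-neuron Cohen's d between the two synthetic populations.

    act_pop, act_unpop: arrays of shape (n_profiles, n_neurons) holding SAE
    pre-activations z for popular and unpopular synthetic users.
    """
    mu_p, mu_u = act_pop.mean(axis=0), act_unpop.mean(axis=0)
    sd_p, sd_u = act_pop.std(axis=0, ddof=1), act_unpop.std(axis=0, ddof=1)
    pooled = np.sqrt((sd_p ** 2 + sd_u ** 2) / 2.0)
    return (mu_p - mu_u) / pooled

rng = np.random.default_rng(0)
# Toy activations: neuron 0 fires more for popular profiles, neuron 1 for unpopular.
act_pop = rng.normal([2.0, 0.0], 1.0, size=(500, 2))
act_unpop = rng.normal([0.0, 2.0], 1.0, size=(500, 2))
d_eff = cohens_d(act_pop, act_unpop)
# d_eff[0] is strongly positive (head-specialized); d_eff[1] strongly negative.
```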
3. Neuron Steering and Inference Intervention
At prediction time, PopSteer perturbs the SAE hidden activations of real user embeddings to counteract or enhance popularity signals.
Inference-Time Steering Procedure
Given the effect sizes $d_j$ and synthetic-profile statistics from the analysis phase, for each user $u$:
- User embedding: $x_u = \mathrm{SASRec}(u)$
- SAE pre-activation: $z = W_{\text{enc}}^{\top}(x_u - b_{\text{pre}})$
- Targeted adjustment, for each neuron $j$ with $|d_j| \ge \beta$:
$$z_j \leftarrow \begin{cases} z_j - w_j\,\sigma_j, & d_j > \beta \\ z_j + w_j\,\sigma_j, & d_j < -\beta \end{cases}$$
  - $\sigma_j$ is the standard deviation of neuron $j$'s activation over synthetic profiles
  - $\beta$ (default: $1.0$) is the effect size threshold
- Steering weight: $w_j = \alpha \cdot \dfrac{|d_j|}{\max_i |d_i|}$, with $\alpha = \alpha_{\text{Pop}}$ for suppression and $\alpha = \alpha_{\text{Unpop}}$ for boosting
- Sparsification: $a = \mathrm{TopAct}_K(z)$
- Steered embedding: $x_{\text{steered}} = W_{\text{dec}}\, a + b_{\text{pre}}$
- Item scoring: $s_{u,i} = x_{\text{steered}}^{\top} v_i$, or alternative model-specific scoring
Suppressing neurons with $d_j > \beta$ and boosting those with $d_j < -\beta$ systematically redirects recommendations away from head items toward the long tail, providing fine-grained control over the exposure distribution.
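The adjustment rule can be sketched as a small function over one user's pre-activations; `steer` is a hypothetical helper name, and the $z$, $d_j$, $\sigma_j$ values below are toys:

```python
import numpy as np

def steer(z, d_eff, sigma, alpha_pop=1.0, alpha_unpop=1.0, beta=1.0):
    """Apply the targeted adjustment to one user's SAE pre-activations.

    z: pre-activations; d_eff: per-neuron Cohen's d; sigma: per-neuron std
    over synthetic profiles. alpha_* and beta follow the notation above.
    """
    z = z.copy()
    scale = np.max(np.abs(d_eff))
    for j in range(len(z)):
        w = np.abs(d_eff[j]) / scale               # normalized steering weight
        if d_eff[j] > beta:                        # popularity neuron: suppress
            z[j] -= alpha_pop * w * sigma[j]
        elif d_eff[j] < -beta:                     # tail neuron: boost
            z[j] += alpha_unpop * w * sigma[j]
    return z

z = np.array([3.0, 0.5, -1.0])
d_eff = np.array([2.5, 0.2, -2.0])
sigma = np.ones(3)
z_steered = steer(z, d_eff, sigma)
# Neuron 0 is suppressed, neuron 1 untouched (|d| < beta), neuron 2 boosted.
```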
4. Experimental Design and Benchmarking
Experiments are conducted on three canonical datasets:
| Dataset | Users | Items | Interactions | Core |
|---|---|---|---|---|
| MovieLens 1M | 6,040 | 3,417 | 999,611 | 5-core |
| BeerAdvocate | 10,464 | 13,907 | 1,395,865 | 5-core |
| Yelp | 20,799 | 16,253 | 983,530 | 20-core |
- Data split: Chronological, leave-one-out for test, penultimate for validation.
- SAE training: Adam, batch size 2048, typically converging in 10–20 epochs.
- Baselines: Provider Max-Min Fair Re-ranking (P-MMF), Personalized Calibration Targets (PCT), Inverse Popularity Ranking (IPR), FA*IR, Dynamic User-oriented Rerank (DUOR), and Random re-ranking.
- Tuning: Grid search on fairness/'knob' parameters to map full accuracy–fairness frontiers.
Metrics: NDCG@10 (overall and head/tail-decomposed), item coverage (unique items recommended ≥5 times), Gini index of exposure.
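The two fairness metrics can be computed directly from recommendation lists; `recs` below is a toy set of top-3 lists, and the Gini implementation uses the standard sorted cumulative-sum form:

```python
import numpy as np

def gini(exposures):
    """Gini index of item exposure counts (0 = perfectly equal exposure)."""
    x = np.sort(np.asarray(exposures, dtype=float))
    n = len(x)
    cum = np.cumsum(x)
    return (n + 1 - 2.0 * np.sum(cum) / cum[-1]) / n

def item_coverage(rec_lists, min_count=5):
    """Number of distinct items recommended at least `min_count` times."""
    _, counts = np.unique(np.concatenate(rec_lists), return_counts=True)
    return int(np.sum(counts >= min_count))

# Toy top-3 lists for five users: only items 0 and 1 reach the >=5 threshold.
recs = [[0, 1, 2], [0, 1, 3], [0, 1, 2], [0, 1, 2], [0, 1, 4]]
```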
5. Results, Interpretability, and Sensitivity
Reconstruction and Fidelity
- The SAE achieves high reconstruction cosine similarity while keeping the NDCG@10 drop small, demonstrating near-lossless user state transfer.
Fairness–Accuracy Trade-off
- PopSteer dominates "nDCG vs Item Coverage" and "nDCG vs Gini" frontiers; on MovieLens 1M and BeerAdvocate, it outperforms all baselines; on Yelp, it provides superior fairness at a modest NDCG cost (less than 5%), contrasting with more severe accuracy declines in competing methods.
Interpretability Analyses
- Activation profile: the top-10 users activating the most positive-$d$ neuron exhibit predominantly head-item interaction shares, while users corresponding to the most negative-$d$ neurons skew strongly toward tail items.
- Controlled perturbation: the Gini index decreases monotonically as more positive-$d$ neurons are suppressed; the opposite holds for negative-$d$ manipulations.
- Embedding visualization: UMAP projections reveal that, post-steering, real users migrate away from the popular cluster toward the tail-user cluster; Wasserstein distances confirm the shift.
Ablation and Sensitivity
- Random noise: Injecting Gaussian noise or randomly steering neurons yields no fairness improvement, confirming the necessity of Cohen's-$d$-informed selection and variance-adaptive scaling.
- Parameter effects: Increasing the steering strength $\alpha_{\text{Pop}}$ systematically reduces the Gini index but sharply compromises NDCG@10 beyond a threshold; $\alpha_{\text{Unpop}}$ exerts milder influence, consistent with the dominance of head items in the backbone's representations.
6. End-to-End PopSteer Pipeline and Reproducibility
High-level Pseudocode
```
# Phase 1: neuron statistics from synthetic profiles
for each synthetic profile in R_pop and R_unpop:
    x = SASRec(profile)
    z = W_enc.T @ (x - b_pre)
    # accumulate mu_{j,Pop}, sigma_{j,Pop} and mu_{j,Unpop}, sigma_{j,Unpop}

# Phase 2: effect sizes
for each neuron j:
    d_j = (mu_j_Pop - mu_j_Unpop) / sqrt((sigma_j_Pop**2 + sigma_j_Unpop**2) / 2)

# Phase 3: inference-time steering
for each real user u:
    x = SASRec(u)
    z = W_enc.T @ (x - b_pre)
    for each neuron j with |d_j| >= beta:
        if d_j > beta:
            w_j = alpha_Pop * |d_j| / max_i |d_i|
            z_j = z_j - w_j * sigma_j
        elif d_j < -beta:
            w_j = alpha_Unpop * |d_j| / max_i |d_i|
            z_j = z_j + w_j * sigma_j
    a = TopAct_K(z)
    x_steered = W_dec @ a + b_pre
    # score items via s_{u,i} = dot(x_steered, v_i)
```
Hyperparameter Recommendations
- Steering: effect-size threshold $\beta = 1.0$ (default); steering strengths $\alpha_{\text{Pop}}, \alpha_{\text{Unpop}}$ tuned by grid search along the accuracy–fairness frontier
- SAE: overcomplete hidden layer with TopAct-$K$ sparsification; trained with Adam, batch size 2048
- Synthetic profiles: fixed-length sequences sampled with replacement from the top/bottom 10% of items by popularity
PopSteer supports inference-time deployment: it requires only a single neuron-detection phase and no model retraining for subsequent steering. The method adapts to broader bias phenomena or other backbone architectures without altering the core pipeline.
For further methodology, datasets, and extended analyses, see (Ahmadov et al., 21 Jan 2026).