
NineRec: A Benchmark Dataset Suite for Evaluating Transferable Recommendation

Published 14 Sep 2023 in cs.IR | arXiv:2309.07705v3

Abstract: Large foundational models, through upstream pre-training and downstream fine-tuning, have achieved immense success in the broad AI community due to improved model performance and significant reductions in repetitive engineering. By contrast, the transferable one-for-all models in the recommender system field, referred to as TransRec, have made limited progress. The development of TransRec has encountered multiple challenges, among which the lack of large-scale, high-quality transfer learning recommendation datasets and benchmark suites is one of the biggest obstacles. To this end, we introduce NineRec, a TransRec dataset suite that comprises a large-scale source domain recommendation dataset and nine diverse target domain recommendation datasets. Each item in NineRec is accompanied by a descriptive text and a high-resolution cover image. Leveraging NineRec, we enable the implementation of TransRec models by learning from raw multimodal features instead of relying solely on pre-extracted off-the-shelf features. Finally, we present robust TransRec benchmark results with several classical network architectures, providing valuable insights into the field. To facilitate further research, we will release our code, datasets, benchmarks, and leaderboards at https://github.com/westlake-repl/NineRec.


Summary

  • The paper presents a novel benchmark dataset suite, NineRec, that enhances evaluation and training of transferable recommendation models with multimodal features.
  • It reports significant pre-training benefits: models pre-trained on the source data show marked improvements on the target datasets and alleviate cold-start challenges.
  • The study signals a shift from classic ID-based methods to modality-driven approaches, paving the way for universal recommendation systems.

NineRec: Benchmark Dataset Suite for Transferable Recommendation

The paper "NineRec: A Benchmark Dataset Suite for Evaluating Transferable Recommendation" addresses a significant challenge in the field of recommender systems: the limited progress of transferable recommendation models, or "TransRec." These models are aimed at the development of one-for-all recommendation systems that leverage learning from one domain to predict in others. In stark contrast to the generalized success of foundational models in other AI fields, TransRec has lagged, hindered by the lack of large-scale benchmark datasets and the dominance of the ID-based recommendation paradigm.

Dataset Development

NineRec is introduced as a comprehensive dataset suite designed to advance research on transferable recommendation by removing this dataset bottleneck. The suite comprises a substantial source domain dataset and nine diverse target domain datasets, providing a common benchmark for analyzing TransRec models.

  • Source Dataset (Bili_2M): It contains millions of user-item interactions and is rich in multimodal features, with each item represented by a high-resolution cover image and a descriptive text.
  • Target Datasets: These include datasets collected from different vertical channels on a single platform and cross-platform datasets, ensuring a broad scope for evaluating transferability across diverse domains.

NineRec is a pioneering contribution for TransRec because it emphasizes learning from raw modality features (cover images and descriptive text), whereas earlier datasets generally shipped only static, pre-extracted features. Key distinctions of the suite include its high semantic complexity and its suitability for studies of modality-based recommendation.
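To make the data layout concrete, below is a minimal sketch, not the official loader, of how a NineRec-style dataset could be consumed so that a model sees each item's raw cover image and descriptive text rather than pre-extracted features. The file names, the tab-separated column order, and the one-sequence-per-user interaction format are illustrative assumptions:

```python
# Minimal sketch of a NineRec-style multimodal dataset (file layout assumed).
import csv

import torch
from PIL import Image
from torch.utils.data import Dataset


class MultimodalSeqDataset(Dataset):
    def __init__(self, interactions_path, item_meta_path,
                 tokenizer, image_transform, max_len=20):
        # Assumed metadata layout: one TSV row per item,
        # columns = item_id, descriptive text, cover-image path.
        self.items = {}
        with open(item_meta_path, newline="", encoding="utf-8") as f:
            for item_id, text, image_path in csv.reader(f, delimiter="\t"):
                self.items[item_id] = (text, image_path)
        # Assumed interaction layout: one whitespace-separated,
        # chronologically ordered item-id sequence per user.
        with open(interactions_path, encoding="utf-8") as f:
            self.sequences = [line.split() for line in f if line.strip()]
        self.tokenizer = tokenizer              # e.g. a Hugging Face BERT tokenizer
        self.image_transform = image_transform  # e.g. torchvision transforms
        self.max_len = max_len

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, idx):
        seq = self.sequences[idx][-self.max_len:]  # most recent interactions
        texts, images = [], []
        for item_id in seq:
            text, image_path = self.items[item_id]
            texts.append(text)
            images.append(self.image_transform(Image.open(image_path).convert("RGB")))
        tokens = self.tokenizer(texts, padding="max_length", truncation=True,
                                max_length=32, return_tensors="pt")
        return tokens, torch.stack(images)  # raw modalities, no ID embeddings
```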

Implications of TransRec and Findings

TransRec paradigms fundamentally diverge from classical ID-based recommendation (IDRec). By representing items through their modality features rather than opaque IDs, TransRec naturally acquires cross-domain capabilities, a step towards universal recommendation models paralleling foundation models in NLP and CV. The paper empirically demonstrates that models pre-trained on NineRec achieve notable improvements when fine-tuned on the target datasets.
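As a concrete illustration of this paradigm, here is a minimal sketch of a modality-based sequential recommender in the spirit of the classical architectures the paper benchmarks: a text encoder (BERT) and a vision encoder (ResNet-50) produce item embeddings end-to-end, and a small Transformer over those embeddings models the user's interaction sequence. The exact wiring, dimensions, and fusion by summation are assumptions for illustration, not the paper's definitive implementation:

```python
# Sketch of a modality-based TransRec model (wiring and sizes assumed).
import torch
import torch.nn as nn
from torchvision.models import resnet50
from transformers import BertModel


class TransRecSketch(nn.Module):
    def __init__(self, dim=256, max_len=20):
        super().__init__()
        self.text_encoder = BertModel.from_pretrained("bert-base-uncased")
        vision = resnet50()        # pretrained weights would normally be loaded
        vision.fc = nn.Identity()  # expose the 2048-d pooled feature
        self.image_encoder = vision
        self.text_proj = nn.Linear(768, dim)
        self.image_proj = nn.Linear(2048, dim)
        self.pos_emb = nn.Embedding(max_len, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.user_encoder = nn.TransformerEncoder(layer, num_layers=2)

    def encode_items(self, tokens, images):
        # tokens: dict of (N, T) tensors; images: (N, 3, H, W); N = batch * seq_len.
        text = self.text_encoder(**tokens).last_hidden_state[:, 0]  # [CLS] vector
        return self.text_proj(text) + self.image_proj(self.image_encoder(images))

    def forward(self, tokens, images, batch_size, seq_len):
        items = self.encode_items(tokens, images).view(batch_size, seq_len, -1)
        pos = self.pos_emb(torch.arange(seq_len, device=items.device))
        h = self.user_encoder(items + pos)  # causal masking omitted for brevity
        return h[:, -1]                     # user representation at the last step
```

Scoring a candidate is then just a dot product between this user representation and the candidate's modality-derived embedding. Because no ID embedding table is involved anywhere, the same weights can, in principle, be applied unchanged to items from a previously unseen domain.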

  • Pre-Training Effectiveness: Models pre-trained on the NineRec source data show significant performance improvements over models trained directly on the target datasets, particularly in text-based scenarios (see the transfer-loop sketch after this list).
  • Cold-Start Reduction: TransRec markedly reduces cold-start problems; even without overlapping user or item IDs, NineRec-trained models adapt effectively across datasets.
  • Comparison with ID-based Models: Although the field has traditionally been dominated by IDRec, TransRec surpasses IDRec even in non-cold-start settings when textual features are used, suggesting a pivotal shift.
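
The transfer protocol behind these findings can be summarized in a short sketch. The two-stage structure, pre-training on the large source domain and then fine-tuning all weights on a small target domain, follows the paper's setup; the BPR-style pairwise loss, the learning rates, and the batch interface (a `model(batch)` call returning user, positive-item, and negative-item embeddings) are assumptions for illustration:

```python
# Sketch of the pre-train-then-fine-tune transfer protocol (details assumed).
import torch
import torch.nn.functional as F


def bpr_loss(user, pos, neg):
    """Pairwise BPR loss: the observed item should outrank a sampled negative."""
    return -F.logsigmoid((user * pos).sum(-1) - (user * neg).sum(-1)).mean()


def run_stage(model, loader, lr):
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for batch in loader:
        user, pos, neg = model(batch)  # assumed interface: three embedding tensors
        loss = bpr_loss(user, pos, neg)
        opt.zero_grad()
        loss.backward()
        opt.step()


def transfer(model, source_loader, target_loader):
    # Stage 1: pre-train end-to-end on the large source domain (Bili_2M).
    run_stage(model, source_loader, lr=1e-4)
    # Stage 2: fine-tune ALL weights on the small target domain. Items are
    # represented by raw modalities, so no overlapping user or item IDs are
    # required for the transfer to work.
    run_stage(model, target_loader, lr=1e-5)  # smaller fine-tuning LR: an assumption
```

Evaluation on the target domain then typically uses standard top-K ranking metrics such as HR@10 and NDCG@10.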

Challenges and Future Directions

Despite its promise, building universal TransRec models involves open challenges: aligning and effectively fusing different modalities, scaling models and data, and the inherently high computational cost of end-to-end training of large modality encoders. Moreover, while NineRec contributes substantially, a broader set of datasets and larger-scale pre-training could further unlock emergent capabilities, which remains a speculative but exciting avenue.

Conclusion

NineRec is a substantial advance for TransRec research, notable for its real-world applicability and its potential to foster developments in recommender systems akin to foundation models in NLP and CV. By enabling pre-training on diverse multimodal data, NineRec provides a foundation for developing adaptable, one-for-all recommendation models and encourages the exploration of cross-domain and cross-platform recommendation. As the field progresses, collaboration among NLP, CV, and recommendation researchers leveraging NineRec can catalyze more robust, generalized recommender systems.

Overall, this paper not only introduces a highly valuable benchmark suite but also empirically demonstrates its potential, setting the stage for future advancements in transferable recommendation models.
