A Review of Modern Recommender Systems Using Generative Models (Gen-RecSys)

Published 31 Mar 2024 in cs.IR and cs.AI | (2404.00579v2)

Abstract: Traditional recommender systems (RS) typically use user-item rating histories as their main data source. However, deep generative models now have the capability to model and sample from complex data distributions, including user-item interactions, text, images, and videos, enabling novel recommendation tasks. This comprehensive, multidisciplinary survey connects key advancements in RS using Generative Models (Gen-RecSys), covering: interaction-driven generative models; the use of large language models (LLMs) and textual data for natural language recommendation; and the integration of multimodal models for generating and processing images/videos in RS. Our work highlights necessary paradigms for evaluating the impact and harm of Gen-RecSys and identifies open challenges. This survey accompanies a tutorial presented at ACM KDD'24, with supporting materials provided at: https://encr.pw/vDhLq.

Summary

The paper surveys key generative model approaches like VAEs, GANs, diffusion models, and LLMs applied to recommender systems.
It details methodologies including direct training and pretrained models for simulating user interactions and augmenting diverse data modalities.
The paper evaluates performance using traditional and generative metrics while addressing biases and societal impacts in RS applications.

Overview of Modern Recommender Systems Using Generative Models

The paper "A Review of Modern Recommender Systems Using Generative Models (Gen-RecSys)" offers an extensive survey of advancements in recommender systems (RS) empowered by generative models. It addresses the transformation from traditional collaborative filtering of user-item interactions to systems that leverage diverse data modalities.

Figure 1: Overview of the areas of interest in generative models in recommendation.

Foundational Aspects of Generative Models in RS

Generative Model Capabilities

Modern generative models, including GANs, VAEs, and LLMs, can model complex data distributions, which enables functionality beyond traditional RS. They allow systems to process and generate textual, visual, and multimodal data, supporting novel and interactive recommendation tasks. In RS, generative models are used to simulate user interactions, augment datasets, and generate content representations across modalities.

Interaction-Driven and Pretrained Model Applications

The paper distinguishes two primary modeling approaches:

Directly Trained Models: These involve models like VAE-CF, trained on user interaction data without substantial pretraining datasets, predicting user preferences based on learned interaction probabilities.
Pretrained Models: These models utilize pre-existing, diverse datasets for tasks such as zero- and few-shot learning, fine-tuning for specific applications, retrieval-augmented generation, and embedding creation for complex data interactions.

Generative Model Frameworks

Auto-Encoding Mechanisms

Auto-encoders and their variational counterparts learn to reconstruct data, possibly from corrupted inputs; this denoising behavior lets them predict missing interactions from a partial history. VAEs additionally learn a distribution over latent variables, which makes them generative and well suited to collaborative filtering.
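As a sketch of the objective such models typically optimize (the notation here is assumed for illustration and is not taken from the paper): for a user's interaction vector x and a latent variable z, a VAE-based collaborative filter maximizes an evidence lower bound. The weight β on the KL term is a hyperparameter, and β = 1 recovers the standard VAE objective. A multinomial likelihood over items is one common choice for the reconstruction term.

```latex
\mathcal{L}_{\beta}(x;\theta,\phi)
  = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
  - \beta\,\mathrm{KL}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right),
\qquad
\log p_\theta(x \mid z) = \sum_{i} x_i \log \pi_i(z)
```

Here π(z) is the decoder's softmax output over items, q_φ is the encoder, and p(z) is a standard normal prior.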

Generative Adversarial Networks (GANs)

GANs are used for selecting training samples, synthesizing user preferences, and generating recommendation lists or panels. Adversarial training pits a generator against a discriminator, pushing the generator toward realistic user-item interactions.
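For reference, the standard GAN minimax objective is shown below. Reading x as an observed user-item interaction and G(z) as a synthesized one is an assumption made here for illustration; specific GAN-based recommenders modify this objective in different ways.

```latex
\min_{G}\,\max_{D}\;
  \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p(z)}\!\left[\log\!\left(1 - D(G(z))\right)\right]
```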

Diffusion Models

Newer to RS, diffusion models leverage forward and reverse processes to generate content, proving particularly effective in augmenting training sequences and addressing sparsity issues in sequential recommendation.
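The forward (noising) half of this process can be sketched in a few lines. This is a minimal illustration, assuming the common Gaussian formulation with a linear noise schedule; the function names, schedule values, and the toy interaction vector are hypothetical and not taken from the paper. The reverse (denoising) process is a learned model and is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances beta_1..beta_T, linearly spaced (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def forward_noise(x0, alpha_bar_t, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)."""
    scale = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)

x0 = [1.0, 0.0, 1.0, 0.0]  # toy interaction vector (hypothetical)
x_early = forward_noise(x0, alpha_bars[0], rng)   # still close to x0
x_late = forward_noise(x0, alpha_bars[-1], rng)   # close to pure Gaussian noise
```

Because alpha_bar_t shrinks toward zero as t grows, late steps retain almost none of the original signal, which is what the learned reverse process is trained to undo.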

LLMs in Recommendation Systems

Encoder-only and Generative Recommendation

Encoder-only models focus on creating dense embeddings for retrieval tasks, enabling scalable recommendation solutions. In parallel, LLM-based generative systems operate on token sequences and prompts, making recommendations or predictions in zero-shot or few-shot settings. Fine-tuning and prompt adjustments further customize generative outputs, extending applicability to niche domains and improving consistency.
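A minimal sketch of the dense-retrieval step follows. The embeddings are hand-written toy vectors; in practice they would come from an encoder model, and large catalogs would use an approximate nearest-neighbor index rather than a full scan. All names and values are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_k(query_vec, item_vecs, k=2):
    """Return the ids of the k items whose embeddings are most similar to the query."""
    scored = [(cosine(query_vec, vec), item_id) for item_id, vec in item_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item_id for _, item_id in scored[:k]]


# Toy item embeddings (hypothetical values standing in for encoder outputs).
item_vecs = {
    "sci_fi_novel": [0.9, 0.1, 0.0],
    "space_documentary": [0.8, 0.2, 0.1],
    "cookbook": [0.0, 0.1, 0.9],
}
user_vec = [1.0, 0.0, 0.0]  # embedding of the user's query or history

ranked = top_k(user_vec, item_vecs, k=2)
```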

Hybrid Approaches and Conversations

Integrating LLMs with RS enables retrieval-augmented generation, in which retrieved items or documents ground the LLM's output, and lets LLM-generated text enrich the inputs of a conventional recommender. LLM-based conversational interfaces additionally support multi-turn dialogue for eliciting preferences and delivering recommendations.
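The retrieval-augmented pattern can be sketched as two steps: retrieve candidates, then place them in the prompt. The example below is an illustration only, assuming a toy tag-overlap retriever and a hypothetical prompt template; it stops at building the prompt string and does not call any LLM.

```python
def retrieve(history, catalog, k=3):
    """Toy retriever: rank unseen items by tag overlap with the user's history."""
    seen = set(history)
    profile = set()
    for item in history:
        profile |= catalog.get(item, set())
    scored = [
        (len(tags & profile), name)
        for name, tags in catalog.items()
        if name not in seen
    ]
    # Highest overlap first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


def build_prompt(history, candidates):
    """Assemble a prompt that restricts the LLM to the retrieved candidates."""
    lines = [
        "The user previously liked: " + ", ".join(history) + ".",
        "Recommend one item from the candidate list and explain why.",
        "Candidates:",
    ]
    lines += [f"- {name}" for name in candidates]
    return "\n".join(lines)


# Hypothetical catalog mapping item names to descriptive tags.
catalog = {
    "Dune": {"sci-fi", "classic"},
    "Neuromancer": {"sci-fi", "cyberpunk"},
    "Foundation": {"sci-fi", "classic"},
    "Salt Fat Acid Heat": {"cooking"},
}
history = ["Dune"]

candidates = retrieve(history, catalog, k=2)
prompt = build_prompt(history, candidates)
```

Restricting the prompt to retrieved candidates is one way to keep the model's output within the actual catalog; the resulting string would then be passed to whichever LLM the system uses.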

Multimodal Generative Models

Motivations and Methodological Innovations

The paper underscores the necessity of integrating multimodal data, highlighting challenges in aligning diverse data spaces and the evolution of approaches like CLIP and ALBEF for improved data representation and retrieval. Generative diffusion models and multimodal pretraining advancements provide tangible enhancements, particularly in applications requiring synthesized outputs, such as virtual try-ons.
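As an illustration of how contrastive approaches of the kind used by CLIP align two modalities, the symmetric contrastive loss over a batch of N matched image-text pairs can be written as follows. The notation is assumed here: u_i and v_i are L2-normalized image and text embeddings, and τ is a temperature parameter.

```latex
\mathcal{L}
  = -\frac{1}{2N}\sum_{i=1}^{N}
    \left[
      \log\frac{\exp(u_i^{\top} v_i/\tau)}{\sum_{j=1}^{N}\exp(u_i^{\top} v_j/\tau)}
      +
      \log\frac{\exp(u_i^{\top} v_i/\tau)}{\sum_{j=1}^{N}\exp(u_j^{\top} v_i/\tau)}
    \right]
```

The first term matches each image to its text among all texts in the batch, and the second matches each text to its image, pulling paired embeddings together in a shared space.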

Evaluation of Gen-RecSys

Traditional and Holistic Metrics

The paper discusses conventional ranking metrics such as NDCG for quantitative evaluation, alongside generative metrics borrowed from NLP, such as BLEU and ROUGE, for assessing generated text. It advocates holistic evaluation practices that consider long-term impact, user satisfaction, and online performance.
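For concreteness, NDCG@k can be computed as below. This sketch uses the linear-gain variant and assumes the relevance list covers every relevant item for the user, so that sorting it yields the ideal ranking; the function names are chosen here for illustration.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k positions (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (descending-relevance) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance of each recommended item, in the order the system ranked them.
perfect = ndcg_at_k([1, 0, 0], k=3)   # relevant item ranked first
delayed = ndcg_at_k([0, 0, 1], k=3)   # relevant item ranked third
```

In this toy case the delayed ranking scores 1 / log2(4) = 0.5, showing how the logarithmic discount penalizes relevant items placed lower in the list.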

Addressing Biases and Societal Implications

To address biases introduced by diverse datasets and the societal implications of automated RS, the paper recommends structured evaluation of fairness, bias, and stakeholder effects, and points to benchmarks such as HE and cFairLLM.

Conclusion

Generative models are reshaping the recommendation landscape by introducing sophisticated, multimodal, and interactive capabilities. The evolution of RS towards leveraging advanced generative architectures underscores the necessity of ongoing research to address evaluation complexities, societal impact, and the integration of cutting-edge AI developments. Future directions highlight critical advancements in retrieval-augmented methodologies, tool-augmented conversational frameworks, and personalized content generation, paving the way for the next generation of intelligent, nuanced RS solutions.