
Generative Search & Recommendation

Updated 30 December 2025
  • Generative search and recommendation are advanced paradigms that reframe retrieval and ranking as sequence generation over semantic IDs, enabling enhanced personalization.
  • They leverage large Transformer architectures and multimodal models to convert user queries or histories into tailored document or product identifiers.
  • Practical evaluations show improvements in metrics like CTR, GMV, and NDCG, while challenges remain in scalability, dynamic corpus handling, and ensuring factual accuracy.

Generative search and recommendation refer to information access paradigms in which large generative models, including LLMs, multimodal generators, and sequence models, directly produce relevant item or document identifiers (IDs), textual queries, or even entirely novel content tailored to user intent, replacing traditional discriminative retrieval, ranking, and recommendation pipelines. These approaches reframe the matching task in recommender systems and search engines as sequence generation over semantic or numerical IDs, natural-language queries, or multimodal tokens, enhancing flexibility, personalization, and adaptability in handling rich user contexts and dynamic corpora (Li et al., 2024, Rajput et al., 2023, Liu et al., 19 Oct 2025, Shi et al., 8 Apr 2025).

1. Conceptual Foundations and Unified Frameworks

Generative paradigms depart from the classical retrieve-then-rank architectures found in large-scale search and recommendation systems. Instead of embedding queries and items and employing nearest-neighbor search over high-dimensional embedding indexes, generative methods (a) encode users or queries into flexible prompts, and (b) task a generative backbone (e.g., a decoder-only Transformer, an encoder-decoder LLM, or a multimodal foundation model) with autoregressively producing a sequence, such as a document/item ID, keyword, query suggestion, or semantic annotation, that is then mapped to actual items or documents (Li et al., 2024, Shi et al., 8 Apr 2025, Gao et al., 26 Sep 2025, Chen et al., 8 Sep 2025).

This approach enables unified frameworks in which search and recommendation share a single generative backbone, a common item-identifier space, and a common decoding procedure, rather than maintaining separate retrieval and ranking stacks.
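As a minimal sketch of this pipeline, with a toy stand-in for the learned decoder and a hypothetical three-level ID catalog (none of these names come from a specific cited system), the generative approach encodes the user context, autoregressively emits a semantic ID code by code, and resolves the result against the catalog:

```python
# Hypothetical sketch: generative retrieval maps a user context to an item
# by autoregressively emitting a Semantic ID, then resolving it in a catalog.
from typing import Callable, List, Tuple

# Toy catalog: each item is indexed by a 3-level Semantic ID tuple.
CATALOG = {
    (2, 0, 1): "wireless-earbuds",
    (2, 0, 3): "over-ear-headphones",
    (5, 1, 0): "running-shoes",
}

def greedy_decode(step: Callable[[List[int]], List[float]], depth: int) -> Tuple[int, ...]:
    """Autoregressively pick the highest-scoring code at each SID level."""
    prefix: List[int] = []
    for _ in range(depth):
        logits = step(prefix)  # model scores for the next code
        prefix.append(max(range(len(logits)), key=logits.__getitem__))
    return tuple(prefix)

def toy_model(prefix: List[int]) -> List[float]:
    """Stand-in for a Transformer decoder conditioned on the user prompt."""
    table = {(): [0, 0, 9, 0, 0, 1], (2,): [8, 1, 0, 0], (2, 0): [0, 5, 0, 2]}
    return table[tuple(prefix)]

sid = greedy_decode(toy_model, depth=3)
print(sid, "->", CATALOG.get(sid))  # (2, 0, 1) -> wireless-earbuds
```

In a real system the `step` function would be a large Transformer conditioned on the encoded user prompt or history, and the catalog lookup would cover millions of items.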

2. Representation: Semantic Identifiers and Codebooks

A central challenge for generative search and recommendation is representing items and documents in a way that is both efficient for generation and semantically meaningful. Recent approaches introduce “Semantic IDs” (SIDs)—compact, discrete sequences obtained by quantizing content or multimodal embeddings (often via residual K-means, VQ-VAE, or related quantizers) (Penha et al., 14 Aug 2025, Rajput et al., 2023, Zhang et al., 19 Sep 2025, Ju et al., 29 Jul 2025, Shi et al., 8 Apr 2025, Chen et al., 8 Sep 2025).

  • Construction: Items are first mapped to continuous embeddings using encoders fine-tuned on semantic (search) and collaborative (recommendation) signals (Penha et al., 14 Aug 2025). These embeddings are quantized into multi-level codebooks, producing tuples such as [c_1, c_2, c_3] that serve as the SIDs. Methods include RQ-KMeans, RQ-VAE, and exclusively semantic indexing via conflict-free code assignment (Zhang et al., 19 Sep 2025, Ju et al., 29 Jul 2025).
  • Joint S&R: Dual-purpose SIDs incorporate both semantic (query-based) and collaborative-filtering signals by concatenating code indices from separately optimized encoders, balancing the trade-off between relevance in both search and recommendation (Shi et al., 8 Apr 2025, Penha et al., 14 Aug 2025).
  • ID uniqueness is guaranteed through methods like exhaustive candidate matching (ECM) or recursive residual searching (RRS), which prevent code conflicts without resorting to random tie-breaking (Zhang et al., 19 Sep 2025).
  • In multimodal systems, such as product generation or fashion try-on, SIDs can represent content across text, image, or structured category trees, enabling flexible conditional generation (Ramisa et al., 2024, Gao et al., 16 Nov 2025).
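The construction step above can be sketched with a minimal residual quantizer. The codebooks here are random placeholders purely for illustration; real systems learn them with RQ-KMeans or RQ-VAE as cited above:

```python
# Minimal residual-quantization sketch: each level encodes the residual left
# by the previous level, yielding a multi-level Semantic ID like [c1, c2, c3].
# Codebooks are random stand-ins for learned RQ-KMeans / RQ-VAE codebooks.
import numpy as np

rng = np.random.default_rng(0)
codebooks = [rng.normal(size=(8, 4)) for _ in range(3)]  # 3 levels, 8 codes each

def rq_encode(x: np.ndarray, books) -> list:
    """Greedy residual quantization: pick the nearest code, then quantize the remainder."""
    codes, residual = [], x.copy()
    for book in books:
        idx = int(np.argmin(np.linalg.norm(book - residual, axis=1)))
        codes.append(idx)
        residual = residual - book[idx]  # the next level explains what is left
    return codes

def rq_decode(codes, books) -> np.ndarray:
    """Reconstruct the embedding as the sum of the selected codewords."""
    return sum(book[c] for book, c in zip(books, codes))

x = rng.normal(size=4)                       # toy item embedding
sid = rq_encode(x, codebooks)                # e.g. [c1, c2, c3]
x_hat = rq_decode(sid, codebooks)
print("SID:", sid, "reconstruction error:", float(np.linalg.norm(x - x_hat)))
```

Each additional level refines the reconstruction, which is why SIDs are typically short tuples rather than single codes.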

3. Model Architectures and Decoding Algorithms

Generative S&R systems predominantly employ large Transformer-based architectures, typically decoder-only or encoder-decoder backbones as described above, paired with decoding algorithms that map generated token sequences to valid item or document identifiers.
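One common decoding strategy in generative retrieval is beam search constrained to a trie of valid identifiers, which guarantees that every generated sequence resolves to a real catalog entry. The sketch below uses a toy scoring function in place of model probabilities; the catalog and scores are invented for illustration:

```python
# Sketch of trie-constrained beam search: the decoder may only emit code
# prefixes that extend to a valid catalog ID.
from math import log

VALID_SIDS = {(2, 0, 1), (2, 0, 3), (5, 1, 0)}  # hypothetical catalog

def allowed_next(prefix):
    """Codes that keep the prefix on the trie of valid Semantic IDs."""
    return {sid[len(prefix)] for sid in VALID_SIDS if sid[:len(prefix)] == prefix}

def constrained_beam(score, depth=3, beam=2):
    """Beam search that masks out codes leading off the catalog trie."""
    beams = [((), 0.0)]
    for _ in range(depth):
        expanded = []
        for prefix, lp in beams:
            for code in allowed_next(prefix):
                expanded.append((prefix + (code,), lp + log(score(prefix, code))))
        beams = sorted(expanded, key=lambda b: b[1], reverse=True)[:beam]
    return beams

def toy_score(prefix, code):  # stand-in for model next-code probabilities
    return {(): {2: 0.7, 5: 0.3}, (2,): {0: 1.0}, (5,): {1: 1.0},
            (2, 0): {1: 0.6, 3: 0.4}, (5, 1): {0: 1.0}}[prefix][code]

for sid, lp in constrained_beam(toy_score):
    print(sid, round(lp, 3))
```

Returning the full beam rather than a single sequence is what lets a generative model produce a ranked top-k list of candidates.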

4. Multimodal Generative Search and Recommendation

Contemporary systems extend beyond textual prompts and IDs to multimodal conditioning:

  • Multimodal architectures align product data from text, images, audio, and structured attributes (e.g., 3D layouts, segmentation masks) into a joint latent code Z, modeling both complementary and shared signals with generators such as GANs, VAEs, and diffusion models (Ramisa et al., 2024).
  • Generative models synthesize not only product IDs but actual novel items, images, or experiences, facilitating applications such as virtual try-on, "view in my room," or image-guided retrieval (Ramisa et al., 2024, Samaran et al., 2021, Guo et al., 2023).
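A toy illustration of this alignment step, assuming random projection matrices in place of learned encoders, uses a CLIP-style contrastive objective (text-to-image direction only, for brevity) to pull matched text/image pairs together in the joint latent space Z:

```python
# Toy sketch of aligning two modalities into a shared latent space.
# Projection matrices are random stand-ins for learned modality encoders.
import numpy as np

rng = np.random.default_rng(1)
text_emb = rng.normal(size=(4, 16))   # batch of 4 text embeddings
img_emb = rng.normal(size=(4, 16))    # the 4 matching image embeddings
W_t, W_i = rng.normal(size=(16, 8)), rng.normal(size=(16, 8))

def project(x, W):
    """Map a modality into the joint latent space Z, unit-normalized."""
    z = x @ W
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def contrastive_loss(zt, zi, temp=0.1):
    """InfoNCE over pairwise similarities: matched pairs sit on the diagonal."""
    logits = zt @ zi.T / temp
    logits -= logits.max(axis=1, keepdims=True)         # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return float(-np.log(np.diag(probs)).mean())

loss = contrastive_loss(project(text_emb, W_t), project(img_emb, W_i))
print(round(loss, 3))
```

Training would minimize this loss over the encoder parameters; once aligned, the joint latent code Z can condition downstream generation or be quantized into multimodal SIDs.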

5. Evaluation, Performance, and Practical Deployments

Evaluation protocols combine classical information retrieval and ranking metrics such as NDCG with online engagement and business metrics such as CTR and GMV, alongside novel checks tailored to generative paradigms, including the factual accuracy of generated outputs.
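For reference, NDCG@k (one of the offline metrics reported for these systems) normalizes the generated ranking's discounted cumulative gain by that of the ideal ordering; a minimal implementation:

```python
# Minimal NDCG@k over graded relevance labels (standard definition).
from math import log2

def dcg(rels):
    """Discounted cumulative gain: relevance discounted by log2 of the rank."""
    return sum(rel / log2(i + 2) for i, rel in enumerate(rels))

def ndcg_at_k(ranked_rels, k):
    """DCG of the system ranking divided by DCG of the ideal ranking."""
    ideal = sorted(ranked_rels, reverse=True)
    denom = dcg(ideal[:k])
    return dcg(ranked_rels[:k]) / denom if denom > 0 else 0.0

# Graded relevance of a generated top-5 list, in ranked order.
print(round(ndcg_at_k([3, 2, 0, 1, 0], k=5), 4))  # prints 0.9854
```

The same routine applies whether the ranked list comes from a classical retriever or from the beam of a generative decoder.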

6. Open Problems, Future Directions, and Limitations

Despite rapid progress, several open challenges and research avenues persist, including scalability to web-scale corpora, efficient handling of dynamic and frequently updated item catalogs, integration of interactive user feedback, and ensuring the factual accuracy of generated identifiers and content.

Generative search and recommendation articulate a flexible, unified paradigm that bridges classical IR and recommender systems with large foundation models, semantic item representations, and user-centric context modeling, yielding enhanced adaptability, personalization, and performance across text and multimodal domains. Future work focuses on large-scale deployment, multimodal expansion, efficient dynamic corpora handling, interactive feedback integration, and robust evaluation metrics.

