
RAG-Personalization Overview

Updated 28 January 2026
  • RAG-Personalization is the systematic adaptation of retrieval-augmented generation models using explicit or implicit user signals to tailor query reformulation, retrieval prioritization, and output generation.
  • The approach combines methods like personalized query rewriting, collaborative filtering, and RL-based tuning to improve relevance and user-specific content delivery.
  • Empirical evaluations report significant improvements in metrics such as ROUGE-L and F1, demonstrating enhanced performance in personalized response generation and document retrieval.

Retrieval-Augmented Generation Personalization (RAG-Personalization) is the systematic adaptation of retrieval-augmented LLM systems to individual users by conditioning retrieval, prompt construction, and/or generation on user-specific signals. These signals may include explicit user profiles, latent embeddings of behavioral history, dynamically learned preferences, or contextual behavioral traces. The field encompasses text, multi-modal, and agentic architectures spanning response generation, question answering, recommendation, and dialog systems, and draws on techniques from information retrieval, reinforcement learning, user modeling, and knowledge augmentation.

1. Foundations and Taxonomy of RAG-Personalization

A personalized RAG system is typically factored into three key stages, each amenable to user adaptation (Li et al., 14 Apr 2025):

  • Pre-retrieval (Query Reformulation/Expansion): Operator \mathcal{Q}(q, p) rewrites or expands the user query q conditioned on profile p to create q^*.
  • Retrieval: Operator \mathcal{R}(q^*, \mathcal{C}, p) ranks and filters documents from corpus \mathcal{C} using q^* and p to return user-relevant knowledge D^*.
  • Generation: Operator \mathcal{G}(D^*, \texttt{prompt}, p; \theta) generates output text g using the retrieved context, user profile, and prompt.

Formally, the pipeline is:

g = \mathcal{G}(\mathcal{R}(\mathcal{Q}(q, p), \mathcal{C}, p), \texttt{prompt}, p; \theta)
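The three-stage factorization above can be sketched as composable operators. This is a minimal illustration of the pipeline's shape, not any specific paper's implementation; the field names and signatures are assumptions:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class PersonalizedRAG:
    """Composes the three personalized stages: Q (rewrite), R (retrieve), G (generate)."""
    rewrite: Callable[[str, dict], str]           # Q(q, p)       -> q*
    retrieve: Callable[[str, list, dict], list]   # R(q*, C, p)   -> D*
    generate: Callable[[list, str, dict], str]    # G(D*, prompt, p) -> g

    def __call__(self, query: str, corpus: list, profile: dict, prompt: str) -> str:
        q_star = self.rewrite(query, profile)          # pre-retrieval personalization
        d_star = self.retrieve(q_star, corpus, profile)  # profile-aware retrieval
        return self.generate(d_star, prompt, profile)    # profile-conditioned generation
```

Each stage can be personalized independently (e.g., swapping in a profile-aware reranker) without touching the others, which is why the literature treats the stages as separate adaptation targets.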

Personalization may be explicit, such as concatenating profile text into prompts (Shi et al., 2024), or implicit, such as optimizing retrieval or generation with learned user embeddings (Shi et al., 8 Apr 2025, Salemi et al., 2024). Hybrid approaches combine both (Yazan et al., 24 Mar 2025, Zhang et al., 10 Aug 2025). RAG-personalization also extends to multi-modal (vision-language) models (Seifi et al., 4 Feb 2025) and agent planning loops (Li et al., 14 Apr 2025, Platnick et al., 29 Sep 2025).

2. Personalization Mechanisms Across RAG Stages

2.1 Pre-Retrieval: Personalized Query Expansion and Rewriting

Personalized query expansion addresses intra-user semantic drift and style variance. Techniques include LLM-based expansion with user context (as in PBR: "Personalize Before Retrieve"), which applies style-aligned pseudo-relevance feedback from user history and graph-based alignment to capture corpus structure (Zhang et al., 10 Oct 2025). The expansion shifts the raw query from q to a personalized vector q^* = q + \Delta_\text{user}(q, \mathcal{C}), where \Delta_\text{user} fuses style, reasoning, and structural anchors.

Query rewriting may leverage explicit user attributes, in-session behavioral signals, or inferred preferences via prompting or plug-in models (Li et al., 14 Apr 2025, Shi et al., 2024).
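In embedding space, the shift q^* = q + \Delta_\text{user} can be sketched as nudging the query vector toward a summary of the user's history. The centroid-based delta below is a simplified stand-in for the fused style/reasoning/structural term, and the alpha weight is an assumption:

```python
import numpy as np

def personalized_expand(q_vec: np.ndarray,
                        history_vecs: np.ndarray,
                        alpha: float = 0.3) -> np.ndarray:
    """Shift the raw query embedding toward the user-history centroid:
    q* = q + alpha * delta_user, where delta_user here is simply the
    offset from the query to the mean of the user's history embeddings."""
    delta_user = history_vecs.mean(axis=0) - q_vec
    return q_vec + alpha * delta_user
```

In practice \Delta_\text{user} is produced by richer machinery (LLM expansion, pseudo-relevance feedback, graph alignment), but the geometric effect is the same: the retrieval query moves toward regions of the corpus the user has engaged with.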

2.2 Retrieval: Personalized Indexing, Ranking, and Collaborative Filtering

Retrieval is adapted by constructing user-specific document pools (e.g., local histories P_u) (Salemi et al., 2024), community-aware knowledge graphs (Liang et al., 21 Nov 2025), collaborative retrieval from nearest-neighbor users (as in CFRAG) (Shi et al., 8 Apr 2025), and scoring with user-profile or session relevance:

\mathrm{score}(d \mid q, p) = \alpha\,\mathrm{sim}(\mathrm{enc}(q, p),\,\mathrm{enc}(d)) + (1 - \alpha)\,\mathrm{profileSim}(p, d)

Collaborative filtering augments retrieval pools with similar users' histories, using contrastive user encoders to select neighbors and personalized retriever/reranker architectures conditioned on both query and user preference vectors (Shi et al., 8 Apr 2025).
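The interpolated scoring rule above can be sketched directly. Using cosine similarity for both sim and profileSim is an assumption for illustration; real systems may use learned rerankers for either term:

```python
import numpy as np

def personalized_score(doc_vec: np.ndarray,
                       query_vec: np.ndarray,
                       profile_vec: np.ndarray,
                       alpha: float = 0.7) -> float:
    """Linear interpolation of query-document and profile-document relevance,
    mirroring score(d|q,p) = alpha*sim(enc(q,p), enc(d)) + (1-alpha)*profileSim(p,d)."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
    return alpha * cos(query_vec, doc_vec) + (1 - alpha) * cos(profile_vec, doc_vec)
```

With alpha near 1 the ranking reduces to standard query-document retrieval; lowering alpha lets documents that match the stored profile outrank documents that match only the query.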

Structured knowledge (e.g., KG paths in recommendations (Azizi et al., 9 Jun 2025) or identity graphs for agents (Platnick et al., 29 Sep 2025)) is incorporated either as subgraph retrieval or as subgraph summaries injected into prompts.

2.3 Generation: Conditioning, Reward Optimization, and Procedural Schemas

Generation adapts by prompt engineering (explicitly inserting user profiles, preferences, or graph summaries) (Shi et al., 2024, Arabi et al., 2024, Liang et al., 21 Nov 2025), prefix-tuning with learned personal tokens (Tangarajan et al., 4 Aug 2025), LoRA-based parameter-efficient fine-tuning (Salemi et al., 2024), or direct optimization through RL with a personalization reward (Zhang et al., 10 Aug 2025). Some systems interleave zero-shot personalized features (sentiment, frequent word lists) and contrastive examples from other users to make the model more discriminative (Yazan et al., 24 Mar 2025).
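The simplest of these mechanisms, explicit prompt-level personalization, can be sketched as concatenating a textual profile with the retrieved context. The field names and template wording are illustrative, not taken from any specific system:

```python
def build_personalized_prompt(query: str, profile: dict, docs: list) -> str:
    """Explicit personalization: render the user profile and the retrieved
    snippets as labeled blocks ahead of the question."""
    profile_block = "\n".join(f"- {k}: {v}" for k, v in profile.items())
    docs_block = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(docs))
    return (
        "User profile:\n" + profile_block + "\n\n"
        "Retrieved context:\n" + docs_block + "\n\n"
        "Question: " + query + "\n"
        "Answer in the user's preferred style, citing context by [number]."
    )
```

Implicit approaches (prefix-tuning, LoRA, RL rewards) move this conditioning out of the prompt text and into learned parameters, trading transparency for capacity.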

Explicit reasoning steps, such as PrLM's reasoning-and-answer separation with a personalization-guided contrastive reward, enable LLMs to learn to selectively leverage user profiles in output (Zhang et al., 10 Aug 2025).

Procedural personalization (e.g., multi-step therapy scripts in Habit Coach) places procedural knowledge scaffolds directly into the prompt, allowing for dynamic slot-filling and stateful dialogues (Arabi et al., 2024).

3. Architectures: End-to-End, Module-Oriented, and Agentic Models

Personalization is embedded in both modular RAG pipelines and unified end-to-end systems.

  • Module-oriented Architectures: Many frameworks, e.g., ERAGent (Shi et al., 2024), CFRAG (Shi et al., 8 Apr 2025), PersonaRAG (Zerhoudi et al., 2024), implement personalization via discrete subsystems—retrievers, re-rankers, generators—coordinated by passing user context into the prompt or as scoring features.
  • Unified End-to-End Models: Systems like UniMS-RAG encode source selection, retrieval, and generation as sequence tasks in a single Transformer, with acting and evaluation tokens bridging stages. Learned self-refinement loops allow for iterative relevance and consistency optimization (Wang et al., 2024, Li et al., 14 Apr 2025).
  • Agentic Frameworks: Agent-centric methods such as ID-RAG (Platnick et al., 29 Sep 2025) and PersonaRAG (Zerhoudi et al., 2024) treat the user or persona as a dynamic knowledge graph, retrieve structured identity elements at decision time, and condition the agent's action selection on this retrieved context.
  • Multi-modal and Vision-Language Systems: Training-free frameworks (e.g., PeKit (Seifi et al., 4 Feb 2025)) implement RAG-personalization by constructing embedding banks of personalized visual instances for instance-aware retrieval and prompt construction at inference.
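The training-free embedding-bank idea in the last bullet can be sketched as a nearest-neighbor store over per-instance embeddings. This is a generic sketch of the pattern, with cosine matching assumed; it is not PeKit's actual retrieval rule:

```python
import numpy as np

class EmbeddingBank:
    """Training-free instance store: one normalized embedding per personalized
    concept, with nearest-neighbor lookup at inference time."""
    def __init__(self):
        self.names, self.vecs = [], []

    def add(self, name: str, vec: np.ndarray) -> None:
        self.names.append(name)
        self.vecs.append(vec / np.linalg.norm(vec))  # store unit vectors

    def lookup(self, query_vec: np.ndarray) -> str:
        q = query_vec / np.linalg.norm(query_vec)
        sims = np.stack(self.vecs) @ q               # cosine similarity to each instance
        return self.names[int(np.argmax(sims))]
```

Because no weights are updated, adding or removing a personalized instance is a pure data operation, which is what makes such frameworks attractive for per-user deployment.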

4. Empirical Effects, Evaluation Metrics, and Benchmarks

Quantitative evaluation consistently shows significant gains from RAG-personalization, with reported improvements in metrics such as ROUGE-L and F1 for personalized response generation and document retrieval.

5. Systemic and Methodological Challenges

  • Cold Start: Profile- or history-dependent methods underperform for new users; RAG is more sample-efficient than PEFT for users with little data (Salemi et al., 2024, Zhang et al., 10 Oct 2025).
  • Computational Scaling: Multi-agent or multi-retriever systems (e.g., PersonaRAG (Zerhoudi et al., 2024)) incur high computational costs due to multiple LLM invocations and large prompt assemblies.
  • Privacy: Local storage and sandboxed retrieval minimize cross-user data leakage. PEFT risks memorizing rare user patterns; RAG risks inadvertent exposure of raw snippets (Salemi et al., 2024).
  • Consistency and Coherence: Long-horizon agents must avoid identity drift; identity-graph–based retrieval (ID-RAG) shows promise in maintaining persona alignment over time (Platnick et al., 29 Sep 2025).
  • Evaluation: Standard metrics do not fully capture long-term adaptation, user satisfaction, or preference alignment; interactive and qualitative benchmarks are needed (Li et al., 14 Apr 2025).
  • Procedural vs. Declarative Knowledge: For dialog and guidance systems, procedural prompt design is essential for phase-aware, personalized interactions (Arabi et al., 2024).

6. Future Directions and Open Problems

