LLM2Vec-Gen: Generative Embeddings from Large Language Models

Published 11 Mar 2026 in cs.CL | (2603.10913v1)

Abstract: LLM-based text embedders typically encode the semantic content of their input. However, embedding tasks require mapping diverse inputs to similar outputs. Typically, this input-output is addressed by training embedding models with paired data using contrastive learning. In this work, we propose a novel self-supervised approach, LLM2Vec-Gen, which adopts a different paradigm: rather than encoding the input, we learn to represent the model's potential response. Specifically, we add trainable special tokens to the LLM's vocabulary, append them to input, and optimize them to represent the LLM's response in a fixed-length sequence. Training is guided by the LLM's own completion for the query, along with an unsupervised embedding teacher that provides distillation targets. This formulation helps to bridge the input-output gap and transfers LLM capabilities such as safety alignment and reasoning to embedding tasks. Crucially, the LLM backbone remains frozen and training requires only unlabeled queries. LLM2Vec-Gen achieves state-of-the-art self-supervised performance on the Massive Text Embedding Benchmark (MTEB), improving by 9.3% over the best unsupervised embedding teacher. We also observe up to 43.2% reduction in harmful content retrieval and 29.3% improvement in reasoning capabilities for embedding tasks. Finally, the learned embeddings are interpretable and can be decoded into text to reveal their semantic content.

Abstract PDF Upgrade to Chat

Summary

The paper introduces a generative embedding paradigm that leverages LLM latent responses to transfer safety, reasoning, and alignment into the embedding space.
The approach employs trainable special tokens and dual-objective optimization, yielding up to 9.3% MTEB score improvements over unsupervised baselines.
Experiments demonstrate enhanced text retrieval, reduced harmful content retrieval, and decodable embeddings that support transparency and interpretability.

LLM2Vec-Gen: Generative Embeddings from LLMs

Introduction and Motivation

Embedding models are fundamental to modern NLP, serving as the backbone for semantic search, retrieval, clustering, and ranking tasks. Traditional approaches derive embeddings directly from the input text, but this paradigm exposes a representational gap: semantically diverse queries with overlapping meanings remain distant, while contrastive learning requires extensive labeled paired datasets. "LLM2Vec-Gen: Generative Embeddings from LLMs" (2603.10913) proposes a novel generative embedding paradigm—constructing embeddings not from the input, but from the latent representation of the LLM's response to the input. This enables transfer of LLM capabilities such as safety alignment and multi-step reasoning into the embedding space, with only minimal adaptation and unlabeled data requirements.

Figure 1: Illustration of the input-output gap. Semantically distinct queries $q_1$ and $q_2$ belong to the same category (anger). Input-centric encoders place them far apart (yellow), but their LLM responses are more similar, yielding closer representations (green). LLM2Vec-Gen encodes the response rather than the input.

Methodology

LLM2Vec-Gen augments a frozen LLM with trainable special tokens—partitioned into "thought" and "compression" roles—inserted as a suffix to the input. The methodology proceeds in several decoupled phases:

Data Generation: Given only unlabeled queries, an LLM backbone produces candidate responses, simulating potential output texts for each input.
Target Representation: The generated responses are fed into an unsupervised encoder teacher (LLM2Vec), generating distilled embedding targets without any contrastive supervision.
Embedding Training: Special tokens are appended to the input. The output hidden states of the compression tokens are optimized with two objectives:
- Reconstruction ( $\mathcal{L}_\text{recon}$ ): Decoding the compression tokens' states should reconstruct the LLM's actual response.
- Embedding Alignment ( $\mathcal{L}_\text{align}$ ): The mean pooled, projected embeddings should match the teacher's embedding of the generated response.

Through this mechanism, the model learns to allocate the "embedding" space to capture the LLM's hypothetical response rather than the narrow semantics of the input text.

Figure 2: Overview of LLM2Vec-Gen: left—data generation and target embedding pipeline; right—append trainable tokens for dual-objective optimization with a frozen LLM backbone.

Empirical Evaluation

Experiments span multiple backbone families (Qwen-3, Qwen-2.5, Llama-3.x), model sizes (0.6B–8B), and target three core axes: general text embedding (MTEB v2), malicious content retrieval (AdvBench-IR), and reasoning-intensive retrieval (BRIGHT).

Key results:

Text Embedding Quality: LLM2Vec-Gen consistently outperforms prior unsupervised and zero-shot embedders (including LLM2Vec, Echo Embeddings, HyDE, GIRCSE, and InBedder) across all model sizes and backbone architectures. The relative improvement in MTEB average score reached up to 9.3% over the best unsupervised teacher for Qwen-3-8B.
Robustness and Safety: Against adversarial retrieval (AdvBench-IR), LLM2Vec-Gen demonstrated up to a 43.2% reduction in harmful retrieval rates compared with input-centric LLM2Vec. This is attributable to the fact that generative embeddings primarily encode the LLM's refusal output instead of the direct semantics of the malicious input.
Reasoning Transfer: On BRIGHT, LLM2Vec-Gen embeddings surpassed baseline LLM2Vec by up to 29.3% (in nDCG@10), indicating substantial transfer of reasoning skills into the embedding space.
Figure 3: MTEB average score as a function of model size across three model families. LLM2Vec-Gen consistently outperforms LLM2Vec across all model sizes and architectures.

An ablation study confirms both the necessity of dual-objective optimization and the importance of modeling both "thought" and "compression" token sets. Increasing the number of special tokens produces diminishing returns above 20, with optimal performance at the default split (10 thought, 10 compression).

Figure 4: Impact of special token count on MTEB-Lite performance. Performance improves from 2 to 20 tokens (default setting), with marginal gains beyond.

Interpretability and Model Analysis

A salient aspect of LLM2Vec-Gen is that its latent embeddings remain decodable. Leveraging the reconstruction objective and the original LM tokens, the compression token representations can be mapped back to textual outputs that closely resemble or paraphrase the original LLM response. This enables fine-grained analysis via tools such as Logit Lens and LatentLens, confirming that the model embeds semantic content aligned with the LLM's actual answer rather than the underlying input. For potentially unsafe queries, the embeddings map to tokens expressing refusal or safety—explicitly demonstrating that the method internalizes safety alignment at the representational level.

Comparison to Prior Work and Theoretical Framing

LLM2Vec-Gen is orthogonal to contrastive or prompt-based unsupervised approaches (SimCSE, Echo, GenEOL), which operate exclusively on the input, and to retrieve-then-generate methods (HyDE, InBedder, GIRCSE), which incur inference-time generation cost or require labeled training data. LLM2Vec-Gen's distilled generative foresight is conceptually analogous to JEPA [sobal2022jointembeddingpredictivearchitectures], but uniquely applies a dual-objective with frozen backbone, and distills latent space representations, bypassing the need for paired or contrastive data.

Practical and Theoretical Implications

This work demonstrates that large-scale LLMs can be efficiently repurposed as state-of-the-art embedders in self-supervised settings, with a parameter-efficient adaptation (only special tokens/projections learned, backbone frozen). At inference, a single forward pass with appended special tokens suffices for fast embedding, eliminating the cost of explicit response sampling adopted in generation-first retrieval schemes.

The generative embedding formulation produces robust, safety-aligned, and reasoning-capable representations that are especially valuable for deployed, safety-critical retrieval and RAG scenarios. Furthermore, the interpretability introduced by the reconstruction channel enables transparency and model auditing, supporting deployment in regulated or high-stakes domains.

The JEPA link opens the possibility for JEPA-style self-prediction and latent chaining, while the decodable latent tokens suggest a future direction for high-bandwidth, interpretable communication protocols among LLM-based agents—potentially enabling latent message passing with semantic decoding and human oversight.

Conclusion

"LLM2Vec-Gen" (2603.10913) presents a rigorous and scalable approach to generative embeddings, re-founding self-supervised text encoding on latent output representations rather than input modeling. This bridges the classical input-output gap, transfers critical LLM characteristics (including safety, reasoning, and alignment), and achieves SOTA performance in the absence of supervision. The method's design—frozen backbone, efficient distillation, robust interpretability—positions generative embeddings as a practical and theoretically motivated foundation for the next generation of flexible, safe, and powerful text encoders.

Markdown Report Issue