Deriving retrieval-friendly embeddings from SID-token representations

Develop a principled method for deriving retrieval-friendly item embeddings from the Semantic Identifier (SID) token representations learned by the NEO-trained decoder-only language model, so that approximate k-nearest-neighbor (k-NN) retrieval over those embeddings achieves competitive performance. Candidate directions include projection layers, pooling/whitening, contrastive fine-tuning, and mixed-model approaches.

Background

The paper explores connecting NEO’s generative modeling with embedding-based retrieval by constructing item embeddings from learned SID token embeddings and applying approximate k-NN. In experiments, this approach underperforms both the production retrieval baseline and the generative NEO model. Alternative text-based embedding models were also evaluated without improvement.
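The summary above does not say how item embeddings were built from the SID token embeddings, so the following is only a minimal sketch of one plausible construction: mean-pool the token embeddings of each item's SID, L2-normalise, and rank by cosine similarity. The names `pool_sid_embeddings` and `top_k`, the table sizes, and the random data are all hypothetical, and the exact search here stands in for an approximate k-NN index.

```python
import numpy as np

def pool_sid_embeddings(token_table, sid_tokens):
    """Mean-pool the token embeddings of each item's SID (an assumed scheme).

    token_table: (vocab_size, dim) array of learned SID token embeddings.
    sid_tokens:  (num_items, sid_length) integer array of token ids per item.
    Returns a (num_items, dim) array of L2-normalised item embeddings.
    """
    item_emb = token_table[sid_tokens].mean(axis=1)
    norms = np.linalg.norm(item_emb, axis=1, keepdims=True)
    return item_emb / np.clip(norms, 1e-12, None)

def top_k(query, item_emb, k):
    """Exact cosine top-k; an approximate index would replace this at scale."""
    scores = item_emb @ query
    order = np.argsort(-scores)[:k]
    return order, scores[order]

# Hypothetical stand-in data: random token table and random SIDs.
rng = np.random.default_rng(0)
token_table = rng.normal(size=(256, 32))
sid_tokens = rng.integers(0, 256, size=(1000, 4))

item_emb = pool_sid_embeddings(token_table, sid_tokens)
neighbours, scores = top_k(item_emb[0], item_emb, k=5)
```

Because mean pooling ignores token order and position in the SID, two items whose SIDs contain the same tokens in different positions collapse to the same vector. That is one reason a naive construction like this could lose information the generative model relies on.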

The authors therefore leave a more principled approach to deriving retrieval-suitable embeddings from SID-token representations to future work, listing potential techniques such as projection layers, pooling/whitening, contrastive fine-tuning, or mixed models.
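Of the listed techniques, whitening is the simplest to illustrate. The sketch below is a generic PCA-whitening transform, not the authors' method: it centres a set of embeddings and rescales them so their covariance is approximately the identity, which is often applied before cosine-based nearest-neighbor search. The function names, the `eps` value, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def fit_whitening(emb, eps=1e-6):
    """Return (mean, W) such that (emb - mean) @ W has ~identity covariance."""
    mean = emb.mean(axis=0, keepdims=True)
    cov = np.cov(emb - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Scale each eigenvector (column) by 1/sqrt of its eigenvalue.
    W = eigvecs / np.sqrt(eigvals + eps)
    return mean, W

def whiten(emb, mean, W):
    """Apply a fitted whitening transform to a batch of embeddings."""
    return (emb - mean) @ W

# Hypothetical anisotropic embeddings: per-dimension scales plus an offset.
rng = np.random.default_rng(0)
raw = rng.normal(size=(2000, 16)) * np.linspace(0.5, 3.0, 16) + 2.0

mean, W = fit_whitening(raw)
white = whiten(raw, mean, W)
white_cov = np.cov(white, rowvar=False)
```

The transform is fitted once on the item embeddings and then applied to both items and queries, so that similarities are computed in the same space.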

References

We leave a more principled study of deriving retrieval-friendly embeddings from SID-token representations (e.g., projection layers, pooling/whitening, contrastive fine-tuning, or mixed models) to future work.

A Unified Language Model for Large Scale Search, Recommendation, and Reasoning (2603.17533 - Nadai et al., 18 Mar 2026) in Appendix, Unsuccessful attempts