DAS: Dual-Aligned Semantic IDs Empowered Industrial Recommender System

Published 14 Aug 2025 in cs.IR | (2508.10584v1)

Abstract: Semantic IDs are discrete identifiers generated by quantizing the Multi-modal LLMs (MLLMs) embeddings, enabling efficient multi-modal content integration in recommendation systems. However, their lack of collaborative signals results in a misalignment with downstream discriminative and generative recommendation objectives. Recent studies have introduced various alignment mechanisms to address this problem, but their two-stage framework design still leads to two main limitations: (1) inevitable information loss during alignment, and (2) inflexibility in applying adaptive alignment strategies, consequently constraining the mutual information maximization during the alignment process. To address these limitations, we propose a novel and flexible one-stage Dual-Aligned Semantic IDs (DAS) method that simultaneously optimizes quantization and alignment, preserving semantic integrity and alignment quality while avoiding the information loss typically associated with two-stage methods. Meanwhile, DAS achieves more efficient alignment between the semantic IDs and collaborative signals, with the following two innovative and effective approaches: (1) Multi-view Contrastive Alignment: To maximize mutual information between semantic IDs and collaborative signals, we first incorporate an ID-based CF debias module, and then design three effective contrastive alignment methods: dual user-to-item (u2i), dual item-to-item/user-to-user (i2i/u2u), and dual co-occurrence item-to-item/user-to-user (i2i/u2u). (2) Dual Learning: By aligning the dual quantizations of users and ads, the constructed semantic IDs for users and ads achieve stronger alignment. Finally, we conduct extensive offline experiments and online A/B tests to evaluate DAS's effectiveness, which is now successfully deployed across various advertising scenarios at Kuaishou App, serving over 400 million users daily.

Authors (6)

Summary

The paper proposes a one-stage DAS method that integrates quantization and collaborative filtering alignment to preserve semantic integrity in recommendations.
The methodology uses modules like UISM, ICDM, and MDAM to extract, debias, and align multi-modal embeddings, improving token coherence.
Extensive experiments demonstrate significant gains in AUC, UAUC, and GAUC, with industrial deployment at Kuaishou boosting CTR and eCPM, notably in cold-start scenarios.

Introduction

The paper "DAS: Dual-Aligned Semantic IDs Empowered Industrial Recommender System" (2508.10584) proposes Dual-Aligned Semantic IDs (DAS), a framework for aligning semantic IDs with collaborative signals in recommendation systems. Semantic IDs are discrete identifiers obtained by quantizing embeddings from Multi-modal LLMs (MLLMs). They make it efficient to integrate multi-modal content into a recommender, but because they carry no collaborative filtering (CF) signal, they are misaligned with downstream discriminative and generative recommendation objectives. Recent approaches add alignment mechanisms in a two-stage design, which loses information during alignment and leaves little room for adaptive alignment strategies. DAS instead optimizes quantization and alignment simultaneously in a single stage, preserving semantic integrity and alignment quality while avoiding the information loss of two-stage methods.

Figure 1: Comparison of Semantic IDs construction. (1) No-Aligned, (2) Two-Stage Aligned and (3) Ours: One-Stage Dual-Aligned.

Methodology

DAS combines quantization and alignment in a one-stage framework built from three modules: the User and Item Semantic Model (UISM), the ID-based CF Debias Model (ICDM), and the Multi-view Dual-Aligned Mechanism (MDAM). The UISM extracts multi-modal semantic embeddings and quantizes them into hierarchical semantic IDs using RQ-VAE; its codebooks are initialized with K-means clustering to reduce the risk of codebook collapse. The ICDM removes biases in the CF signals, such as popularity and conformity, using a disentangled domain adaptation network, so that the resulting CF representations align better with the semantic model. Finally, MDAM maximizes mutual information between semantic IDs and collaborative representations via three contrastive alignment strategies: dual user-to-item (u2i), dual item-to-item/user-to-user (i2i/u2u), and dual co-occurrence item-to-item/user-to-user (i2i/u2u).
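The hierarchical quantization step can be illustrated with a minimal sketch of greedy residual quantization, the lookup at the core of an RQ-VAE. This is not the paper's implementation: the encoder, decoder, training losses, and K-means initialization are omitted, and the function name `residual_quantize` and the toy codebooks are assumptions made for illustration.

```python
import numpy as np

def residual_quantize(x, codebooks):
    """Greedy residual quantization (illustrative sketch).

    At each level, pick the codeword nearest to the current residual,
    record its index, and subtract it before moving to the next level.
    The list of indices plays the role of a hierarchical semantic ID.
    """
    residual = np.asarray(x, dtype=np.float64).copy()
    reconstruction = np.zeros_like(residual)
    codes = []
    for codebook in codebooks:
        # Squared Euclidean distance from the residual to every codeword.
        dists = np.sum((codebook - residual) ** 2, axis=1)
        idx = int(np.argmin(dists))
        codes.append(idx)
        reconstruction += codebook[idx]
        residual -= codebook[idx]
    return codes, reconstruction

# Hypothetical two-level codebooks for a 2-dimensional embedding.
toy_codebooks = [
    np.array([[1.0, 0.0], [0.0, 1.0]]),
    np.array([[0.1, 0.0], [0.0, 0.1], [-0.1, 0.0]]),
]
codes, recon = residual_quantize([0.9, 0.05], toy_codebooks)
print(codes, recon)
```

Each additional level quantizes what the previous levels failed to capture, which is why the resulting IDs are hierarchical: the first index is coarse and later indices refine it.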

Figure 2: The implementation of DAS. The UISM module uses RQ-VAEs for quantization, the ICDM module uses a disentangled debiasing network to obtain unbiased CF representations, and during the co-training of UISM and ICDM the MDAM module aligns the CF representations with the semantic IDs.
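To make the idea of contrastive alignment concrete, the sketch below computes an InfoNCE-style loss between a batch of quantized semantic embeddings and the matching debiased CF embeddings. InfoNCE is a standard contrastive objective whose minimization tightens a lower bound on mutual information; this summary does not state the exact loss used by MDAM, so the loss form, the function name `info_nce`, and the temperature value are all assumptions.

```python
import numpy as np

def info_nce(semantic, collaborative, temperature=0.1):
    """InfoNCE-style loss (illustrative; not necessarily the paper's loss).

    Row i of `semantic` and row i of `collaborative` form a positive pair;
    every other row in the batch serves as an in-batch negative.
    """
    semantic = np.asarray(semantic, dtype=np.float64)
    collaborative = np.asarray(collaborative, dtype=np.float64)
    # L2-normalize so that the dot product is cosine similarity.
    s = semantic / np.linalg.norm(semantic, axis=1, keepdims=True)
    c = collaborative / np.linalg.norm(collaborative, axis=1, keepdims=True)
    logits = (s @ c.T) / temperature
    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal.
    return float(-np.mean(np.diag(log_probs)))

# Toy batch: three orthogonal embeddings, perfectly aligned vs. shuffled.
batch = np.eye(3)
loss_aligned = info_nce(batch, batch)
loss_shuffled = info_nce(batch, np.roll(batch, 1, axis=0))
print(loss_aligned, loss_shuffled)
```

Under this reading, the u2i, i2i/u2u, and co-occurrence variants would differ in how positive pairs are chosen (for example, a user and an interacted item, or two items that co-occur), while the loss itself stays the same.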

Results and Discussion

Offline experiments show that DAS outperforms conventional two-stage alignment methods across the evaluated settings, with improvements in AUC, UAUC, and GAUC. The gains come from learning quantization jointly with collaborative filtering signals, which yields better-aligned semantic representations. DAS also improves the coherence of the semantic tokens, and model performance improves in both cold-start and regular recommendation scenarios.
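For readers unfamiliar with the per-user metrics, the following sketch computes them under commonly used definitions: UAUC as the unweighted mean of per-user AUC, and GAUC as the mean of per-user AUC weighted by each user's number of impressions. The summary does not define these metrics, so these definitions, the function names, and the toy data are assumptions.

```python
from collections import defaultdict

def auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        return None  # AUC is undefined without both classes
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def grouped_auc(user_ids, labels, scores, weighted=True):
    """weighted=True gives GAUC (impression-weighted); False gives UAUC."""
    groups = defaultdict(lambda: ([], []))
    for u, l, s in zip(user_ids, labels, scores):
        groups[u][0].append(l)
        groups[u][1].append(s)
    total, weight_sum = 0.0, 0.0
    for user_labels, user_scores in groups.values():
        value = auc(user_labels, user_scores)
        if value is None:
            continue  # skip users with only clicks or only non-clicks
        w = len(user_labels) if weighted else 1.0
        total += w * value
        weight_sum += w
    return total / weight_sum if weight_sum else float("nan")

# Hypothetical impressions for two users.
users = ["a", "a", "a", "a", "b", "b"]
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.6, 0.3, 0.7]
gauc = grouped_auc(users, labels, scores, weighted=True)
uauc = grouped_auc(users, labels, scores, weighted=False)
print(gauc, uauc)
```

Unlike global AUC, both metrics only compare items shown to the same user, so they reflect ranking quality within a user's own candidate list.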

Figure 3: In-depth analysis results of DAS. (a) and (b) present the vector retrieval evaluation results for the MDAM module, while (c) and (d) illustrate the code assignment distribution, revealing the performance of the learned codebook.

Deployment at Kuaishou demonstrates the method's industrial applicability: DAS improves CTR prediction accuracy and yields eCPM gains of up to 8.98%, particularly in cold-start scenarios. DAS now runs across various advertising scenarios in the Kuaishou App, serving over 400 million users daily, which illustrates its scalability and business value.

Figure 4: The online deployment pipeline of DAS at Kuaishou.

Conclusion

The paper introduces Dual-Aligned Semantic IDs (DAS), a one-stage framework that aligns semantic representations with collaborative filtering representations in recommendation systems. By optimizing quantization and alignment together, DAS avoids the information loss of two-stage methods and increases the mutual information between semantic IDs and collaborative signals. Offline experiments and online A/B tests validate these improvements, showing gains in predictive performance and confirming that the method scales in practice. Its deployment across various advertising scenarios at Kuaishou demonstrates its business value and shows that multi-modal representation learning can be aligned effectively with collaborative filtering.
