Cross-Modal Content Inference and Feature Enrichment for Cold-Start Recommendation
Abstract: Multimedia recommendation fuses the multi-modal information of items to enrich their features and improve recommendation performance. However, existing methods typically add multi-modal information on top of collaborative information to improve overall recommendation precision, without examining cold-start recommendation performance. Moreover, these methods are applicable only when such multi-modal data is available. To address these problems, this paper proposes a recommendation framework, named Cross-modal Content Inference and Feature Enrichment Recommendation (CIERec), which exploits multi-modal information to improve cold-start recommendation performance. Specifically, CIERec first introduces image annotation as privileged information to guide the mapping of unified features from the visual space to the semantic space during the training phase. CIERec then enriches the content representation by fusing collaborative, visual, and cross-modal inferred representations, thereby improving cold-start recommendation performance. Experimental results on two real-world datasets show that the content representations learned by CIERec achieve superior cold-start recommendation performance over existing visually-aware recommendation algorithms. More importantly, CIERec consistently achieves significant improvements with different conventional visually-aware backbones, which verifies its universality and effectiveness.
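The two steps named in the abstract (inferring a semantic representation from visual features, then fusing it with the collaborative and visual representations) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names, the dimensions, the random linear projection, the concatenation-based fusion, and the dot-product scoring are hypothetical and are not taken from the CIERec implementation.

```python
# Illustrative sketch only: names, dimensions, and the fusion rule are
# assumptions, not the CIERec implementation.
import random


def linear_map(weights, vec):
    """Multiply a (rows x cols) weight matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def fuse(collab, visual, inferred):
    """Fuse the three representations by concatenation (an assumed choice)."""
    return collab + visual + inferred


def score(user_vec, item_vec):
    """Dot-product preference score between a user and an item."""
    return sum(u * i for u, i in zip(user_vec, item_vec))


random.seed(0)
D_COLLAB, D_VISUAL, D_SEMANTIC = 4, 6, 3  # hypothetical sizes

# Hypothetical visual-to-semantic projection. In training, such a mapping
# would be guided by image annotation (the privileged information); at
# inference time only the visual features are needed.
W_vis2sem = [
    [random.uniform(-1, 1) for _ in range(D_VISUAL)]
    for _ in range(D_SEMANTIC)
]

visual = [random.uniform(-1, 1) for _ in range(D_VISUAL)]
inferred = linear_map(W_vis2sem, visual)

# A cold-start item has no interactions, so its collaborative part is
# set to zeros here (an assumption of this sketch).
cold_collab = [0.0] * D_COLLAB
item_repr = fuse(cold_collab, visual, inferred)

user_vec = [random.uniform(-1, 1) for _ in range(len(item_repr))]
cold_score = score(user_vec, item_repr)
print(len(item_repr), cold_score)
```

In this sketch the cold-start item is still scored because its visual and inferred parts are non-empty even though its collaborative part is zero, which is the intuition behind enriching the content representation.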