TopicDiff: A Topic-enriched Diffusion Approach for Multimodal Conversational Emotion Detection

Published 4 Mar 2024 in cs.CL, cs.AI, and cs.LG | (2403.04789v2)

Abstract: Multimodal Conversational Emotion (MCE) detection, which generally spans the acoustic, vision and language modalities, has attracted increasing interest in the multimedia community. Previous studies predominantly focus on learning contextual information in conversations. Only a few consider topic information, and then only in the language modality, while topic information in the acoustic and vision modalities is consistently neglected. To address this, we propose a model-agnostic Topic-enriched Diffusion (TopicDiff) approach for capturing multimodal topic information in MCE tasks. In particular, we integrate a diffusion model into a neural topic model to alleviate the neural topic model's diversity deficiency problem in capturing topic information. Detailed evaluations demonstrate significant improvements of TopicDiff over state-of-the-art MCE baselines, supporting both the importance of multimodal topic information to MCE and the effectiveness of TopicDiff in capturing it. Furthermore, we find that topic information in the acoustic and vision modalities is more discriminative and robust than topic information in the language modality.
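The abstract does not spell out how the diffusion model is combined with the neural topic model. As a rough illustration of the general idea only, the sketch below applies a standard DDPM-style forward noising step to a topic latent vector and maps the result to topic proportions with a softmax. The linear noise schedule, the latent size, the example values, and all function names are assumptions made for illustration. This is not the authors' implementation.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (an assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) for s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def diffuse_latent(z0, t, alpha_bars, rng):
    """Sample z_t from N(sqrt(alpha_bar_t) * z0, (1 - alpha_bar_t) * I)."""
    signal = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [signal * z + noise_scale * rng.gauss(0.0, 1.0) for z in z0]


def softmax(values):
    """Map a latent vector to topic proportions that sum to 1."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

# Hypothetical 4-dimensional topic latent for a single utterance.
z0 = [2.0, -1.0, 0.5, 0.0]

z_early = diffuse_latent(z0, 10, alpha_bars, rng)   # lightly noised latent
z_late = diffuse_latent(z0, 999, alpha_bars, rng)   # close to pure Gaussian noise
topics = softmax(z_early)
```

In a full diffusion model, a learned denoiser would reverse this noising process. Sampling latents through that reverse process is one plausible way to obtain more varied topic representations, but the precise mechanism used in TopicDiff is described in the paper itself, not in this abstract.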
