A Survey of Multimodal Sarcasm Detection
Abstract: Sarcasm is a rhetorical device used to convey the opposite of an utterance's literal meaning. It is widely used on social media and in other forms of computer-mediated communication, which motivates computational models that identify it automatically. Although the clear majority of approaches to sarcasm detection operate on text only, detecting sarcasm often requires additional information carried by tonality, facial expression, and contextual images. This has led to the introduction of multimodal models, which make it possible to detect sarcasm across multiple modalities such as audio, images, text, and video. In this paper, we present the first comprehensive survey to date of multimodal sarcasm detection (henceforth MSD). We survey papers on the topic published between 2018 and 2023 and discuss the models and datasets used for the task. We also present future research directions in MSD.