
Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media

Published 18 Jul 2023 in cs.CL, cs.LG, cs.MM, and cs.SI | (2307.09312v4)

Abstract: We present the Multi-Modal Discussion Transformer (mDT), a novel method for detecting hate speech in online social networks such as Reddit discussions. In contrast to traditional comment-only methods, our approach to labelling a comment as hate speech involves a holistic analysis of text and images grounded in the discussion context. This is done by leveraging graph transformers to capture the contextual relationships in the discussion surrounding a comment and grounding the interwoven fusion layers that combine text and image embeddings instead of processing modalities separately. To evaluate our work, we present a new dataset, HatefulDiscussions, comprising complete multi-modal discussions from multiple online communities on Reddit. We compare the performance of our model to baselines that only process individual comments and conduct extensive ablation studies.


Summary

  • The paper introduces a Multi-Modal Discussion Transformer that integrates text, images, and graph transformers to enhance hate speech detection on platforms like Reddit.
  • It employs innovative modality fusion using shared bottlenecks and graph transformers with hierarchical spatial encoding to capture the complete discussion context.
  • Empirical results on the HatefulDiscussions dataset demonstrate improved precision, recall, and F1 scores compared to traditional text-only and context-aware models.

Multi-Modal Discussion Transformer: Integrating Text, Images, and Graph Transformers for Hate Speech Detection on Social Media

The paper "Multi-Modal Discussion Transformer: Integrating Text, Images, and Graph Transformers to Detect Hate Speech on Social Media" introduces a novel approach to the complex problem of hate speech detection on platforms such as Reddit. Traditional methods often focus solely on textual content, neglecting the contextual richness offered by multi-modal data, including images and the broader conversational context. The proposed Multi-Modal Discussion Transformer (mDT) addresses these limitations with a holistic method that integrates multiple modalities within the discussion threads of social media platforms.

Methodological Innovation

The mDT architecture stands out for its comprehensive integration of text and image data through a graph transformer framework. This is achieved via several key components:

  1. Modality Fusion: The paper presents an innovative mechanism utilizing shared modality bottlenecks, ensuring that both text and images compress information efficiently before fusing. This prevents the potential dilution of semantic content and optimizes multi-modal information exchange.
  2. Graph Transformer: Unlike traditional models that process each comment in isolation, mDT employs graph transformers to embed comments within the broader context of an entire discussion. This strategy effectively grounds the meaning of each comment in its conversational environment.
  3. Hierarchical Spatial Encoding: A novel feature of mDT is its hierarchical spatial encoding method, enhancing the model's ability to discern the structural relationships between comments in a tree-format discussion, such as those found on Reddit.
  4. Dataset Contributions: An integral aspect of this research is the introduction of the HatefulDiscussions dataset, comprising annotated multi-modal discussions sourced from various Reddit communities. This dataset serves as a benchmark for evaluating the mDT's performance in a real-world setting.
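The fusion and graph components above can be illustrated in a few lines. The following is a minimal NumPy sketch, not the paper's implementation: it uses shared bottleneck tokens to mediate text-image exchange (in the spirit of attention bottlenecks) and raw negative tree distance as a simplified stand-in for the learned hierarchical spatial bias; all function names, dimensions, and the single-head, weight-free attention are assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(q, k, v, bias=0.0):
    """Scaled dot-product attention with an optional additive bias on the scores."""
    scores = q @ k.T / np.sqrt(q.shape[-1]) + bias
    return softmax(scores) @ v

def bottleneck_fusion(text, image, bottleneck):
    """One round of bottleneck-mediated fusion: both modalities write compressed
    information into a small set of shared bottleneck tokens, then each modality
    reads the fused summary back instead of attending to the other directly."""
    pooled = np.vstack([text, image])
    b = attend(bottleneck, pooled, pooled)      # bottleneck gathers from both modalities
    text_out = attend(text, b, b)               # text reads only the fused summary
    image_out = attend(image, b, b)             # image reads only the fused summary
    return text_out, image_out, b

def tree_distance_bias(parent):
    """Negative pairwise path distance between comments in a discussion tree,
    usable as an additive attention bias: distant comments attend less.
    `parent[i]` is the index of comment i's parent, or None for the root."""
    n = len(parent)
    adj = {i: [] for i in range(n)}
    for i, p in enumerate(parent):
        if p is not None:
            adj[i].append(p)
            adj[p].append(i)
    dist = np.zeros((n, n))
    for s in range(n):                          # BFS from each comment
        seen, frontier = {s: 0}, [s]
        while frontier:
            nxt = []
            for u in frontier:
                for v in adj[u]:
                    if v not in seen:
                        seen[v] = seen[u] + 1
                        nxt.append(v)
            frontier = nxt
        for t, d in seen.items():
            dist[s, t] = d
    return -dist
```

In mDT the per-distance bias is learned rather than fixed, and fusion is repeated across interwoven layers, but the flow is the same: fuse modalities per comment, then attend across the discussion tree with a structure-aware bias.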

Empirical Assessment

The experiments conducted demonstrate the superiority of mDT over existing hate speech detection methods. Compared against text-only models like BERT-HateXplain and RoBERTa Dynabench, mDT showed improved precision, recall, and F1 scores, highlighting the efficacy of integrating images and contextual graph structures. Notably, mDT delivers a substantial improvement in F1 score over previous context-aware models such as the text-only Graphormer.
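The comparison above is stated in terms of precision, recall, and F1. As a quick reminder of how these are computed, here is a small sketch with made-up toy predictions (the numbers are illustrative only, not the paper's results):

```python
def prf1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = hate speech)."""
    tp = sum(t == p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy example: 2 true positives, 1 false positive, 1 false negative.
p, r, f = prf1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])  # each ≈ 0.667
```

F1, the harmonic mean of precision and recall, rewards models that avoid both over-flagging benign comments and missing hateful ones, which is why it is the headline metric in the comparison above.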

Implications and Future Directions

The integration of multi-modal data offers promising advancements in content moderation across online platforms. By considering both images and the interconnected nature of online discussions, the mDT enhances the nuanced detection of hate speech, which can vary significantly based on context. This capability not only mitigates false positives but also provides a tool for understanding community-specific language and behavior, such as reclaimed vernacular in marginalized communities.

Future work could extend this model to other platforms and explore additional modalities such as audio or video content. There is also potential for enriching models through named entity recognition, further enhancing contextual and real-world understanding. As AI continues to evolve, the framework proposed by mDT may aid in developing robust solutions that encapsulate the dynamic nature of digital communications.

In summary, the Multi-Modal Discussion Transformer presents a significant step forward in hate speech detection, directly addressing the constraints of past models by embracing the multi-faceted and interconnected nature of online discussions. This research not only contributes to computational methods but also holds potential social value in promoting healthier, inclusive online communities.
