- The paper demonstrates that GPT models outperform BERT in classifying extremist content, especially when labeled training data are sparse.
- The study uses both binary and multi-class classification tasks and evaluates the impact of diverse prompt engineering techniques on model performance.
- The paper finds version-specific strengths, with GPT-3.5 excelling in far-left contexts and GPT-4 in far-right scenarios, suggesting that model versions can be matched to the ideological focus of an analysis.
Evaluating LLMs for Detecting Online Extremism
The paper "Assessing LLMs for Online Extremism Research: Identification, Explanation, and New Knowledge" by Beidi Dong et al. provides an analytical comparison of Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer (GPT) models in classifying online domestic extremist content. Using a dataset of social media posts containing ideologically charged keywords, the study assesses how well these models distinguish extremist from non-extremist content and categorize specific elements of extremism. The research addresses the growing challenge of online extremism in the United States, where violent extremist incidents continue to increase.
Methodology
The study investigates the performance of BERT and GPT models using distinct methodologies. While BERT operates within a supervised learning paradigm, relying heavily on labeled training data, GPT models employ a zero-shot approach, drawing on pre-trained knowledge to execute classification tasks without explicit labeled input. The binary classification task involves distinguishing extremist posts from non-extremist content, while the multi-class classification task aims to identify specific elements within extremist content, such as direct threats, advocacy of violence, or propagation of prejudice.
The research also examines the impact of various prompt engineering techniques on GPT models, including prompts built on simple instructions, layperson definitions, role-playing scenarios, and comprehensive professional definitions. This exploration provides insights into the optimal formulation of prompts to achieve maximal model performance.
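The four prompt styles compared in the study can be sketched as templates. The wording of each template below is an illustrative assumption, not the authors' exact prompts; only the four style categories come from the paper.

```python
# Illustrative templates for the four prompt styles named in the study.
# All wording here is assumed for demonstration purposes.

PROMPT_STYLES = {
    "simple_instruction": (
        "Is the following post extremist? Answer YES or NO.\n\nPost: {post}"
    ),
    "layperson_definition": (
        "Extremism means promoting violence or hatred toward a group. "
        "Given that definition, is the following post extremist? "
        "Answer YES or NO.\n\nPost: {post}"
    ),
    "role_playing": (
        "You are an analyst who monitors online extremism for a research "
        "team. Is the following post extremist? Answer YES or NO.\n\n"
        "Post: {post}"
    ),
    "professional_definition": (
        "Domestic violent extremism involves threats or acts of violence in "
        "furtherance of ideological goals, including direct threats, "
        "advocacy of violence, and propagation of prejudice. Is the "
        "following post extremist? Answer YES or NO.\n\nPost: {post}"
    ),
}

def build_prompt(style: str, post: str) -> str:
    """Fill the chosen style's template with the post under evaluation."""
    return PROMPT_STYLES[style].format(post=post)

print(build_prompt("role_playing", "Example post text"))
```

Holding the post and answer format fixed while varying only the framing, as above, is what lets a study attribute performance differences to the prompt style itself.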
Results
The findings reveal that while BERT models perform competently when given a substantial volume of training data, their efficacy drops sharply when labeled data are scarce, underscoring their dependence on large amounts of high-quality labeled input. This shortcoming is most pronounced on novel classification tasks for which little training data exist. Conversely, GPT models handle both binary and multi-class classification without task-specific training data. Notably, GPT models surpass BERT in classifying extremist content, demonstrating their potential for real-world settings where labeled data are often sparse.
The study observes that performance varies with prompt engineering: detail-rich prompts, particularly those assigning role-based instructions, typically yield better classification outcomes. However, overly detailed prompts can overwhelm LLMs and impair accuracy, suggesting that a nuanced balance of guidance is optimal.
Furthermore, the paper identifies version-specific sensitivities in GPT models: GPT-3.5 excels at classifying far-left extremist posts, whereas GPT-4 performs markedly better in far-right extremist contexts. This difference underscores the potential value of choosing an LLM version based on the ideological focus of the analysis.
Implications and Future Directions
This research elucidates the comparative strengths of LLMs such as GPT over traditional models such as BERT in online extremism classification, highlighting the potential of LLMs to augment automated detection tools. These findings matter for practitioners who need efficient, rapid ways to identify extremist content amid the high volume and noise of social media.
The study suggests that future work should investigate context-aware prompt engineering, refining prompt formulations to balance clarity and depth and to align them with classification objectives. A deeper understanding of version-based sensitivities in LLMs could also yield improvements tailored to the ideological spectrums of extremism prevalent on different platforms.
In conclusion, Dong et al.'s work advances the field of online extremism detection and classification, demonstrating the transformative potential of LLMs in addressing complex linguistic and ideologically varied content without the constraints imposed by traditional supervised learning frameworks. Their research lays the groundwork for developing enhanced, deployable AI models capable of effectively curbing the influence of online extremist ideologies.