Assessing Large Language Models for Online Extremism Research: Identification, Explanation, and New Knowledge

Authors: Beidi Dong, Jin R. Lee, Ziwei Zhu, Balassubramanian Srinivasan

Abstract: The United States has experienced a significant increase in violent
extremism, prompting the need for automated tools to detect and limit the
spread of extremist ideology online. This study evaluates the performance of
Bidirectional Encoder Representations from Transformers (BERT) and Generative
Pre-Trained Transformers (GPT) in detecting and classifying online domestic
extremist posts. We collected social media posts containing “far-right” and
“far-left” ideological keywords and manually labeled them as extremist or
non-extremist. Extremist posts were further classified into one or more of five
contributing elements of extremism based on a working definitional framework.
The BERT model’s performance was evaluated based on training data size and
knowledge transfer between categories. We also compared the performance of GPT
3.5 and GPT 4 models using different prompts: naïve, layperson-definition,
role-playing, and professional-definition. Results showed that the
best-performing GPT models outperformed the best-performing BERT models, with
more detailed prompts generally yielding better results. However, overly
complex prompts may impair performance. Different versions of GPT showed
distinct sensitivities to what they consider extremist: GPT 3.5 performed
better at classifying far-left extremist posts, while GPT 4 performed better at
classifying far-right extremist posts. Large language models, exemplified here
by the GPT models, hold significant potential for online extremism
classification tasks, surpassing traditional BERT models in a zero-shot
setting. Future research should explore human-computer interaction in
optimizing GPT models for extremism detection and classification tasks to
develop more efficient (e.g., faster, lower-effort) and more effective (e.g.,
fewer errors) methods for identifying extremist content.
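
As a rough illustration of the zero-shot setup the abstract describes, the
sketch below sends the same post to a GPT model under the four prompt styles
(naïve, layperson-definition, role-playing, professional-definition) and
compares the labels. This is not the authors' code: the prompt wording, label
format, model identifiers, and example post are illustrative assumptions, and
the paper's actual prompts and five-element definitional framework are not
reproduced in the abstract.

```python
# Minimal sketch of zero-shot extremism classification with the OpenAI API.
# Assumes the `openai` package (>=1.0) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# Four prompt styles, from least to most detailed. The exact wording used in
# the study is not given in the abstract, so these are placeholders.
PROMPTS = {
    "naive": (
        "Is the following post extremist? "
        "Answer 'extremist' or 'non-extremist'."
    ),
    "layperson": (
        "Extremism loosely means advocating violence or hatred to advance an "
        "ideology. Is the following post extremist? "
        "Answer 'extremist' or 'non-extremist'."
    ),
    "role_playing": (
        "You are an analyst who studies online extremism. Is the following "
        "post extremist? Answer 'extremist' or 'non-extremist'."
    ),
    "professional": (
        "Using a working definition of extremism as content exhibiting one or "
        "more contributing elements of extremism, is the following post "
        "extremist? Answer 'extremist' or 'non-extremist'."
    ),
}

def classify(post: str, prompt_style: str, model: str = "gpt-4") -> str:
    """Return the model's one-word label for a single post."""
    response = client.chat.completions.create(
        model=model,        # e.g., "gpt-3.5-turbo" or "gpt-4"
        temperature=0,      # keep labeling as deterministic as possible
        messages=[
            {"role": "system", "content": PROMPTS[prompt_style]},
            {"role": "user", "content": post},
        ],
    )
    return response.choices[0].message.content.strip().lower()

# Usage: compare prompt styles on one (hypothetical) post.
post = "Example social media post collected with ideological keywords."
for style in PROMPTS:
    print(style, "->", classify(post, style))
```

Running the loop over a labeled corpus, once per model, would reproduce the
shape of the comparison the abstract reports: prompt detail versus accuracy,
and GPT 3.5 versus GPT 4 on far-left and far-right posts.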

Source: http://arxiv.org/abs/2408.16749v1
