
What is a Social Media Bot? A Global Comparison of Bot and Human Characteristics

Published 1 Jan 2025 in cs.CY, cs.AI, and cs.SI | (2501.00855v2)

Abstract: Chatter on social media is 20% bots and 80% humans. Chatter by bots and humans is consistently different: bots tend to use linguistic cues that can be easily automated, while humans use cues that require dialogue understanding. Bots use words that match the identities they choose to present, while humans may send messages that are not related to the identities they present. Bots and humans differ in their communication structure: sampled bots have a star interaction structure, while sampled humans have a hierarchical structure. These conclusions are based on a large-scale analysis of social media tweets from ~200 million users across 7 events. Social media bots took the world by storm when social-cybersecurity researchers realized that social media users not only consisted of humans but also of artificial agents called bots. These bots wreak havoc online by spreading disinformation and manipulating narratives. Most research on bots is based on special-purpose definitions, mostly predicated on the event studied. This article begins by asking, "What is a bot?", and we study the underlying principles of how bots are different from humans. We develop a first-principle definition of a social media bot. With this definition as a premise, we systematically compare characteristics between bots and humans across global events, and reflect on how the software-programmed bot is an Artificial Intelligence algorithm, and its potential for evolution as technology advances. Based on our results, we provide recommendations for the use and regulation of bots. Finally, we discuss open challenges and future directions: Detect, to systematically identify these automated and potentially evolving bots; Differentiate, to evaluate the goodness of the bot in terms of their content postings and relationship interactions; Disrupt, to moderate the impact of malicious bots.

Summary

  • The paper refines the definition of social media bots and, using the BotHunter algorithm, reveals that about 20% of users are automated.
  • The paper finds that bots use repetitive linguistic cues like excessive hashtags and mentions, contrasting with human users' varied content.
  • The paper illustrates that bots form star network structures for rapid dissemination, highlighting implications for regulatory strategies.

Analyzing the Distinctive Characteristics of Social Media Bots

The paper "What is a Social Media Bot? A Global Comparison of Bot and Human Characteristics" by Lynnette Hui Xian Ng and Kathleen M. Carley provides a comprehensive examination of the fundamental differences between social media bots and human users. This research is crucial in understanding the mechanics and roles of bots on social media platforms, especially in contexts where they are used to influence public discourse. The study employs a large dataset derived from numerous global events with over 200 million users and focuses on differentiating bots from humans through linguistic, identity, and network analysis.

The study begins by establishing a refined definition of social media bots built on the core components of social media platforms—users, content, and relationships. The authors define a social media bot as an automated account that engages in content creation, distribution, and relationship dynamics on social media. This definition is constructed to be independent of value judgments about the bots' actions, allowing for a more objective analysis of their characteristics.

Methodological Approach

The study employs the BotHunter algorithm to classify users within the dataset as either humans or bots. It then explores the differences between these two entities across several dimensions:

  1. Volume Analysis: The research reveals that roughly 20% of the users are bots, a finding that aligns with Elon Musk's assertion regarding the prevalence of bots on Twitter. Notably, an increase in bot activity is observed during politically charged events, such as the US elections, indicating bots' targeted use in such contexts.
  2. Linguistic Features: Bots rely on cues that are easy to automate: they use more hashtags and mentions, post at high volume, and employ abusive terms more frequently. In contrast, humans exhibit a more diverse linguistic profile, integrating positive sentiment and media elements that indicate higher cognitive processing.
  3. Self-Presentation of Identity: The study finds that bots often declare fewer identities compared to humans and are more likely to use profiles aligning with socio-political or racial/national stereotypes. This emphasizes the strategic design of bots to portray specific images to influence targeted audiences.
  4. Social Interaction Patterns: Bots typically exhibit a star network structure, with interactions radiating from a central account through few intermediaries, enhancing their ability to swiftly disseminate information. Human networks, on the other hand, show a hierarchical tree structure which, despite being less efficient for mass communication, facilitates deeper engagement through replies and quotes.
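The linguistic and structural contrasts above can be illustrated with a toy feature extractor. This is a minimal sketch, not the paper's actual BotHunter pipeline: the hashtag/mention rates are simple proxies for the "easily automated" cues, and Freeman degree centralization serves as one plausible way to quantify how star-like an account's interaction network is (a perfect star scores 1.0; chain- or tree-like structures score much lower).

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")

def linguistic_cues(tweets):
    """Average hashtags and mentions per tweet -- repetitive,
    easily automated cues associated with bot accounts."""
    n = len(tweets)
    tags = sum(len(HASHTAG.findall(t)) for t in tweets)
    ats = sum(len(MENTION.findall(t)) for t in tweets)
    return {"hashtags_per_tweet": tags / n, "mentions_per_tweet": ats / n}

def degree_centralization(edges):
    """Freeman degree centralization of an undirected interaction
    network: 1.0 for a perfect star (one hub reaches everyone),
    lower for hierarchical or chain-like structures."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    n = len(deg)
    if n < 3:
        return 0.0
    dmax = max(deg.values())
    return sum(dmax - d for d in deg.values()) / ((n - 1) * (n - 2))

# Hypothetical star: hub "b0" interacts with 5 accounts.
star = [("b0", f"u{i}") for i in range(5)]
# Hypothetical chain (a degenerate tree) of 6 accounts.
chain = [(f"h{i}", f"h{i+1}") for i in range(5)]

print(degree_centralization(star))   # 1.0
print(degree_centralization(chain))  # 0.1
```

In practice one would compute such features per account over its full tweet history and ego network; the paper's observed pattern would correspond to bots clustering toward high centralization and high hashtag/mention rates.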

Implications for Bot Utilization and Regulation

The authors suggest several recommendations for leveraging and regulating bots:

  • Amplification and Announcement: Bots can be valuable for tasks requiring high-frequency postings without human-like cognitive requirements, such as announcing alerts or distributing information during crises.
  • Human-Bot Interaction: Given the bots' capabilities to mimic specific identities and engage with human users, the creation of bots for educational or support purposes—where their structured interactions are beneficial—could be explored.
  • Regulatory Focus: The paper highlights the necessity for frameworks focusing on restricting bots' use of harmful language, thereby reducing the proliferation of toxic online content.

Future Directions and Challenges

Ng and Carley emphasize the need for improved bot detection techniques that can adapt to the rapidly evolving nature of bots, especially with advancements in AI which facilitate more sophisticated bot behaviors. They propose a tripartite framework—Detect, Differentiate, and Disrupt—as a guiding principle for future research to systematically identify, evaluate, and manage the influence of bots.

In conclusion, this research contributes to the nuanced understanding of social media bots and posits pathways to harness their potential for positive applications while mitigating their misuse. The insights provided establish a clearer demarcation between bot-driven and human-driven activities in digital discourse, forming a foundation for subsequent scholarly and regulatory efforts in social cybersecurity.
