Bans vs. Warning Labels: Examining Bystanders' Support for Community-wide Moderation Interventions

Published 21 Jul 2023 in cs.HC | (2307.11880v3)

Abstract: Social media platforms like Facebook and Reddit host thousands of user-governed online communities. These platforms sanction communities that frequently violate platform policies; however, public perceptions of such sanctions remain unclear. In a pre-registered survey conducted in the US, I explore bystander perceptions of content moderation for communities that frequently feature hate speech, violent content, and sexually explicit content. Two community-wide moderation interventions are tested: (1) community bans, where all community posts are removed, and (2) community warning labels, where an interstitial warning label precedes access. I examine how third-person effects and support for free speech influence user approval of these interventions on any platform. My regression analyses show that presumed effects on others are a significant predictor of backing for both interventions, while free speech beliefs significantly influence participants' inclination for using warning labels. Analyzing the open-ended responses, I find that community-wide bans are often perceived as too coarse, and users instead value sanctions in proportion to the severity and type of infractions. I report on concerns that norm-violating communities could reinforce inappropriate behaviors and show how users' choice of sanctions is influenced by their perceived effectiveness. I discuss the implications of these results for HCI research on online harms and content moderation.