
"I'm in the Bluesky Tonight": Insights from a Year Worth of Social Data

Published 29 Apr 2024 in cs.SI and cs.CY | (2404.18984v1)

Abstract: Pollution of online social spaces caused by rampant dis- and misinformation is a growing societal concern. However, recent decisions to restrict access to social media APIs are causing a shortage of publicly available, recent social media data, hindering the advancement of computational social science as a whole. To address this pressing issue, we present a large, high-coverage dataset of social interactions and user-generated content from Bluesky Social. The dataset contains the complete post history of over 4M users (81% of all registered accounts), totalling 235M posts. We also make available social data covering follow, comment, repost, and quote interactions. Since Bluesky allows users to create and bookmark feed generators (i.e., content recommendation algorithms), we also release the full output of several popular algorithms available on the platform, along with their timestamped "like" interactions and time of bookmarking. This dataset allows unprecedented analysis of online behavior and human-machine engagement patterns. Notably, it provides ground-truth data for studying the effects of content exposure and self-selection, and for performing content virality and diffusion analysis.
