VIBES: Exploring Viewer Spatial Interactions as Direct Input for Livestreamed Content

Published 12 Apr 2025 in cs.HC | (2504.09016v1)

Abstract: Livestreaming has rapidly become a popular online pastime, with real-time interaction between streamer and viewer being a key motivating feature. However, viewers have traditionally had limited opportunity to directly influence the streamed content; even when such interactions are possible, it has been reliant on text-based chat. We investigate the potential of spatial interaction on the livestreamed video content as a form of direct, real-time input for livestreamed applications. We developed VIBES, a flexible digital system that registers viewers' mouse interactions on the streamed video, i.e., clicks or movements, and transmits it directly into the streamed application. We used VIBES as a technology probe; first designing possible demonstrative interactions and using these interactions to explore streamers' perception of viewer influence and possible challenges and opportunities. We then deployed applications built using VIBES in two livestreams to explore its effects on audience engagement and investigate their relationships with the stream, the streamer, and fellow audience members. The use of spatial interactions enhances engagement and participation and opens up new avenues for both streamer-viewer and viewer-viewer participation. We contextualize our findings around a broader understanding of motivations and engagement in livestreaming, and we propose design guidelines and extensions for future research.


Authors (2)

Summary

The paper introduces VIBES, a system that captures viewer mouse events to enable direct, real-time interaction with livestreamed content.
It details an architecture integrating browser extensions, Twitch modules, a websocket server, and a streamer application for bi-directional communication.
Deployment studies show enhanced viewer engagement while highlighting challenges such as latency and moderation of spatial inputs.

Introduction

Livestreaming platforms such as Twitch and YouTube Live have changed how content is consumed online, letting streamers engage with global audiences in real time. Despite the interactive nature of these platforms, viewer participation has largely been mediated through text-based chat, which limits how expressively and how quickly viewers can influence the streamed content. "VIBES: Exploring Viewer Spatial Interactions as Direct Input for Livestreamed Content" addresses this limitation by introducing a system that captures spatial interaction (specifically mouse events) directly from viewers and uses it to influence livestreamed content in real time.

VIBES Architecture and Implementation

VIBES is structured around four core components that enable viewer interaction through mouse events:

Browser Extension: Captures mouse clicks and movements on the video player and records each event's type and its spatial coordinates relative to the streamed video content.
Twitch Extension: Embedded as an iframe, it augments the mouse events with additional contextual data, allowing viewers to pass enriched information via standard web interactions.
Websocket Server: Hosted in the cloud, this server receives data from client-side interactions and forwards it to the streamer's application, so that each viewer action reaches the application as input.
Streamer Application: Processes the received data to enact changes within the livestreamed application, and can optionally send data back to viewers to adapt the Twitch extension dynamically.
Figure 1: Simplified architecture diagram of VIBES. The browser extension captures a viewer's mouse input; the Twitch extension provides added expressiveness to the spatial data.

The bi-directional nature of VIBES's architecture allows for versatile interaction dynamics, facilitating novel viewer participation modes.
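As a rough illustration of how a mouse event might travel through these components, the following Python sketch normalizes a click to the video player's rectangle, serializes it, and maps it onto the streamed application's resolution. The `SpatialEvent` fields, the JSON layout, and all function names are assumptions made for this example; the summary does not specify VIBES's actual message format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SpatialEvent:
    viewer_id: str  # hypothetical identifier for the viewer
    kind: str       # "click" or "move"
    x: float        # horizontal position, 0.0 (left) to 1.0 (right)
    y: float        # vertical position, 0.0 (top) to 1.0 (bottom)
    extra: dict     # contextual data, e.g. added by the Twitch extension


def normalize(px, py, left, top, width, height):
    """Convert page-pixel coordinates to fractions of the video player rectangle."""
    if width <= 0 or height <= 0:
        raise ValueError("video player has no area")
    nx = (px - left) / width
    ny = (py - top) / height
    # Clamp so events on the player border never fall outside the frame.
    return min(max(nx, 0.0), 1.0), min(max(ny, 0.0), 1.0)


def encode(event):
    """Serialize an event to the JSON text a websocket message could carry."""
    return json.dumps(asdict(event))


def to_app_pixels(message, app_width, app_height):
    """Streamer side: decode a message and map it onto the application's resolution."""
    data = json.loads(message)
    px = round(data["x"] * app_width)
    py = round(data["y"] * app_height)
    return data["kind"], px, py, data["extra"]
```

Normalizing on the viewer side means the streamer application does not need to know the size of each viewer's player; a click in the centre of any player maps to the centre of the application window.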

Demonstrative Interactions

The potential of VIBES is exemplified through demonstrative applications designed to showcase its capabilities:

Collaborative Canvas: Viewers can collaboratively draw on a shared canvas with the streamer, utilizing the Twitch extension for color selection and real-time input visualization.
Game Augmentations: Viewers can spawn enemies or items within the streamer's game, influencing gameplay directly and dynamically.
Figure 2: Viewers collaboratively drawing on a shared canvas and spawning enemies in a game using VIBES.

These interactions highlight how spatial input can enrich viewer engagement beyond text-based communications, fostering collaborative and competitive environments.
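To make the collaborative canvas idea concrete, here is a minimal sketch of how a streamer-side application could apply viewer clicks to a shared drawing surface. It assumes clicks arrive as normalized coordinates in [0, 1] together with a color chosen by the viewer; the `SharedCanvas` class and its grid representation are hypothetical and not taken from the paper.

```python
class SharedCanvas:
    """A coarse grid of colored cells that viewers paint via normalized clicks."""

    def __init__(self, columns, rows, background="#ffffff"):
        self.columns = columns
        self.rows = rows
        self.cells = [[background] * columns for _ in range(rows)]

    def cell_at(self, x, y):
        """Map normalized (x, y) in [0, 1] to a (row, column) index."""
        # min() keeps x == 1.0 or y == 1.0 inside the last column or row.
        col = min(int(x * self.columns), self.columns - 1)
        row = min(int(y * self.rows), self.rows - 1)
        return row, col

    def paint(self, x, y, color):
        """Color the cell under the click and return its (row, column)."""
        row, col = self.cell_at(x, y)
        self.cells[row][col] = color
        return row, col
```

The same mapping could drive the game augmentation example: instead of coloring a cell, the application would spawn an enemy or item at the resolved position.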

Formative Exploration with Streamers

To probe the practical implications of VIBES, the authors interviewed streamers. The streamers expected VIBES to enhance audience engagement by letting viewers take a more active part in the streamed experience. They also highlighted how spatial interaction could foster community building and redefine the viewer-streamer relationship by sharing control between both parties.

The influence of spatial input on streamer autonomy emerged as both a challenge and an opportunity. Streamers were concerned that viewer actions might not align with the content they intend to produce, which suggests a need for moderation mechanisms as interactions scale.
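One simple moderation mechanism that could address this concern is a per-viewer rate limit on spatial inputs. The sketch below uses a sliding time window; it is an illustrative assumption, not a mechanism described for VIBES, and the `ViewerRateLimiter` name and its parameters are hypothetical.

```python
from collections import defaultdict, deque


class ViewerRateLimiter:
    """Allows each viewer at most max_events inputs per sliding window."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # viewer_id -> accepted timestamps

    def allow(self, viewer_id, now):
        """Return True and record the event if the viewer is under the limit."""
        history = self._history[viewer_id]
        # Drop timestamps that have aged out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_events:
            return False
        history.append(now)
        return True
```

A streamer application could call `allow` before applying each event and silently drop rejected ones, which keeps a single viewer from flooding the stream while leaving other viewers unaffected.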

Viewer Application Investigation

Subsequent deployment studies with audience participation revealed that VIBES significantly increased feelings of enjoyment and engagement among viewers. Reported social engagement, participation, and mood all improved, suggesting that spatial interaction enriches the livestreaming experience beyond what traditional chat offers.

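The study also surfaced challenges such as latency: because viewers see the stream with a delay, their inputs refer to a moment that has already passed in the application. This points to a need for responsive feedback mechanisms that maintain viewer agency and mitigate the impact of these time-shifted perspectives.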
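One possible way to handle time-shifted input, sketched here purely as an assumption, is for the streamed application to keep a short history of its state and resolve each incoming event against the snapshot the viewer was actually watching. The `StateHistory` class, the notion of a known `stream_delay`, and the snapshot format are all hypothetical; the summary does not describe how VIBES deals with latency internally.

```python
import bisect


class StateHistory:
    """Keeps timestamped snapshots so an input can be resolved against a past frame."""

    def __init__(self, max_age_seconds):
        self.max_age_seconds = max_age_seconds
        self._times = []
        self._states = []

    def record(self, timestamp, state):
        """Store a snapshot; timestamps must be non-decreasing."""
        self._times.append(timestamp)
        self._states.append(state)
        # Discard snapshots older than the retention window.
        cutoff = timestamp - self.max_age_seconds
        keep_from = bisect.bisect_left(self._times, cutoff)
        del self._times[:keep_from]
        del self._states[:keep_from]

    def state_seen_by_viewer(self, arrival_time, stream_delay):
        """Return the latest snapshot at or before the moment the viewer was watching."""
        seen_at = arrival_time - stream_delay
        index = bisect.bisect_right(self._times, seen_at) - 1
        if index < 0:
            return None  # the frame is older than anything retained
        return self._states[index]
```

In practice the delay varies per viewer and is hard to measure exactly, so a lookup like this would at best approximate what the viewer saw; immediate visual feedback on the viewer's side remains important either way.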

Figure 3: Images from the livestream session showing viewer interactions such as trap placement and game upgrades.

Discussion

The practical deployment of VIBES as a technology probe demonstrates the transformative potential of spatial interactions in livestreaming environments. It opens avenues for both collaborative and adversarial engagement scenarios, bridging the gap between spectator and participant roles in live content.

Future research directions include refining latency handling, expanding integration with diverse content platforms beyond gaming, and developing robust moderation systems to manage interactions at scale. VIBES contributes to the broader discourse on enhancing viewer expressiveness and engagement through innovative input modalities.

Conclusion

VIBES represents a significant advancement in the livestreaming ecosystem, transforming passive viewers into active participants capable of influencing content in real-time. While current implementations focus on gaming contexts, the principles underlying VIBES can extend to various domains, including education and cultural exchange. By leveraging the innate expressiveness of spatial interactions, VIBES paves the way for richer, more inclusive interactive media experiences.
