The role of interface design on prompt-mediated creativity in Generative AI

Published 30 Nov 2023 in cs.CY, cs.HC, and physics.soc-ph | (2312.00233v2)

Abstract: Generative AI for the creation of images is becoming a staple in the toolkit of digital artists and visual designers. The interaction with these systems is mediated by prompting, a process in which users write a short text to describe the desired image's content and style. The study of prompts offers an unprecedented opportunity to gain insight into the process of human creativity. Yet, our understanding of how people use them remains limited. We analyze more than 145,000 prompts from the logs of two Generative AI platforms (Stable Diffusion and Pick-a-Pic) to shed light on how people explore new concepts over time, and how their exploration might be influenced by different design choices in human-computer interfaces to Generative AI. We find that users exhibit a tendency towards exploration of new topics over exploitation of concepts visited previously. However, a comparative analysis of the two platforms, which differ both in scope and functionalities, reveals some stark differences. Features diverting user focus from prompting and providing instead shortcuts for quickly generating image variants are associated with a considerable reduction in both exploration of novel concepts and detail in the submitted prompts. These results carry direct implications for the design of human interfaces to Generative AI and raise new questions regarding how the process of prompting should be aided in ways that best support creativity.

Summary

The paper demonstrates that interface design significantly influences creative prompt evolution and exploration, as evidenced by differences between Stable Diffusion and Pick-a-Pic.
Methodologically, the study employs text (Jaccard index) and image (DinoV2 embeddings) similarity analyses to quantify user behavior across 145,000+ prompts.
Results show that interfaces enabling rapid variant generation without new prompts are associated with lower prompt uniqueness and less exploration of new topics.

Introduction

This paper presents an in-depth analysis of the impact of interface design on creativity facilitated by generative AI tools, specifically focusing on image generation through text prompts. The researchers analyzed over 145,000 prompts from two generative AI platforms, Stable Diffusion and Pick-a-Pic, to explore how users engage in creativity through prompting and how different interface features affect this process. The study highlights a tendency among users towards exploration of new topics as opposed to exploiting previously visited concepts. However, certain interface features that simplify the generation of image variants without the need for new prompts can significantly reduce both exploration and the level of detail in user-submitted prompts.

Figure 1: User interface of Pick-a-Pic (left) and Stable Diffusion (right).

Dataset

The study examines two platforms with distinct interface designs. On Stable Diffusion, users submit prompts through a Discord server. Pick-a-Pic presents pairs of image variants for the user to choose between and allows repeated submissions without modifying the prompt. The dataset for Stable Diffusion includes 1.5 million prompts collected from its Discord server, while Pick-a-Pic's dataset comprises almost 78,000 prompts. Both datasets record user interaction information, such as prompt sequences, which is used to analyze exploration versus exploitation behavior.

Methods

The paper employs text and image similarity analyses to quantify exploration and exploitation in user interactions. Textual prompts are compared using the Jaccard index, while image similarity is assessed via embeddings generated from DinoV2, a visual foundation model. To identify topical transitions, the study builds a similarity matrix over each user's prompt sequence. The detected transitions are then used to estimate the probability of a topic change, which serves as the exploration-exploitation metric.
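The two similarity measures can be illustrated with a minimal sketch. Several details here are assumptions rather than the paper's procedure: prompts are tokenized by lowercasing and splitting on whitespace, image embeddings are compared with cosine similarity (the summary does not name the metric applied to the DinoV2 embeddings), and the vectors in the usage note are toy values, not real embeddings.

```python
import math


def jaccard(prompt_a: str, prompt_b: str) -> float:
    """Jaccard index between the word sets of two prompts.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    tokens_a = set(prompt_a.lower().split())
    tokens_b = set(prompt_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # convention chosen here: two empty prompts are identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def cosine_similarity(u, v) -> float:
    """Cosine similarity between two equal-length embedding vectors.

    Assumption: cosine similarity is used only as a common choice for
    comparing embeddings; the source does not specify the metric.
    """
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # undefined for zero vectors; return 0 by convention
    return dot / (norm_u * norm_v)
```

For example, `jaccard("a red cat", "a blue cat")` shares two of four distinct words and evaluates to 0.5, while `cosine_similarity([1.0, 0.0], [0.0, 1.0])` is 0.0 for orthogonal vectors.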

Figure 2: To identify topical transitions in prompting, we calculate a similarity matrix between pairs of prompts sorted by their submission time (a), binarize the matrix (b), and identify blocks of highly-similar, consecutive prompts around the matrix diagonal (c).
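The three steps in the caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the threshold of 0.4 is arbitrary, the greedy rule for growing a block (a prompt joins the current block only if it is similar to every prompt already in it) is one possible reading of "blocks of highly-similar, consecutive prompts", and the topic-change probability is taken as the share of consecutive prompt pairs that cross a block boundary. All function names and the example prompts are hypothetical.

```python
def jaccard(prompt_a: str, prompt_b: str) -> float:
    """Jaccard index on lowercased, whitespace-separated word sets."""
    a, b = set(prompt_a.lower().split()), set(prompt_b.lower().split())
    return 1.0 if not (a or b) else len(a & b) / len(a | b)


def similarity_matrix(prompts):
    """Step (a): pairwise similarity of prompts sorted by submission time."""
    return [[jaccard(p, q) for q in prompts] for p in prompts]


def binarize(matrix, threshold=0.4):
    """Step (b): 1 where similarity reaches the (assumed) threshold, else 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in matrix]


def diagonal_blocks(binary):
    """Step (c): greedy blocks of consecutive, mutually similar prompts.

    Returns a list of (first_index, last_index) pairs, inclusive.
    """
    n = len(binary)
    if n == 0:
        return []
    blocks, start = [], 0
    for i in range(1, n):
        # Close the block if prompt i is dissimilar to any prompt in it.
        if not all(binary[i][j] for j in range(start, i)):
            blocks.append((start, i - 1))
            start = i
    blocks.append((start, n - 1))
    return blocks


def topic_change_probability(blocks, n_prompts):
    """Share of consecutive prompt pairs that start a new block."""
    if n_prompts < 2:
        return 0.0
    return (len(blocks) - 1) / (n_prompts - 1)


# Hypothetical prompt sequence from a single user, in submission order.
example_prompts = [
    "a castle on a hill",
    "a castle on a hill at sunset",
    "a castle on a hill at sunset in winter",
    "portrait of a robot",
    "portrait of a robot in the rain",
]
example_blocks = diagonal_blocks(binarize(similarity_matrix(example_prompts)))
```

On this toy sequence, `example_blocks` is `[(0, 2), (3, 4)]`: three castle prompts followed by two robot prompts, giving one topic change across four consecutive pairs, so `topic_change_probability(example_blocks, 5)` is 0.25.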

Results

Image and Prompt Similarity

In the user data, Pick-a-Pic shows lower prompt uniqueness and shorter prompts, which the study links to an interface design that emphasizes rapid image variant selection without new prompt input. Stable Diffusion users, by contrast, submit a wider variety of prompts, and their prompts grow longer over time, suggesting iterative learning and refinement.
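Prompt uniqueness and prompt length can be measured per user with a few lines of code. The sketch below assumes that two prompts count as identical after trimming whitespace and lowercasing, and that length is measured in whitespace-separated words; the summary does not state how the paper normalizes prompts or measures length, so both choices are illustrative.

```python
def cumulative_unique_counts(prompts):
    """Number of distinct prompts seen after each interaction.

    Assumption: prompts are compared after strip() and lower().
    """
    seen = set()
    counts = []
    for prompt in prompts:
        seen.add(prompt.strip().lower())
        counts.append(len(seen))
    return counts


def prompt_lengths(prompts):
    """Length of each prompt in whitespace-separated words (assumed measure)."""
    return [len(prompt.split()) for prompt in prompts]


# Hypothetical sequence in which a user resubmits the same prompt.
repeated_sequence = ["a red fox", "a red fox", "a red fox in snow", "a red fox"]
```

For `repeated_sequence`, `cumulative_unique_counts` returns `[1, 1, 2, 2]`: the curve stays flat whenever a prompt is resubmitted unchanged, which is the pattern an interface with one-click variant generation would be expected to produce. `prompt_lengths` returns `[3, 3, 5, 3]` for the same sequence.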

Figure 3: Distribution of number of prompts (a) and unique prompts (b) per user. Total number of unique prompts submitted after $n$ interactions (c).

Figure 4: Probability distributions of image similarity for: (a) all pairs of different prompts; (b) all pairs of identical prompts within the same prompt sequence.

Exploration vs. Exploitation

Stable Diffusion users show a greater propensity for exploration, with a majority exhibiting frequent topical transitions within their prompt sequences. Pick-a-Pic users tend to change topics less frequently and later in their prompt sequences, a pattern the study attributes to the design feature that allows repeated submissions of an identical prompt. The research highlights that the ease of generating new image variants without revising prompts can deter users from exploring diverse prompt topics.

Figure 5: Distribution of similarity of consecutive prompts (a) and consecutive images (b) in a user sequence.

Implications and Future Research

The findings call for a reevaluation of interface designs in generative AI systems, urging developers to consider how user interfaces can encourage active engagement with formulating prompts. Future research should extend the analysis to other platforms to validate these findings and examine further variables that may influence user creativity, such as demographic characteristics and domain expertise. Balancing ease of use against incentives for creative exploration also remains an open design problem.

Conclusion

This study underscores the significant role of interface design in shaping user behavior within generative AI environments. By highlighting differences in exploration-exploitation dynamics between Stable Diffusion and Pick-a-Pic, it provides actionable insights for designing interfaces that better support creative processes in AI-assisted artistic creation. Users benefit from systems that encourage diverse experimentation and learning through prompting, which ultimately enhances creative outcomes in AI-generated art.