Collaborating with AI Agents: Field Experiments on Teamwork, Productivity, and Performance

Published 23 Mar 2025 in cs.CY and cs.AI | (2503.18238v1)

Abstract: To uncover how AI agents change productivity, performance, and work processes, we introduce MindMeld: an experimentation platform enabling humans and AI agents to collaborate in integrative workspaces. In a large-scale marketing experiment on the platform, 2310 participants were randomly assigned to human-human and human-AI teams, with randomized AI personality traits. The teams exchanged 183,691 messages, and created 63,656 image edits, 1,960,095 ad copy edits, and 10,375 AI-generated images while producing 11,138 ads for a large think tank. Analysis of fine-grained communication, collaboration, and workflow logs revealed that collaborating with AI agents increased communication by 137% and allowed humans to focus 23% more on text and image content generation messaging and 20% less on direct text editing. Humans on Human-AI teams sent 23% fewer social messages, creating 60% greater productivity per worker and higher-quality ad copy. In contrast, human-human teams produced higher-quality images, suggesting that AI agents require fine-tuning for multimodal workflows. AI personality prompt randomization revealed that AI traits can complement human personalities to enhance collaboration. For example, conscientious humans paired with open AI agents improved image quality, while extroverted humans paired with conscientious AI agents reduced the quality of text, images, and clicks. In field tests of ad campaigns with ~5M impressions, ads with higher image quality produced by human collaborations and higher text quality produced by AI collaborations performed significantly better on click-through rate and cost per click metrics. Overall, ads created by human-AI teams performed similarly to those created by human-human teams. Together, these results suggest AI agents can improve teamwork and productivity, especially when tuned to complement human traits.


Summary

The paper demonstrates that human-AI teams achieve 60% greater per-worker productivity, with individuals on human-AI teams producing 70% more ads, while team-level ad output is similar to that of human-human teams.
The paper employs rigorous randomized experiments with 2310 participants and comprehensive log data to uncover distinct communication patterns and task dynamics.
The paper finds that tuning AI personality traits can complement human skills, enhancing ad quality and overall collaborative performance.

Human-AI Collaboration: Impacts on Teamwork and Performance

This research investigates the influence of AI agents on teamwork, productivity, and performance through controlled field experiments. The study introduces MindMeld, an innovative platform designed to facilitate real-time collaboration between humans and AI in integrative workspaces. By randomizing participants into human-human and human-AI teams, and by manipulating AI personality traits, the study explores how AI agents reshape collaboration dynamics and affect key outcomes. The research leverages detailed communication, collaboration, and workflow logs to provide insights into the nuances of human-AI interaction.

Experimental Design and Data Collection

The study employs a rigorous experimental design, randomizing 2310 participants into human-human and human-AI teams (Figure 1). AI agents are assigned personality profiles based on the Big Five traits, with each trait set to either a high or low level (Figure 1). The MindMeld platform (Figure 2) enables participants to collaborate on ad creation, exchanging messages and editing text and images in real-time. This setup generates a rich dataset, including 183,691 messages, 63,656 image edits, 1,960,095 ad copy edits, and 10,375 AI-generated images.

Figure 1: Overview of methods. (A) Participants are randomized into collaborating with another participant or an AI agent. (B) AI agents are assigned a personality profile based on Big Five traits, with each trait randomly set to either a high or low level. (C) Participants collaborate with another participant or an AI agent to produce ads in a real-time collaborative workspace.
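
The two-stage randomization in panels (A) and (B) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the assignment probability `p_ai`, the seed, the trait labels, and the `Assignment` record are all hypothetical, since the summary does not state how the allocation was carried out.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional

# Big Five trait names; the exact labels used in the study's prompts are an assumption.
BIG_FIVE = ("openness", "conscientiousness", "extraversion",
            "agreeableness", "neuroticism")


@dataclass
class Assignment:
    participant_id: int
    condition: str                              # "human-human" or "human-ai"
    ai_traits: Optional[Dict[str, str]] = None  # set only in the human-AI condition


def assign(participant_ids, p_ai: float = 0.5, seed: int = 0) -> List[Assignment]:
    """Randomize each participant to a condition, then randomize AI traits.

    p_ai is a hypothetical allocation probability, not a value from the paper.
    """
    rng = random.Random(seed)  # seeded so the allocation is reproducible
    assignments = []
    for pid in participant_ids:
        if rng.random() < p_ai:
            # Each trait is independently set to a high or low level.
            traits = {trait: rng.choice(("high", "low")) for trait in BIG_FIVE}
            assignments.append(Assignment(pid, "human-ai", traits))
        else:
            assignments.append(Assignment(pid, "human-human"))
    return assignments


assignments = assign(range(2310), seed=42)
```

Because each of the five traits takes one of two levels, this scheme yields 2^5 = 32 possible AI personality profiles. Human-human participants would still need to be paired with one another, which the sketch leaves out.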

Figure 2: The MindMeld platform. On the left is the task panel, and on the right is the chat panel. In the Human-Human condition, chat messages and edits on the task panel, including text edits, image selections, and AI image generations, are synchronized in real-time. In the Human-AI condition, the participant chats with an AI agent that has full context of the user interface (UI; see the paper's methods section on AI context), and the AI can edit text, select images, and generate AI images.

The research also includes a field evaluation, where ads created during the experiment are run on a social media platform, generating approximately 5 million impressions. Click-through rates, cost-per-click, and view-through rates are tracked to assess the real-world performance of the ads. Human and AI quality ratings of the ads are also collected, providing a comprehensive evaluation framework.
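
For readers unfamiliar with the field metrics, the sketch below computes click-through rate and cost per click using their conventional definitions (clicks divided by impressions, and spend divided by clicks). The function names and the example numbers are hypothetical and are not taken from the study's data.

```python
from typing import Optional


def click_through_rate(clicks: int, impressions: int) -> float:
    """Conventional CTR: clicks divided by impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


def cost_per_click(spend: float, clicks: int) -> Optional[float]:
    """Conventional CPC: spend divided by clicks; None when an ad has no clicks."""
    if clicks <= 0:
        return None
    return spend / clicks


# Hypothetical numbers for a single ad, not results from the experiment.
example_ctr = click_through_rate(clicks=150, impressions=10_000)
example_cpc = cost_per_click(spend=75.0, clicks=150)
```

A higher click-through rate and a lower cost per click both indicate a better-performing ad, which is the sense in which the field results below are reported.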

Teamwork Dynamics and Communication Patterns

The study reveals distinct communication patterns between human-human and human-AI teams (Figure 3). Human-AI teams exhibit increased communication, with 45% more messages exchanged compared to human-human teams. These messages are more content- and process-oriented, focusing on suggestions, instructions, and planning. Conversely, human-human teams send more social and emotional messages, emphasizing rapport building and self-assessment. This shift suggests that AI agents reduce social coordination costs, enabling humans to focus more on task-related communication.

Figure 3: Participants in Human-AI teams send more process- and content-related messages while those in Human-Human teams send more social and emotional messages.

Furthermore, human-AI teams make 84% fewer edits to the ad copy, indicating the LLM's proficiency in writing high-quality ad copy. However, LLMs are less effective at predicting image quality, leading to lower-rated images produced by human-AI teams. This highlights the importance of fine-tuning AI agents for multimodal workflows.

Productivity and Performance Outcomes

The research demonstrates that human-AI teams achieve comparable productivity to human-human teams, but with 60% greater productivity per worker. While team-level ad submissions are similar across conditions, individuals in human-AI teams produce 70% more ads. This suggests that AI collaboration enhances individual productivity by enabling participants to generate more outputs. Additionally, participants in human-AI teams exhibit higher completion rates for ad copy elements (Figure 4), indicating that AI support benefits even lower-performing individuals.

Figure 4: Copy completion rates.
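
The difference between team-level and per-worker productivity comes down to simple arithmetic: a human-human team divides its output across two human workers, while a human-AI team has one. The sketch below illustrates this with hypothetical ad counts chosen only to make the calculation concrete; they are not figures from the paper, and the helper names are assumptions.

```python
def per_worker_output(team_ads: float, humans_on_team: int) -> float:
    """Ads produced per human worker on a team."""
    if humans_on_team <= 0:
        raise ValueError("a team needs at least one human")
    return team_ads / humans_on_team


def relative_lift(treatment: float, baseline: float) -> float:
    """Fractional increase of treatment over baseline."""
    return (treatment - baseline) / baseline


# Hypothetical team-level outputs, not values reported in the study.
human_human = per_worker_output(team_ads=10, humans_on_team=2)  # two humans share the output
human_ai = per_worker_output(team_ads=8, humans_on_team=1)      # one human plus an AI agent
lift = relative_lift(human_ai, human_human)
```

In this illustration the human-AI team produces somewhat fewer ads in total, yet its single human worker accounts for 60% more output than each worker on the human-human team.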

Human evaluations of ad quality reveal that human-AI teams produce higher-quality text but lower-quality images (Figure 5). AI evaluations align with these findings, rating text quality higher for ads from human-AI teams. This divergence highlights a trade-off: while AI enhances text-focused tasks, its contributions to multimodal outputs require complementary tools for image-related tasks. Field tests show that ads with higher image quality, produced by human collaborations, and higher text quality, produced by AI collaborations, perform significantly better on click-through rate and cost-per-click metrics. Overall, ads created by human-AI teams perform similarly to those created by human-human teams.

Figure 5: AI and Human ratings of ads.

Influence of AI Personality

The study also investigates the impact of AI personality traits on collaboration outcomes. By randomizing AI prompts to induce high or low levels of Big Five personality traits, the research reveals that AI traits can complement human personalities to enhance collaboration. For example, conscientious humans paired with open AI agents improve image quality, while extroverted humans paired with conscientious AI agents reduce the quality of text, images, and clicks. These findings suggest that tuning AI agents to complement human traits can improve productivity and performance.

Implications and Future Directions

This research provides valuable insights into the dynamics of human-AI collaboration, highlighting the potential of AI agents to improve teamwork and productivity, particularly when aligned with human traits. The study underscores the importance of designing AI agents not only for efficiency but also for complementarity with human workflows. The finding that AI involvement reduces social coordination costs and enables participants to focus more on content is particularly relevant for organizations aiming to enhance team performance.

The study suggests several avenues for future research. Longitudinal studies could examine the long-term effects of human-AI collaboration, exploring the development of trust and the potential for "learning effects." Future work could also investigate how different dimensions of AI behavior, such as creativity or leadership style, influence collaboration outcomes. Additionally, exploring human-AI collaboration in diverse task domains, such as software development or data analysis, could reveal domain-specific dynamics and inform the broader design of AI systems.

Conclusion

This study advances the understanding of how AI agents reshape teamwork, productivity, and performance in collaborative settings. The findings demonstrate that human-AI teams communicate more, focus on task-related content, and achieve higher individual productivity compared to human-human teams. While AI enhances text quality, it introduces trade-offs in multimodal outputs like images. The results highlight the transformative potential of AI agents in collaborative workflows and the importance of aligning their design with human traits and task requirements.
