How Do Communities of ML-Enabled Systems Smell? A Cross-Sectional Study on the Prevalence of Community Smells

Published 24 Apr 2025 in cs.SE | (2504.17419v1)

Abstract: Effective software development relies on managing both collaboration and technology, but socio-technical challenges can harm team dynamics and increase technical debt. Although teams working on ML-enabled systems are interdisciplinary, research has largely focused on technical issues, leaving their socio-technical dynamics underexplored. This study aims to address this gap by examining the prevalence, evolution, and interrelations of community smells in open-source ML projects. We conducted an empirical study on 188 repositories from the NICHE dataset using the CADOCS tool to identify and analyze community smells. Our analysis focused on their prevalence, interrelations, and temporal variations. We found that certain smells, such as Prima Donna Effects and Sharing Villainy, are more prevalent and fluctuate over time compared to others like Radio Silence or Organizational Skirmish. These insights might provide valuable support for ML project managers in addressing socio-technical issues and improving team coordination.


Summary

An Analysis of Community Smells in ML-Enabled Systems: Prevalence and Implications

The paper "How Do Communities of ML-Enabled Systems Smell? A Cross-Sectional Study on the Prevalence of Community Smells" examines the socio-technical dynamics of multidisciplinary teams working on machine learning (ML) projects. It argues that existing research concentrates on the technical debt of software development and leaves a gap in understanding the disruptive social and organizational patterns, referred to as "community smells," that arise around ML-enabled systems.

Study Methodology

The researchers analyzed 188 open-source ML projects drawn from the NICHE dataset, using the CADOCS tool to detect and measure community smells. The study combines cross-sectional and longitudinal analyses, which capture not only the prevalence of community smells but also their fluctuations over time and their interrelationships.

The community smells identified in this study are defined as disruptive social patterns that have the potential to destabilize team dynamics and lead to increased social or organizational debt. Notable among these are the Prima Donna Effect (PDE), Sharing Villainy (SV), and Solution Defiance (SD), each indicating various degrees of communication isolation and inefficiencies among team members.

Key Findings

Prevalence of Community Smells:
- The Prima Donna Effect was the most prevalent smell, detected in 92.6% of the projects. This points to a persistent problem in which dominant behavior by certain team members, such as data scientists, disrupts collaboration.
- Sharing Villainy (83.5%) and Solution Defiance (76.1%) further underscore systemic communication breakdowns and the emergence of conflicting subgroups.
- In contrast, Radio Silence (18.6%) and Organizational Skirmish (30.9%) were less prevalent, indicating that formal communication breakdowns and role-based conflicts were relatively infrequent.
Temporal Patterns:
- The longitudinal analysis found that while PDE remains a constant challenge, smells such as SV and SD decreased as teams matured, which may reflect adaptation and alignment in team dynamics. By contrast, Organizational Silo Effect (OSE) and Toxic Communication (TC) showed an increasing trend, suggesting that communication silos emerge as projects mature.
Relationships Among Community Smells:
- The analysis found strong positive associations between certain smells, for instance between PDE and OSE (POR = 4.34). A POR above 1 means the two smells appear together more often than they would if they were independent, which suggests interconnected social dynamics that may reinforce each other.
- Negative associations were also found, such as between Unhealthy Interaction and Organizational Skirmish (POR = 0.39). A POR below 1 means the two smells rarely appear together, which may indicate that some team dynamics mitigate others.
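
To make the prevalence and POR figures above easier to interpret, the following is a minimal Python sketch of how both quantities can be derived from per-repository detection results. It assumes that POR denotes the prevalence odds ratio in its standard cross-product form; this summary does not describe the paper's exact computation, so any corrections or significance tests it applies are not reflected here. The `detections` data and both function names are hypothetical and are not taken from the paper or from CADOCS.

```python
from typing import Dict, List

# Hypothetical detection results: one dict per repository, mapping a smell
# acronym to whether a detector flagged it. These values are invented for
# illustration and are NOT results from the paper.
detections: List[Dict[str, bool]] = [
    {"PDE": True, "OSE": True},
    {"PDE": True, "OSE": True},
    {"PDE": True, "OSE": False},
    {"PDE": False, "OSE": True},
    {"PDE": False, "OSE": False},
    {"PDE": False, "OSE": False},
]


def prevalence(rows: List[Dict[str, bool]], smell: str) -> float:
    """Share of repositories in which `smell` was detected."""
    return sum(r[smell] for r in rows) / len(rows)


def prevalence_odds_ratio(rows: List[Dict[str, bool]], a: str, b: str) -> float:
    """Cross-product odds ratio for smells `a` and `b`.

    Builds the 2x2 co-occurrence table and returns
    (n11 * n00) / (n10 * n01), where n11 counts repositories with both
    smells, n00 with neither, and n10 / n01 with exactly one of them.
    """
    n11 = sum(r[a] and r[b] for r in rows)
    n10 = sum(r[a] and not r[b] for r in rows)
    n01 = sum(not r[a] and r[b] for r in rows)
    n00 = sum(not r[a] and not r[b] for r in rows)
    if n10 == 0 or n01 == 0:
        # The ratio is undefined when an off-diagonal cell is empty.
        raise ValueError("POR undefined: an off-diagonal cell is zero")
    return (n11 * n00) / (n10 * n01)


print(prevalence(detections, "PDE"))                    # 0.5
print(prevalence_odds_ratio(detections, "PDE", "OSE"))  # 4.0
```

In this toy data the two smells co-occur in two repositories and are both absent in two others, so the ratio is above 1, the same direction as the PDE and OSE association reported above. A ratio below 1, as reported for Unhealthy Interaction and Organizational Skirmish, would arise when the off-diagonal cells dominate.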

Implications for Future Research and Practice

The findings have implications for both research and practice:

For Researchers: Further work is needed on how the distinct roles within ML-enabled teams contribute to community smells, and on strategies for realigning interdisciplinary workflows to curb socio-technical debt. The study lays a foundation for more granular qualitative studies or causal analyses of these socio-technical dynamics.
For Practitioners: Awareness and proactive management of community smells are essential to foster healthier, more collaborative team environments. Strategies such as balanced leadership roles, cross-discipline communication frameworks, and regular team assessments could help mitigate the most prevalent community smells identified in this study.

Conclusion

This paper examines the social dynamics of ML-enabled system projects and documents the community smells that appear in them. By describing how prevalent these smells are, how they change over time, and how they relate to one another, the study offers practical insights and a research direction for improving collaboration and reducing socio-technical debt in ML-enabled systems.