- The paper presents a systematic mapping of fairness definitions and evaluation metrics in multi-agent systems within software engineering.
- It identifies key harm clusters (representational, quality-of-service, security, and governance harms) that emerge across stages of the software development lifecycle (SDLC).
- The study reveals significant methodological gaps and calls for developing MAS-aware benchmarks and robust mitigation strategies.
Fairness in Multi-Agent Systems for Software Engineering: A Rapid Review
Overview and Motivation
The increasing prevalence of transformer-based LLMs and multi-agent systems (MAS) in the software development lifecycle (SDLC) introduces distinct fairness, bias, and accountability challenges. As LLM-powered agents move from isolated assistant tools to collaborative, autonomous architectures embedded throughout coding, testing, deployment, and operations, the potential for interaction-driven inequities grows. This rapid review systematically maps the fragmented and heterogeneous research on fairness in MAS, with a particular focus on LLM-based approaches relevant to software engineering workflows. By screening 350 studies and synthesizing findings from 18, the review characterizes current conceptualizations of fairness, empirical evaluation practices, and observed harms, and it identifies widespread methodological and empirical gaps that hinder reliable, deployable, and governance-compatible MAS in real-world SDLC contexts.
Figure 1: Study selection flow for the rapid review.
Fairness: Definitions and Measurement Taxonomy
The analysis identifies three primary, mostly non-overlapping, strands in how fairness is articulated within the MAS literature:
- Bias Reduction: Most commonly, fairness is framed as social bias minimization, targeting stereotyping and disparate treatment across attributes such as gender, race, and political ideology. Typical evaluations use benchmark-based predictive metrics (e.g., BBQ, MALIBU) and group demographic disparity statistics. However, these evaluations are usually decoupled from the unique interactive and coordination dynamics characterizing MAS.
- Normative, Governance, and Trustworthy AI: Another line anchors fairness in regulatory (e.g., EU AI Act, ISO/IEC 42001:2023) and ethics-oriented principles, extending from explainability and accountability to robust oversight and non-discrimination in agentic human-AI collaboration. While these frameworks promote trustworthy integration, their lack of operationalized, context-specific metrics for MAS impedes empirical verification.
- Interactional Dynamics: A distinct thread regards fairness as an emergent property of inter-agent processes, such as consensus formation, task decomposition, or responsibility allocation. Here, fairness failures arise procedurally: group conformity, bias amplification, minority suppression, and collective misalignment are diagnosed using MAS-specific behavioral metrics rather than classical outcome-based disparities.
The heterogeneity across these conceptual frameworks results in incompatible evaluation protocols. Benchmark and group-disparity methods (measurement categories C1/C2) typically overlook the multi-agent interaction effects that drive real deployment risks, while MAS-specific procedural metrics (C3) are rarely linked to downstream demographic impact. Governance studies (C4) seldom connect their normative claims to measurable system behavior.
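The mismatch between outcome-level and procedural measurement can be illustrated with a minimal sketch. It is not taken from the reviewed studies: the function names, the toy logs, and the "majority wins" proxy for minority suppression are all assumptions chosen for illustration. The first function computes a simple group-disparity statistic (the gap in favorable-outcome rates across groups); the second computes how often the initial majority position prevails in contested multi-agent debates.

```python
from collections import Counter, defaultdict


def demographic_parity_difference(outcomes):
    """Gap between the highest and lowest favorable-outcome rate across groups.

    `outcomes` is a list of (group, favorable) pairs. Hypothetical helper.
    """
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, ok in outcomes:
        totals[group] += 1
        favorable[group] += int(ok)
    rates = {g: favorable[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values())


def minority_suppression_rate(debates):
    """Crude procedural proxy (an assumption, not a published metric).

    `debates` is a list of (initial_positions, final_decision) pairs.
    Among debates with a strict initial majority and at least one dissenter,
    return the fraction in which the final decision equals the majority view.
    """
    contested = 0
    suppressed = 0
    for initial_positions, final_decision in debates:
        counts = Counter(initial_positions).most_common()
        if len(counts) < 2:
            continue  # unanimous start: nothing to suppress
        if counts[0][1] == counts[1][1]:
            continue  # tie: no clear majority
        contested += 1
        if final_decision == counts[0][0]:
            suppressed += 1
    return suppressed / contested if contested else 0.0


# Toy logs (invented): both groups receive favorable outcomes at the same rate,
# yet every contested debate ends with the initial majority position.
outcomes = [("A", True), ("A", True), ("A", False), ("A", True),
            ("B", True), ("B", True), ("B", True), ("B", False)]
debates = [(["x", "x", "y"], "x"),
           (["x", "y", "y"], "y"),
           (["x", "x", "x"], "x"),
           (["a", "b", "b"], "b")]

parity_gap = demographic_parity_difference(outcomes)
suppression = minority_suppression_rate(debates)
print(parity_gap, suppression)
```

On these invented logs the outcome-level gap is zero while the procedural proxy is at its maximum, which is the kind of disagreement that makes the two families of protocols hard to compare.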
Documented Harms and SDLC Mapping
The review maps observed and anticipated fairness failures across SDLC stages using a functional taxonomy. Key harm clusters include:
- Representational Harms: MAS settings induce and amplify stereotyping, marginalization, and underrepresentation, both through persona-driven interactions (e.g., agent role assignment, conversational debate) and division-of-labor patterns.
- Quality-of-Service (QoS) Harms: Differential accuracy, hallucinations, and unsafe system recommendations disproportionately affect protected groups, reflecting both pre-existing LLM biases and their MAS-amplified variants.
- Security and Privacy Harms: Inter-agent prompt injection, data leakage, and cascading operational failures are uniquely exacerbated by coordinated agentic architectures.
- Governance and Trust: Opaque decision rationales, ambiguous responsibility chains, and weak oversight mechanisms challenge both regulatory compliance and practical diagnosis.
Figure 2: Coding pipeline that maps extracted evidence on implicit bias amplification to representational and governance harms and to the SDLC testing stage.
These harms are most frequently explored within requirements, testing, architectural design, and maintenance, with limited empirical grounding for mid-lifecycle developer activities (e.g., code review, design iteration, debugging). Notably, iterative agent consensus mechanisms predictably suppress minority perspectives, and role-based task decomposition creates new, indirect channels for bias amplification.
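To make the consensus point concrete, the following toy trace shows how a minority share could be tracked round by round. It is a sketch under a strong assumption: one dissenting agent conforms to the majority in each round. That rule, the function names, and the example positions are hypothetical, so the trace illustrates how such a measurement might be logged rather than providing evidence about any real system.

```python
from collections import Counter


def majority_position(positions):
    """Most common position; ties resolve to the first position encountered."""
    return Counter(positions).most_common(1)[0][0]


def minority_share(positions):
    """Fraction of agents not holding the current majority position."""
    majority = majority_position(positions)
    return sum(p != majority for p in positions) / len(positions)


def run_consensus(positions, max_rounds):
    """Assumed conformity rule: one dissenter adopts the majority view per round.

    Returns the list of position snapshots, starting with the initial state.
    """
    current = list(positions)
    history = [list(current)]
    for _ in range(max_rounds):
        majority = majority_position(current)
        dissenters = [i for i, p in enumerate(current) if p != majority]
        if not dissenters:
            break  # full consensus reached
        current[dissenters[0]] = majority
        history.append(list(current))
    return history


# Five hypothetical reviewer agents voting on a change request.
initial = ["approve", "approve", "approve", "reject", "reject"]
history = run_consensus(initial, max_rounds=5)
shares = [minority_share(snapshot) for snapshot in history]
print(shares)
```

Logging the per-round share, rather than only the final decision, is what allows a procedural failure to be separated from the outcome it produces.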
Methodological Gaps
Three core research gaps limit the translational value of the studied literature:
- Fragmented, Non-MAS-Specific Evaluations: Most available evaluations lack standardization. Confounding factors (interaction protocol, agent count, task structure, context window, etc.) are rarely isolated, making claims non-comparable and obscuring true agentic effects versus prompt or architecture artifacts.
- Limited Generalization and Attribute Breadth: Studies mainly demonstrate effects in domain-agnostic or simulated environments with narrow protected attribute coverage, rarely exploring intersectionality or MAS-specific software phenomena (e.g., code artifact-level outcomes).
- Scarce, Weakly Evaluated Mitigation Mechanisms: Most intervention proposals are not validated under realistic, SDLC-embedded, agentic settings. There is minimal empirical analysis regarding cost, scalability, protocol-agnosticism, or operational constraints.
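One way to isolate the confounders named in the first gap is a full-factorial grid of evaluation conditions, from which pairs that differ in exactly one factor can be compared. The sketch below is an assumption about how such a design could be enumerated; the factor levels (protocol names, agent counts, context sizes) are hypothetical placeholders and do not come from the reviewed studies.

```python
from itertools import product

# Hypothetical factor levels; real studies would substitute their own.
FACTORS = {
    "interaction_protocol": ["debate", "round_robin"],
    "agent_count": [2, 4],
    "task_structure": ["single_step", "decomposed"],
    "context_window": [4096, 16384],
}


def build_conditions(factors):
    """Enumerate every combination of factor levels as a dict."""
    names = list(factors)
    levels = (factors[name] for name in names)
    return [dict(zip(names, values)) for values in product(*levels)]


def one_factor_contrasts(conditions, factor):
    """Pairs of conditions that differ only in `factor`.

    Comparing fairness metrics within each pair attributes any change to
    that factor alone, holding the other factors fixed.
    """
    pairs = []
    for i, a in enumerate(conditions):
        for b in conditions[i + 1:]:
            differing = [key for key in a if a[key] != b[key]]
            if differing == [factor]:
                pairs.append((a, b))
    return pairs


conditions = build_conditions(FACTORS)
agent_count_pairs = one_factor_contrasts(conditions, "agent_count")
print(len(conditions), len(agent_count_pairs))
```

A full grid grows multiplicatively with each added factor, so a real protocol would likely need a fractional design or a budget cap; the sketch leaves that choice open.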
Implications and Future Directions
The fragmented state of fairness evaluation and governance standards impedes the deployment of MAS that are trustworthy, transparent, and aligned with regulation. Progress requires:
- MAS-aware benchmarks and stress tests capturing procedural, not just outcome-level, risk
- Consistent, agent-rich evaluation protocols
- Broader exploration of intersectional/pervasive harm types, including those manifesting during collaborative SDLC tasks
- Empirically validated mitigation and governance pipelines, stress-tested under adversarial and real-world operational loads
Theory on agent responsibility allocation, collective responsibility, and trust calibration also remains underdeveloped for the SDLC context.
Conclusion
The current evidence base for fairness in MAS-enabled software engineering is fragmented and cannot yet support deployable, audit-ready, fairness-assured MAS. Significant work remains to unify benchmark and protocol design, extend beyond simulated harms towards measurable impacts in SDLC pipelines, and close the gap between normative governance claims and realized agentic behavior. Closing these gaps is essential to move MAS from laboratory prototypes to critical, socially aligned infrastructure for future AI-native software engineering systems (2604.13103).