- The paper introduces a framework for robust human-AI collaboration by addressing challenges like incomplete human predictions and dynamic operational environments.
- The paper critiques conventional Learning to Defer methods for their impractical data requirements and limited adaptability in real-world applications.
- The paper emphasizes integrating fairness and capacity management to optimize performance and stability in evolving decision-making contexts.
Human-AI Collaboration in Decision-Making
Introduction
The paper "Human-AI Collaboration in Decision-Making: Beyond Learning to Defer" (2206.13202) examines how AI systems can be integrated into decision-making processes shared with human participants, emphasizing collaborative performance while tackling fairness and real-world applicability. Traditionally, frameworks like Learning to Defer (L2D) have been used to decide whether a human or an AI system should make a particular decision, based on performance-optimization criteria. However, these approaches impose impractical demands, such as requiring human predictions for every possible scenario. Moreover, existing frameworks overlook the operational realities of dynamic environments, such as capacity constraints and concept drift, motivating a more robust and flexible approach to Human-AI Collaboration (HAIC).
Learning to Defer: An Overview
The L2D framework addresses the assignment of decision-making responsibilities between humans and AI. It treats the assignment problem as a classification task in which a model is trained to select the best decision-maker for each instance. This demands predictions from human decision-makers for every instance in the training data, which poses significant feasibility issues in real-world scenarios where such comprehensive labeling is impractical.
L2D operates by minimizing a system-wide loss through a learning scheme that incorporates a penalty for deferral decisions. This penalty, coupled with considerations for model and human biases, enables the system to optimize for both accuracy and fairness. Despite its innovative structure, however, L2D is limited by its dependency on complete datasets and its inability to adapt to changing environmental dynamics or capacity-management challenges.
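The objective can be illustrated with a minimal sketch, assuming a binary task, 0/1 losses, and a fixed deferral penalty; the function names, the confidence-based deferral rule, and the penalty value are illustrative simplifications, not the paper's exact formulation.

```python
import numpy as np

def system_loss(model_probs, human_correct, defer_mask, y, penalty=0.05):
    """System-wide 0/1 loss: model error on kept instances, human error
    plus a fixed deferral penalty on deferred instances."""
    model_err = (model_probs.argmax(axis=1) != y).astype(float)
    human_err = 1.0 - human_correct.astype(float)
    return np.where(defer_mask, human_err + penalty, model_err).mean()

def defer_rule(model_probs, est_human_acc, penalty=0.05):
    """Defer when the model's expected error exceeds the (estimated)
    human error by more than the deferral penalty."""
    model_err = 1.0 - model_probs.max(axis=1)
    human_err = 1.0 - est_human_acc
    return model_err > human_err + penalty
```

In a full L2D system the deferral decision is itself learned jointly with the classifier; the rule above only conveys the trade-off the learned deferrer optimizes.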
Challenges and Limitations
Data Requirements
L2D necessitates human predictions for every training instance, which can be impossible to achieve in real-world applications like fraud detection, where human reviewers assess only the most challenging cases. Moreover, imputation techniques proposed to handle missing predictions can fail when the predictions are missing not at random, a common situation in practice: deferral decisions are typically based on confidence metrics, so humans see a systematically biased sample of cases.
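This failure mode can be seen in a small simulation (the numbers and the reviewing rule are illustrative assumptions): if humans review only low-confidence cases, and human accuracy is assumed to rise with case easiness, then accuracy measured on reviewed cases understates overall human accuracy, so any imputation built from it inherits that bias.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Model confidence per case; reviewers see only the hardest (lowest-confidence) cases.
conf = rng.uniform(0.5, 1.0, n)
reviewed = conf < 0.6

# Assumption: human accuracy rises with case easiness, from 0.6 up to 0.9.
p_correct = 0.6 + 0.3 * (conf - 0.5) / 0.5
human_correct = rng.random(n) < p_correct

true_acc = human_correct.mean()                # over all cases (~0.75)
observed_acc = human_correct[reviewed].mean()  # on reviewed cases only (~0.63)
```

Because the missingness depends on confidence, the observed sample is not representative, and the gap between `observed_acc` and `true_acc` is exactly the bias an imputation scheme would propagate.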
Specialization vs. Robustness
Joint training within L2D causes the AI to specialize on non-deferred instances, hindering its advisory capacity in HAIC systems where AI insights aid human decision-makers. This specialization reduces system robustness, particularly when unforeseen changes in human participation occur, and thus demands approaches that keep the AI informative and adaptable across all instances, including those it would normally defer.
Multi-Expert Collaboration
Managing deferrals across teams of experts introduces complexity within L2D as the framework assumes predictions from all decision-makers, which is rarely feasible. Future research should prioritize methods capable of leveraging partial information about expert behaviors without exhaustive predictions, potentially via reinforcement learning techniques.
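One way to leverage partial information, sketched here under strong simplifying assumptions (no per-instance context, stationary experts, accuracy as the only criterion), is a bandit-style assigner that learns each expert's accuracy purely from the outcomes of past assignments, never requiring predictions from unassigned experts. The class and the two-expert demo are hypothetical illustrations, not a method from the paper.

```python
import random

class ExpertBandit:
    """Epsilon-greedy assignment among experts, learning from observed
    outcomes of past assignments only (no upfront predictions needed)."""

    def __init__(self, n_experts, epsilon=0.1, seed=0):
        self.n_experts = n_experts
        self.epsilon = epsilon
        self.counts = [0] * n_experts   # assignments per expert
        self.means = [0.0] * n_experts  # running accuracy estimates
        self.rng = random.Random(seed)

    def choose(self):
        # Explore with probability epsilon, or while an expert is untried.
        if 0 in self.counts or self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_experts)
        # Otherwise exploit the expert with the best estimated accuracy.
        return max(range(self.n_experts), key=lambda i: self.means[i])

    def update(self, expert, correct):
        # Incremental mean update from the single observed outcome.
        self.counts[expert] += 1
        self.means[expert] += (float(correct) - self.means[expert]) / self.counts[expert]

# Demo: two hypothetical experts with true accuracies 0.9 and 0.5.
sim = random.Random(42)
true_acc = [0.9, 0.5]
bandit = ExpertBandit(n_experts=2, epsilon=0.1, seed=0)
for _ in range(2000):
    e = bandit.choose()
    bandit.update(e, sim.random() < true_acc[e])
```

A contextual variant, conditioning the choice on instance features, would be closer to what the deferral setting actually requires.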
Capacity Management
HAIC systems need a mechanism for managing human capacity constraints, which L2D does not inherently provide. Optimal deferral should not only focus on accuracy but also on effectively utilizing human resources within their operational limits, an area ripe for research development.
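One simple way to fold a capacity constraint into deferral, offered as a sketch rather than the paper's method, is to rank instances by the estimated gain from human review and defer only the top candidates within budget:

```python
import numpy as np

def capacity_deferral(model_conf, est_human_acc, capacity):
    """Defer at most `capacity` instances, chosen where the estimated
    human accuracy most exceeds the model's confidence."""
    gain = np.asarray(est_human_acc) - np.asarray(model_conf)
    order = np.argsort(-gain)               # best candidates for review first
    defer = np.zeros(len(gain), dtype=bool)
    chosen = order[:capacity]
    defer[chosen[gain[chosen] > 0]] = True  # never defer when the model looks better
    return defer
```

For example, with model confidences `[0.9, 0.55, 0.6, 0.95]`, a uniform estimated human accuracy of 0.7, and a capacity of 2, only the two low-confidence cases are deferred; if the model is confident everywhere, nothing is deferred even with spare capacity.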
Selective Labels
Real-world applications often operate under selective labels: the decision taken determines which outcomes are observable (for example, a declined loan application never reveals whether the loan would have been repaid). L2D does not adequately address this. While alternatives using reward-based frameworks approach the problem differently, they still require exhaustive human decision data, which remains a limitation when labels are selective.
Fairness Considerations
Maintaining fairness is crucial given that both humans and AI systems can exhibit biases. L2D integrates fairness into its design, but other contributions in this domain often neglect it, an omission that matters as AI systems increasingly influence decisions with social and economic consequences.
Dynamic Environments
The adaptability of HAIC systems to dynamic environments is largely unaddressed. Concept drift and adversarial influences require systems that either endogenously adjust or can be updated efficiently. Future research should aim to develop methods that allow for continuous learning in such settings.
Conclusion
The paper effectively highlights the limitations of existing HAIC frameworks and calls for substantial advancements that accommodate real-world complexities. A comprehensive system should not only address fairness and performance but also incorporate mechanisms for dynamic adaptability and resource management, ensuring HAIC systems are both practical and equitable in diverse operational environments.