Sufficiency of the “commensurate risks and benefits” standard for responsible LLM tool design

Determine whether the criterion that non-clinical Large Language Model (LLM) mental well-being tools deliver commensurate mental health risks and benefits is a sufficiently high standard to qualify such tools as responsible designs. If it is not, specify what higher standard should be adopted to distinguish responsible designs from those with merely commensurate tradeoffs.

Background

The paper proposes that responsible design requires that risks be commensurate with guaranteed benefits, but participants disagreed on whether this standard is adequate. Some experts argued that supplement-like tools (minimal guaranteed benefits, minimal risks) and primary-care-like tools (significant guaranteed benefits, substantial risks) could both be considered responsible under this principle; others contended that the limited utility or lightly regulated context of such tools makes them irresponsible, suggesting the need for a stricter benchmark.

This open question calls for establishing whether "commensurate risks and benefits" sufficiently captures responsible design for LLM mental well-being tools and, if not, for articulating a more stringent standard acceptable to the HCI design and research communities.

References

In parallel with efforts to operationalize and extend the framing of responsible design offered in this research, there should also be work that critically examines it. We highlight two open questions already emerging from our participants’ interviews and invite future research to further critique and improve upon our findings. Is delivering “commensurate mental health risks and benefits” a sufficiently high standard for responsible LLM tool design? Experts we interviewed agreed that, in principle, a responsible LLM tool is one whose benefits and harms are at least commensurate. Based on this principle, an LLM tool analogous to a nutritional supplement (minimal guaranteed health benefits, minimal safety risks) could be considered as responsible as a tool that provides primary care (significant guaranteed health benefits, substantial safety risks due to treating individuals with existing conditions). Do we, HCI design and research communities, endorse both types of LLM tools as equally “responsible” designs? Our interviewees expressed strong but varied opinions. This divergence leaves a critical and complex question for future research to address.

Framing Responsible Design of AI Mental Well-Being Support: AI as Primary Care, Nutritional Supplement, or Yoga Instructor?  (2602.02740 - Cooper et al., 2 Feb 2026) in Section 6.2, Responsible AI Research Opportunities