Sufficiency of the “commensurate risks and benefits” standard for responsible LLM tool design
Ascertain whether the criterion that non-clinical Large Language Model mental well-being tools deliver commensurate mental health risks and benefits is a sufficiently high standard to qualify such tools as responsible designs, and, if necessary, specify what higher standard should be adopted to distinguish responsible designs from those with merely commensurate tradeoffs.
References
In parallel with efforts to operationalize and extend the framing of responsible design offered in this research, there should also be work that critically examines it. We highlight two open questions already emerging from our participants’ interviews and invite future research to further critique and improve upon our findings.
Is delivering “commensurate mental health risks and benefits” a sufficiently high standard for responsible LLM tool design? The experts we interviewed agreed that, in principle, a responsible LLM tool is one whose benefits are at least commensurate with its harms. Under this principle, an LLM tool analogous to a nutritional supplement (minimal guaranteed health benefits, minimal safety risks) could be considered just as responsible as a tool that provides primary care (significant guaranteed health benefits, substantial safety risks due to treating individuals with existing conditions). Do we, as HCI design and research communities, endorse both types of LLM tools as equally “responsible” designs? Our interviewees expressed strong but varied opinions. This divergence leaves a critical and complex question for future research to address.