Reliable Alignment of AI Behavior with Complex Values
Develop reliable methodologies to align AI behavior with complex human values so that AI systems consistently pursue intended objectives rather than undesirable goals.
References
Moreover, no one currently knows how to reliably align AI behavior with complex values; several research breakthroughs are needed (see below).
— Managing extreme AI risks amid rapid progress
(arXiv:2310.17688, Bengio et al., 2023), in the subsection "Societal-scale risks"