Morally Programmed LLMs Reshape Human Morality

Published 11 Apr 2026 in cs.CY | (2604.10222v1)

Abstract: As LLMs increasingly participate in high-stakes decision-making, a central societal debate has revolved around which moral frameworks, deontological or utilitarian, should guide machine behavior. However, a largely overlooked question is whether the moral principles that humans encode in LLMs could, through repeated interactions, reshape human moral inclinations. We developed two LLMs programmed with either deontological principles (D-LLM) or utilitarian principles (U-LLM) and conducted two pre-registered experiments involving extensive human-LLM interactions, comprising 15,985 total exchanges across the two experiments. Results show that interacting with these morally programmed LLMs systematically shifted human moral inclinations to align with the principles embedded in these systems. These effects remained strong two weeks after the interaction, with only slight decay, suggesting deep internalization rather than superficial agreement. Further, LLM-induced shifts in human moral inclinations translated into meaningful changes in socio-political policy evaluations, shaping how individuals approach contentious social issues. Overall, these results demonstrate that morally programmed LLMs can shape, not merely reflect, human morality, revealing a critical design paradox: embedding moral principles in LLMs not only restricts their behavior but also poses the risk of shaping human morality, raising important ethical and policy questions about who determines which principles intelligent machines should adhere to.

Authors (4)

Summary

The paper demonstrates that interactions with deontological and utilitarian LLMs cause significant and persistent shifts in human moral inclinations, with effect sizes up to d=0.98.
The study employs robust longitudinal and behavioral experiments, including process dissociation methods, to assess changes in moral judgments and policy evaluations.
Findings raise ethical concerns about covert moral engineering in AI, questioning governance, user autonomy, and the broader impacts on democratic discourse.

Morally Programmed LLMs and the Reshaping of Human Moral Inclinations

Introduction

This paper, "Morally Programmed LLMs Reshape Human Morality" (2604.10222), reports empirical evidence that interacting with LLMs encoded with specific moral principles causes systematic and persistent shifts in human moral inclinations. It addresses an underexplored question in AI ethics and sociotechnical systems: the morality of LLMs is usually treated as a one-way reflection of human values, but can LLMs in turn influence and reshape the moral frameworks of their users? The authors build a deontological LLM (D-LLM) and a utilitarian LLM (U-LLM), then measure changes in participants' core moral inclinations and in downstream real-world decision-making using longitudinal and behavioral experimental designs.

Experimental Design and Methodology

Two pre-registered experiments form the foundation of the work. The D-LLM and U-LLM were derived from GPT-4o and validated to consistently generate responses aligned with their respective moral frameworks. The process dissociation (PD) method—widely regarded as a gold standard in moral psychology—was used to independently quantify deontological and utilitarian inclinations using batteries of congruent and incongruent moral dilemmas.
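The PD scoring step can be sketched in a few lines. The summary names the method but gives no formulas, so the sketch below assumes the standard formulation from the moral-judgment PD literature (Conway and Gawronski's model): in congruent dilemmas both inclinations reject the harmful action, while in incongruent dilemmas the utilitarian inclination accepts it and the deontological inclination rejects it. The function name, the `PDScores` type, and the example proportions are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class PDScores:
    utilitarian: float    # U parameter
    deontological: float  # D parameter


def process_dissociation(p_reject_congruent: float,
                         p_reject_incongruent: float) -> PDScores:
    """Compute PD parameters for one participant.

    Inputs are the proportions of "harm is unacceptable" responses on
    congruent and incongruent dilemmas. Assumed model:
        P(reject | congruent)   = U + (1 - U) * D
        P(reject | incongruent) = (1 - U) * D
    """
    for p in (p_reject_congruent, p_reject_incongruent):
        if not 0.0 <= p <= 1.0:
            raise ValueError("proportions must lie in [0, 1]")

    # Subtracting the two model equations isolates U.
    u = p_reject_congruent - p_reject_incongruent
    if u >= 1.0:
        raise ValueError("D is undefined when U equals 1")

    # Solving the incongruent equation for D.
    d = p_reject_incongruent / (1.0 - u)
    return PDScores(utilitarian=u, deontological=d)


# Hypothetical participant: rejects harm on 90% of congruent and
# 50% of incongruent dilemmas.
scores = process_dissociation(0.9, 0.5)
print(scores)
```

Because U and D are estimated separately, a manipulation can raise one without mechanically lowering the other, which is what allows the studies to test for selective shifts.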

Study 1: Longitudinal Influence

A three-wave longitudinal experiment with 124 participants measured moral inclination scores at baseline (T1), immediately after the interaction (T2), and two weeks after the interaction (T3). Participants were randomly assigned to the D-LLM or U-LLM condition and engaged in real-time, multi-turn LLM-facilitated deliberations on 20 classic moral dilemmas.

Study 2: Downstream Socio-Political Consequences

This experiment involved 274 U.S. participants divided into D-LLM, U-LLM, and no-interaction control groups. After the same LLM-based dilemma interactions as Study 1, participants evaluated 11 controversial socio-political policy proposals, rating both agreement and voting intentions. This allowed quantification of indirect effects of LLMs' moral programming on applied, domain-general policy judgments.

Key Findings

LLM-Induced Moral Shifts

Selective alignment: Interaction with D-LLMs increased deontological inclinations (effect sizes d = 0.90 at T2, d = 0.53 at T3), whereas U-LLMs selectively increased utilitarian inclinations (d = 0.98 at T2, d = 0.73 at T3). These effects were robust and statistically significant (all p < 0.001).
Persistence: Shifts persisted at least two weeks post-interaction; deontological shifts decayed slightly but remained above baseline, while utilitarian shifts were stable.
Control for compliance: The shifts remained sizable at the two-week follow-up, well after the interaction had ended, which the authors interpret as internalization rather than superficial compliance.
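For readers less familiar with the d values quoted above, the following is a minimal sketch of one common way to compute a within-participant effect size: the mean pre-to-post change divided by the standard deviation of the changes. The summary does not say which variant of Cohen's d the authors report, so this choice is an assumption, and the scores used here are made-up illustration values rather than data from the study.

```python
from statistics import mean, stdev


def cohens_d_paired(before, after):
    """Standardized mean change for paired scores (assumed variant).

    Computes mean(after - before) / sd(after - before), where sd is the
    sample standard deviation of the per-participant differences.
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    if len(before) < 2:
        raise ValueError("need at least two participants")

    diffs = [a - b for a, b in zip(after, before)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; d is undefined")
    return mean(diffs) / sd


# Made-up inclination scores for four hypothetical participants.
baseline = [1, 2, 3, 4]
post_interaction = [2, 4, 4, 6]
print(round(cohens_d_paired(baseline, post_interaction), 3))
```

Under this definition, a d near 0.9 means the average change is a little under one standard deviation of the individual changes.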

Policy Evaluation and Democratic Implications

Deontological LLMs: D-LLM-induced increases in deontological inclinations led to systematically more negative attitudes towards new policy proposals across domains, with significant indirect effects on both agreement and voting intentions for 10 out of 11 issues. This implies a structural conservatism (prioritizing omission and avoidance of moral transgression) that manifests as resistance to policy change.
Utilitarian LLMs: U-LLM-induced increases in utilitarian inclination did not produce reliable, consistent changes in policy evaluations. The authors attribute this to heterogeneity in participants' consequentialist calculus and outcome expectations across divergent policy domains.
No direct policy messaging: Notably, the LLMs did not advocate specific policies, but only discussed abstract moral dilemmas. The observed downstream effects are therefore attributable to shifts in underlying inclinations rather than direct argumentative influence on policy.
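The "indirect effect" reported above follows the usual mediation logic: condition changes the moral inclination (path a), and the inclination predicts the policy rating with condition held constant (path b), so the indirect effect is a × b. The sketch below illustrates that logic with a percentile bootstrap on synthetic data. The summary does not describe the authors' estimation procedure, so the bootstrap, the function names, and all numbers here are assumptions for illustration only.

```python
import random
from statistics import mean


def slope(x, y):
    """Ordinary least-squares slope of y on x (single predictor)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variance")
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx


def residuals(x, y):
    """Residuals of y after regressing it on x with an intercept."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def indirect_effect(condition, mediator, outcome):
    """a * b, where b is the mediator's slope controlling for condition."""
    a = slope(condition, mediator)
    # Partialling condition out of both variables yields the same b as
    # regressing outcome on condition and mediator together.
    b = slope(residuals(condition, mediator), residuals(condition, outcome))
    return a * b


def bootstrap_ci(condition, mediator, outcome, n_boot=500, seed=0):
    """Percentile 95% interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(outcome)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([condition[i] for i in idx],
                                         [mediator[i] for i in idx],
                                         [outcome[i] for i in idx]))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]


# Synthetic data: 1 = D-LLM, 0 = control; the true indirect effect is
# 0.5 * -0.8 = -0.4 by construction.
rng = random.Random(1)
cond = [i % 2 for i in range(200)]
deont = [0.5 * c + rng.gauss(0, 0.2) for c in cond]
rating = [-0.8 * m + rng.gauss(0, 0.2) for m in deont]
print(indirect_effect(cond, deont, rating), bootstrap_ci(cond, deont, rating))
```

An interval that excludes zero is the conventional criterion for calling an indirect effect significant, which is the sense in which the 10-of-11 result is read here.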

Design Paradox and Ethical Risks

The core claim—highlighted as a "design paradox"—is that embedding explicit moral principles into LLMs not only constrains machine behavior, but also actively engineers the moral sensibilities of users over repeated exposures. This dual impact calls into question traditional views of AI as merely reflective or neutral, underlining the sociotechnical feedback loop engendered by LLM-human interaction.

Theoretical and Practical Implications

AI Ethics and Governance

The results expose a dilemma for value alignment in LLM systems: explicit programming of moral principles, intended to ensure compliant and ethical machine action, introduces the risk of covert moral engineering of the user base. This raises profound questions about the legitimacy and governance of AI ethics programming:

Agency and authority: Who should decide the moral principles instantiated in widely deployed LLMs? How should transparency, user consent, and heterogeneity of user values be reconciled with safety and coherence in system design?
Democratic discourse: LLMs could indirectly modulate public opinion and policy preferences not by overt argument, but by gradually shifting the moral frameworks users apply, potentially at societal scale.

Human-Computer Interaction and Persuasion

The work suggests that the impersonal nature of LLMs, their consistency, and their perceived motivational neutrality enhance their capacity for deep, persistent moral persuasion. This supports a new paradigm in computer-mediated attitude change distinct from classical models of social influence, and requires a re-appraisal of how persistent, large-scale exposure to generative systems may shape norms, beliefs, and sociopolitical landscapes.

Limitations and Directions for Future Research

Individual differences: Random assignment balances traits such as personality, ideology, and cognitive style across conditions on average, but the study did not examine them in detail as moderators; further research should statistically model how they affect susceptibility to LLM-induced moral shift.
Closed-world nature of dilemmas: Moral dilemma tasks necessarily abstract from real-world complexity and uncertainty; real-life effects may diverge due to contextual variables and outcome ambiguity.
Temporal and scale sensitivity: While persistence over two weeks is notable, longer-term and population-scale effects remain to be tested empirically.

Conclusion

This paper provides compelling empirical evidence that morally programmed LLMs systematically and persistently reshape human moral inclinations, and that these shifts can have consequential impact on downstream socio-political decision-making. The findings reveal a critical ethical and practical paradox in the development and deployment of value-aligned LLMs: the instantiation of explicit moral programming in AI systems, conceived as a safeguard, may constitute an unprecedented conduit for broad-based, covert influence on human morality. Addressing this challenge will require interdisciplinary collaboration and careful re-examination of governance, transparency, and user autonomy in AI ethics.
