Between Rules and Reality: On the Context Sensitivity of LLM Moral Judgment

arXiv:2603.23114v1 Announce Type: new Abstract: A human's moral decision depends heavily on the context. Yet research on LLM morality has largely studied fixed scenarios. We address this gap by introducing Contextual MoralChoice, a dataset of moral dilemmas with systematic contextual variations known from moral psychology to shift human judgment: consequentialist, emotional, and relational. Evaluating 22 LLMs, we find that nearly all models are context-sensitive, shifting their judgments toward rule-violating behavior. Comparing with a human survey, we find that models and humans are most triggered by different contextual variations, and that a model aligned with human judgments in the base case is not necessarily aligned in its contextual sensitivity. This raises the question of controlling contextual sensitivity, which we address with an activation steering approach that can reliably increase or decrease a model's contextual sensitivity.

Adrian Sauter, Mona Schirmer

Executive Summary

This article examines the context sensitivity of large language models (LLMs) in moral judgment. The authors introduce Contextual MoralChoice, a dataset of moral dilemmas with systematic contextual variations (consequentialist, emotional, and relational) known from moral psychology to shift human judgment. Evaluating 22 LLMs, they find that nearly all models are context-sensitive, typically shifting their judgments toward rule-violating behavior. A comparison with a human survey shows that models and humans are most triggered by different contextual variations, and that alignment with human judgment in the base case does not guarantee alignment in contextual sensitivity. To control this sensitivity, the authors propose an activation steering approach that can reliably increase or decrease it.

Key Points

  • Contextual MoralChoice: a dataset of moral dilemmas with consequentialist, emotional, and relational contextual variations
  • Nearly all of the 22 evaluated LLMs are context-sensitive, shifting judgments toward rule-violating behavior
  • Models and humans diverge in which contextual variations most affect their judgments
  • An activation steering approach can reliably increase or decrease a model's contextual sensitivity
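The core measurement behind these findings can be illustrated with a minimal sketch: pose each dilemma in its base form and again with an added contextual cue, and count how often the judgment flips. The `judge` function below is a hypothetical stand-in for an LLM call, not the paper's actual evaluation harness.

```python
# Minimal sketch of a context-sensitivity metric: the fraction of dilemmas
# whose judgment flips once a contextual variation is appended.
# `judge` is a toy stand-in for an LLM query; it returns "comply"
# (follow the rule) or "violate" (break the rule).

def judge(prompt: str) -> str:
    # Toy heuristic in place of a real model call.
    return "violate" if "saves five lives" in prompt else "comply"

def context_sensitivity(dilemmas: list[dict]) -> float:
    """Fraction of dilemmas whose judgment changes under added context."""
    flips = 0
    for d in dilemmas:
        base = judge(d["base"])
        varied = judge(d["base"] + " " + d["context"])
        flips += base != varied
    return flips / len(dilemmas)

dilemmas = [
    {"base": "Should you lie to a friend?",
     "context": "Lying saves five lives."},    # consequentialist cue
    {"base": "Should you break a promise?",
     "context": "Your sibling begs you to."},  # relational cue
]
print(context_sensitivity(dilemmas))  # 0.5: one of two judgments flips
```

A real evaluation would replace `judge` with model queries and aggregate over the dataset's variation types separately, so that consequentialist, emotional, and relational cues can be compared.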

Merits

Novel Dataset

The introduction of the Contextual MoralChoice dataset provides a systematic approach to studying LLM moral judgment in varying contexts.

Demerits

Limited Generalizability

The study's findings may not generalize to all LLMs or real-world scenarios, potentially limiting the applicability of the results.

Expert Commentary

The study's findings underscore the importance of considering context in LLM moral judgment. The discrepancies between LLM and human judgments highlight the need for further research into the development of more nuanced and human-aligned LLMs. The activation steering approach proposed by the authors offers a promising solution for controlling contextual sensitivity, but its effectiveness and potential applications require further exploration.
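A common form of activation steering (sketched below under assumed details, not necessarily the authors' exact method) derives a steering direction as the difference of mean hidden activations collected from contrastive prompt sets, then adds a scaled copy of that direction to the model's hidden state at inference time. The activation arrays here are synthetic placeholders for cached model activations.

```python
import numpy as np

# Hedged sketch of difference-of-means activation steering:
# v points from "context-insensitive" toward "context-sensitive"
# activations; adding alpha * v to a hidden state nudges the model
# along that axis (alpha > 0 increases sensitivity, alpha < 0 decreases it).

rng = np.random.default_rng(0)
d_model = 16

# Hypothetical cached activations from contrastive prompt pairs.
sensitive_acts = rng.normal(1.0, 0.1, size=(32, d_model))
insensitive_acts = rng.normal(0.0, 0.1, size=(32, d_model))

# Steering vector: difference of means, normalized to unit length.
v = sensitive_acts.mean(axis=0) - insensitive_acts.mean(axis=0)
v /= np.linalg.norm(v)

def steer(hidden: np.ndarray, alpha: float) -> np.ndarray:
    """Shift a hidden state along the steering direction v."""
    return hidden + alpha * v

h = rng.normal(size=d_model)
steered = steer(h, 2.0)  # projection onto v increases by alpha
```

In practice the shift is applied inside the network (e.g. to the residual stream of a chosen layer) rather than to a standalone vector, and alpha is tuned so that fluency is preserved while the behavioral axis moves.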

Recommendations

  • Future studies should investigate the applicability of the Contextual MoralChoice dataset to other AI models and real-world scenarios.
  • Developers should prioritize the creation of more transparent and explainable LLMs that can provide insights into their moral decision-making processes.

Sources

Original: arXiv - cs.AI