Multi-Trait Subspace Steering to Reveal the Dark Side of Human-AI Interaction
arXiv:2603.18085v1
Abstract: Recent incidents have highlighted alarming cases where human-AI interactions led to negative psychological outcomes, including mental health crises and even …
Xin Wei Chia, Swee Liang Wong, Jonathan Pan