Intent Laundering: AI Safety Datasets Are Not What They Seem
arXiv:2602.16729v1 (cross-listed). Abstract: We systematically evaluate the quality of widely used AI safety datasets from two perspectives: in isolation and in practice. In …
Shahriar Golchin, Marc Wetter