Lying to Win: Assessing LLM Deception through Human-AI Games and Parallel-World Probing
arXiv:2603.07202v1 Announce Type: new Abstract: As Large Language Models (LLMs) transition into autonomous agentic roles, the risk of deception, defined behaviorally as the systematic provision of …
Arash Marioriyad, Ali Nouri, Mohammad Hossein Rohban, Mahdieh Soleymani Baghshah