Enhancing Instruction Following of LLMs via Activation Steering with Dynamic Rejection
arXiv:2603.06745v1 Announce Type: new
Abstract: Large Language Models (LLMs), despite advances in instruction tuning, often fail to follow complex user instructions. Activation steering techniques aim …
Minjae Kang, Jaehyung Kim