Finding Highly Interpretable Prompt-Specific Circuits in Language Models
arXiv:2602.13483v1
Abstract: Understanding the internal circuits that language models use to solve tasks remains a central challenge in mechanistic interpretability. Most prior …
Gabriel Franco, Lucas M. Tassis, Azalea Rohr, Mark Crovella