Formal Mechanistic Interpretability: Automated Circuit Discovery with Provable Guarantees
arXiv:2602.16823v1
Abstract: *Automated circuit discovery* is a central tool in mechanistic interpretability for identifying the internal components of neural networks responsible for …
Itamar Hadad, Guy Katz, Shahaf Bassan