All Articles

Articles

Academic · 1 min

Interpretability without actionability: mechanistic methods cannot correct language model errors despite near-perfect internal representations

arXiv:2603.18353v1 Announce Type: new Abstract: Language models encode task-relevant knowledge in internal representations that far exceeds their output performance, but whether mechanistic interpretability methods can …

Sanjay Basu, Sadiq Y. Patel, Parth Sheth, Bhairavi Muralidharan, Namrata Elamaran, Aakriti Kinra, John Morgan, Rajaie Batniji
8 views
Academic · 1 min

dTRPO: Trajectory Reduction in Policy Optimization of Diffusion Large Language Models

arXiv:2603.18806v1 Announce Type: new Abstract: Diffusion Large Language Models (dLLMs) introduce a new paradigm for language generation, which in turn presents new challenges for aligning …

Wenxuan Zhang, Lemeng Wu, Changsheng Zhao, Ernie Chang, Mingchen Zhuge, Zechun Liu, Andy Su, Hanxian Huang, Jun Chen, Chong Zhou, Raghuraman Krishnamoorthi, Vikas Chandra, Mohamed Elhoseiny, Wei Wen
120 views
Academic · 1 min

Cross-Domain Demo-to-Code via Neurosymbolic Counterfactual Reasoning

arXiv:2603.18495v1 Announce Type: new Abstract: Recent advances in Vision-Language Models (VLMs) have enabled video-instructed robotic programming, allowing agents to interpret video demonstrations and generate executable …

Jooyoung Kim, Wonje Choi, Younguk Song, Honguk Woo
23 views