Don't Blink: Evidence Collapse during Multimodal Reasoning
arXiv:2604.04207v1. Abstract: Reasoning VLMs can become more accurate while progressively losing visual grounding as they think. This creates task-conditional danger zones where …
Suresh Raghu, Satwik Pandey