
Category

Academic

Academic · 1 min

Test-Time Adaptation for Tactile-Vision-Language Models

arXiv:2602.15873v1 Announce Type: cross Abstract: Tactile-vision-language (TVL) models are increasingly deployed in real-world robotic and multimodal perception tasks, where test-time distribution shifts are unavoidable. Existing …

Chuyang Ye, Haoxian Jing, Qinting Jiang, Yixi Lin, Qiang Li, Xing Tang, Jingyan Jiang

Genetic Generalized Additive Models

arXiv:2602.15877v1 Announce Type: cross Abstract: Generalized Additive Models (GAMs) balance predictive accuracy and interpretability, but manually configuring their structure is challenging. We propose using the …

Kaaustaaub Shankar, Kelly Cohen

FUTURE-VLA: Forecasting Unified Trajectories Under Real-time Execution

arXiv:2602.15882v1 Announce Type: cross Abstract: General vision-language models increasingly support unified spatiotemporal reasoning over long video streams, yet deploying such capabilities on robots remains constrained …

Jingjing Fan, Yushan Liu, Shoujie Li, Botao Ren, Siyuan Li, Xiao-Ping Zhang, Wenbo Ding, Zhidong Deng

Egocentric Bias in Vision-Language Models

arXiv:2602.15892v1 Announce Type: cross Abstract: Visual perspective taking--inferring how the world appears from another's viewpoint--is foundational to social cognition. We introduce FlipSet, a diagnostic benchmark …

Maijunxian Wang, Yijiang Li, Bingyang Wang, Tianwei Zhao, Ran Ji, Qingying Gao, Emmy Liu, Hokin Deng, Dezhi Luo

Doc-to-LoRA: Learning to Instantly Internalize Contexts

arXiv:2602.15902v1 Announce Type: cross Abstract: Long input sequences are central to in-context learning, document understanding, and multi-step reasoning of Large Language Models (LLMs). However, the …

Rujikorn Charakorn, Edoardo Cetin, Shinnosuke Uesaka, Robert Tjarko Lange