Robust Pre-Training of Medical Vision-and-Language Models with Domain-Invariant Multi-Modal Masked Reconstruction
arXiv:2602.17689v1 Announce Type: cross Abstract: Medical vision-language models show strong potential for joint reasoning over medical images and clinical text, but their performance often degrades …
Melika Filvantorkaman, Mohsen Piri
3 views