Aligning the True Semantics: Constrained Decoupling and Distribution Sampling for Cross-Modal Alignment
arXiv:2603.05566v1
Abstract: Cross-modal alignment is a crucial task in multimodal learning aimed at achieving semantic consistency between vision and language. This requires …
Xiang Ma, Lexin Fang, Litian Xu, Caiming Zhang