Probing the Latent World: Emergent Discrete Symbols and Physical Structure in Latent Representations
arXiv:2603.20327v1
Abstract: Video world models trained with Joint Embedding Predictive Architectures (JEPA) acquire rich spatiotemporal representations by predicting masked regions in latent …
Liu hung ming