VL-JEPA: Joint Embedding Predictive Architecture for Vision-Language
Posted 6 hours ago by andsoitis
1 point
https://arxiv.org/abs/2512.10942
0 comments