I-JEPA
Image-Joint Embedding Predictive Architecture
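Image-Joint Embedding Predictive Architecture (I-JEPA) - A self-supervised framework that trains a pair of Vision Transformers (ViTs), a context encoder and a target encoder, to learn a joint embedding. Given the visible part of an input tile, the model learns to predict the embeddings of the masked-out parts, so prediction happens in representation space rather than pixel space.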
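The data flow of one training step can be illustrated with a minimal NumPy sketch. It is not the reference implementation. The linear maps `context_w`, `target_w`, and `predictor_w` are hypothetical stand-ins for the ViT encoders and the predictor, and the tile size, patch size, mask indices, and momentum are arbitrary assumptions. The sketch only shows how the loss is formed and how the target weights track the context weights through an exponential moving average, as described in the paper. No gradient step is performed.

```python
import numpy as np

rng = np.random.default_rng(0)

def patchify(tile, patch):
    """Split an (H, W, C) tile into a (num_patches, patch*patch*C) array."""
    h, w, c = tile.shape
    gh, gw = h // patch, w // patch
    x = tile[:gh * patch, :gw * patch].reshape(gh, patch, gw, patch, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(gh * gw, patch * patch * c)

def ema_update(target_w, context_w, momentum=0.99):
    """Move the target weights toward the context weights (moving average)."""
    return momentum * target_w + (1.0 - momentum) * context_w

# Assumed toy input: one 32x32 RGB tile cut into 8x8 patches (16 patches).
tile = rng.normal(size=(32, 32, 3))
patches = patchify(tile, patch=8)
num_patches, patch_dim = patches.shape

# Hypothetical linear stand-ins for the two ViT encoders and the predictor.
dim = 16
context_w = rng.normal(scale=0.1, size=(patch_dim, dim))
target_w = context_w.copy()
predictor_w = rng.normal(scale=0.1, size=(dim, dim))
pos_emb = rng.normal(scale=0.1, size=(num_patches, dim))

# Masking: the target patches are hidden from the context encoder.
target_idx = np.array([5, 6, 9, 10])
context_idx = np.setdiff1d(np.arange(num_patches), target_idx)

# The context encoder sees only visible patches; the target encoder embeds
# the whole tile and the masked positions are selected afterwards.
context_emb = patches[context_idx] @ context_w
target_emb = (patches @ target_w)[target_idx]

# The predictor combines a summary of the context with the position of each
# masked patch and outputs a predicted embedding for that patch.
pooled = context_emb.mean(axis=0)
pred = (pooled + pos_emb[target_idx]) @ predictor_w

# The loss is measured between embeddings, not between pixels.
loss = float(np.mean((pred - target_emb) ** 2))

# After the (omitted) gradient step on context_w and predictor_w, the target
# weights would be refreshed from the context weights.
target_w = ema_update(target_w, context_w)
```

In a real setup the encoders are full ViTs, several target blocks are sampled per image, and an optimizer updates the context encoder and predictor from the loss; only the target encoder is updated by the moving average.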
References:
Mahmoud Assran, Quentin Duval, Ishan Misra, Piotr Bojanowski, Pascal Vincent, Michael Rabbat, Yann LeCun, Nicolas Ballas: “Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture”, 2023; arXiv:2301.08243.