LLMs learn by predicting tokens. World models (JEPA, data2vec) learn by predicting their own abstractions. Which needs more data? For data with hidden hierarchy, we prove the gap is exponential. <a target="_blank" href="https://arxiv.org/pdf/2605.27734" color="blue">arxiv.org/pdf/2605.27734</a>


Why? A network discovers a latent variable from its correlation with a prediction target. Correlations between latents at the same level of abstraction are far stronger than between a latent and raw tokens. Token prediction dilutes the signal that latent prediction amplifies.

We make this precise on simple context-free grammars. Token-level SSL needs a sample size exponential in the depth of the latent tree. Learning from your own latents needs a sample size that is nearly independent of depth. We show that data2vec implicitly does exactly this hierarchical latent prediction.
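
A toy sketch of the correlation argument, not the grammar or the proof from the paper. Assumptions, all hypothetical: latents are ±1 variables on a tree, each child copies its parent with correlation `rho`, `a` and `b` are sibling latents one level below the root, and the token is a descendant of `b` that sits `depth` levels below the root. In this setup the latent-latent correlation is `rho**2` at any depth, while the latent-token correlation is `rho**(depth + 1)`. Using the standard heuristic that detecting a correlation `c` takes on the order of `1 / c**2` samples, the token-level requirement grows exponentially with depth.

```python
import random


def noisy_copy(parent, rho, rng):
    # Child equals parent with probability (1 + rho) / 2, so E[parent * child] = rho.
    return parent if rng.random() < (1 + rho) / 2 else -parent


def sample(depth, rho, rng):
    # Toy tree (an assumption, not the paper's model): a and b are siblings
    # at level 1; the token is reached from b through depth - 1 more noisy copies.
    root = rng.choice((-1, 1))
    a = noisy_copy(root, rho, rng)
    b = noisy_copy(root, rho, rng)
    token = b
    for _ in range(depth - 1):
        token = noisy_copy(token, rho, rng)
    return a, b, token


def estimate_correlations(depth, rho, n_samples, seed=0):
    # Monte Carlo estimates of E[a * b] (latent-latent) and E[a * token] (latent-token).
    rng = random.Random(seed)
    latent_sum = 0
    token_sum = 0
    for _ in range(n_samples):
        a, b, token = sample(depth, rho, rng)
        latent_sum += a * b
        token_sum += a * token
    return latent_sum / n_samples, token_sum / n_samples


def predicted_correlations(depth, rho):
    # Closed forms for the toy tree: two edges between a and b,
    # depth + 1 edges between a and the token.
    return rho ** 2, rho ** (depth + 1)


def samples_needed(correlation):
    # Order-of-magnitude heuristic only: about 1 / c^2 samples to detect correlation c.
    return 1.0 / correlation ** 2


if __name__ == "__main__":
    rho = 0.8
    for depth in (2, 4, 6, 8):
        latent_c, token_c = predicted_correlations(depth, rho)
        print(depth, round(samples_needed(latent_c), 1), round(samples_needed(token_c), 1))
```

Running the script prints, for each depth, the heuristic sample count for the latent target next to the one for the token target; only the second changes with depth.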

A consequence: if a single latent-prediction module (data2vec) is already implicitly multi-scale, then explicitly stacking such modules (e.g. H-JEPA) is to some extent redundant. Work led by @DanKorchinski & @alesfav.

See the excellent threads by @DanKorchinski <a target="_blank" href="https://x.com/DanKorchinski/status/2060344980749549607" color="blue">x.com/DanKorchinski/…</a> and @alesfav <a target="_blank" href="https://x.com/alesfav/status/2060277596089159734" color="blue">x.com/alesfav/status…</a>.