AI needs vastly more data than we do. One idea might close the gap: don't predict raw signals (tokens), predict your own abstract latent representation (JEPA, data2vec). With @DanKorchinski @MatthieuWyart, on a toy model, we prove how much that helps: the gap is exponential. 🧵


We study recovering the hidden latent tree of a hierarchical grammar. Token-level SSL pays a depth tax: the data it needs grows exponentially with the tree's depth. We prove that iteratively supervising on latents escapes it, recovering the tree with constant-in-depth data!


Surprisingly, we found data2vec already does this with a single module. Through its teacher, it implicitly supervises on latents at every level, reaching the same constant-in-depth scaling. 🤯 The hierarchy unfolds during training rather than being stacked into the architecture.
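
For readers unfamiliar with the teacher mechanism, here is a rough toy sketch, not the paper's setup: in data2vec-style training the teacher is an exponential moving average (EMA) of the student, and the student regresses onto the teacher's latents instead of raw tokens. The function names, the flat weight lists, and the value of `tau` are all illustrative assumptions.

```python
from typing import List


def ema_update(teacher: List[float], student: List[float], tau: float = 0.99) -> List[float]:
    """Return new teacher weights: tau * teacher + (1 - tau) * student."""
    if len(teacher) != len(student):
        raise ValueError("teacher and student must have the same number of weights")
    return [tau * t + (1.0 - tau) * s for t, s in zip(teacher, student)]


def latent_loss(student_pred: List[float], teacher_target: List[float]) -> float:
    """Mean squared error between the student's prediction and the teacher's latent."""
    return sum((p - q) ** 2 for p, q in zip(student_pred, teacher_target)) / len(student_pred)


# Toy run: the teacher drifts toward a (here fixed) student over three updates.
# tau=0.5 is chosen only to make the drift visible; real runs use values near 1.
teacher = [0.0, 0.0]
student = [1.0, -1.0]
for _ in range(3):
    teacher = ema_update(teacher, student, tau=0.5)
print(teacher)
```

In an actual training loop the student would also be updated by gradient descent on `latent_loss` between EMA steps, so the targets keep evolving as the student improves.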

This result also suggests that explicit stacking, like H-JEPA, may be redundant. Many open questions! Our paper: https://arxiv.org/abs/2605.27734

@TMoldwin @DanKorchinski @MatthieuWyart Latent prediction avoids that bottleneck by learning one level, then using that learned level as the target/context for the next. We may write a more accessible blog post version at some point!