| Thread Navigator

Canvas & Ratio

Choose your destination platform format

Layout Template

Choose a content structure for your slides

Preset Themes

Typography & Sizing

Font Family

Title Font Size36px

Body Font Size18px

Header & Footer Size12px

Brand Kit Customization

AGENCY

Configure brand assets for headers & footers

MULTI-PROFILES (AGENCY)

Active Brand Profile

Show Brand Watermark

Brand Watermark Text

Social Handle

Brand Logo URL (PNG) AGENCY

SAVE PRESETS (AGENCY)

Save current as Preset

Outro Slide CTA

Customize your closing call-to-action slide

CTA Title

CTA Message & Emojis

Custom CTA Buttons

Background Pattern

Source Content

Build Your Carousel

Drag and drop any post card below onto a slide, or use the quick buttons to insert content/images instantly!

Drag Post #1

Alex Prompter

@alex_prompter

what are large language models actually doing? i read the 2025 textbook "Foundations of Large Language Models" by tong xiao and jingbo zhu and for the first time, i truly understood how they work. here’s everything you need to know about llms in 3 minutes↓

Apply Image

Drag Post #2

Alex Prompter

@alex_prompter

to understand LLMs, you first need to know the idea of pre-training. instead of teaching a model to solve one task with labeled data (like classifying tweets), we train it on massive unlabeled text and let it "figure out" language patterns by itself. this is called self-supervised learning.

Drag Post #3

Alex Prompter

@alex_prompter

there are 3 pre-training strategies: → unsupervised: models learn patterns without any labels → supervised: models learn from labeled tasks → self-supervised: models generate their own labels from unlabeled data (e.g., predicting masked words) LLMs use the third one. it's the most powerful.

Drag Post #4

Alex Prompter

@alex_prompter

self-supervised pretraining works like this: take a sentence like “the early bird catches the worm.” mask some words: “the [MASK] bird catches the [MASK]” ask the model to fill in the blanks. no labels needed the text itself is the supervision.

Apply Image

Drag Post #5

Alex Prompter

@alex_prompter

this idea leads to three main architecture types: → encoder-only (BERT): reads and understands text → decoder-only (GPT): generates the next word → encoder-decoder (T5, BART): reads input, generates output each has strengths. for example: • GPT is great at generation • BERT is great at classification • T5 can do both via a "text-to-text" framework

Drag Post #6

Alex Prompter

@alex_prompter

let’s dive into each. decoder-only (GPT-style): trained to predict the next token given previous ones: “the cat sat on the [MASK]” → “mat” this is called causal language modeling. loss is calculated using cross-entropy over predicted vs. actual next words.

Apply Image

Drag Post #7

Alex Prompter

@alex_prompter

encoder-only (BERT-style): looks at the full sentence at once, masks random tokens, and tries to reconstruct them. this is masked language modeling, and it allows bidirectional context the model uses both left and right of the [MASK]. better for tasks like sentence classification.

Drag Post #8

Alex Prompter

@alex_prompter

example: original: "the early bird catches the worm" masked: "the [MASK] bird catches the [MASK]" the model learns to predict "early" and "worm" using all surrounding context. it's like learning to read by guessing missing words in a book.

Drag Post #9

Alex Prompter

@alex_prompter

encoder-decoder (T5-style): treats every NLP task as text-to-text. examples: “translate English to German: hello” → “hallo” “classify sentiment: i hate this” → “negative” the same model can be used for translation, summarization, QA, etc.

Apply Image

Drag Post #10

Alex Prompter

@alex_prompter

so, what happens after pre-training? we can either: → fine-tune: train the model further on labeled task-specific data → prompt: write smart inputs to steer the model prompting is the reason LLMs feel magical. let's talk about that next.

Drag Post #11

Alex Prompter

@alex_prompter

prompting is just carefully crafting the input to make the model behave how you want. example: "i love this movie. sentiment:" the model will likely complete: “positive”. add a few examples before the input, and the model learns the pattern. this is in-context learning.

Drag Post #12

Alex Prompter

@alex_prompter

prompting can go deep. strategies include: • chain of thought: "let's think step by step…" • problem decomposition: break big problems into smaller ones • self-refinement: ask the model to critique and revise its own output • RAG: let the model look things up externally

Apply Image

Drag Post #13

Alex Prompter

@alex_prompter

this is all possible because of the way these models are trained: predict the next word over and over until they internalize language structure, reasoning patterns, and world knowledge. it's not magic. it's scale.

Drag Post #14

Alex Prompter

@alex_prompter

but LLMs aren’t trained just to be smart — they need to be aligned with human values. how? → supervised fine-tuning (SFT): teach the model on human-written responses → RLHF (reinforcement learning from human feedback): train a reward model to prefer good outputs this is how ChatGPT was aligned.

Drag Post #15

Alex Prompter

@alex_prompter

alignment is hard. newer methods like Direct Preference Optimization (DPO) avoid the instability of RL and are becoming more popular. the goal is the same: steer models toward helpful, harmless, honest responses.

Drag Post #16

Alex Prompter

@alex_prompter

finally, inference matters. how do you actually run these massive models efficiently? → use smart decoding (top-k, nucleus sampling) → cache previous results → batch multiple requests → scale context with better memory and position interpolation this is how you get fast, low-latency responses.

Drag Post #17

Alex Prompter

@alex_prompter

in short: LLMs work because they: 1. learn from massive text via self-supervision 2. use Transformers to model token sequences 3. can be prompted/fine-tuned for any task 4. are aligned with human preferences 5. are optimized for fast inference they're general-purpose text reasoning machines.

Drag Post #18

Alex Prompter

@alex_prompter

this was based on the brilliant textbook: "Foundations of Large Language Models" by Tong Xiao and Jingbo Zhu (NiuTrans Research Lab) arxiv: <a target="_blank" href="https://arxiv.org/abs/2501.09223v2" color="blue">arxiv.org/abs/2501.09223…</a> highly recommend it if you're serious about understanding LLMs deeply.

Drag Post #19

Alex Prompter

@alex_prompter

The AI prompt library your competitors don't want you to find → Unlimited prompts: $15/month → Starter pack: $3.99/month → Pro bundle: $9.99/month Grab it before it's gone ↓ <a target="_blank" href="https://godofprompt.ai/pricing" color="blue">godofprompt.ai/pricing</a>

Drag Post #20

Alex Prompter

@alex_prompter

I hope you've found this thread helpful. Follow me @alex_prompter for more. Like/Repost the quote below if you can: <a target="_blank" href="https://twitter.com/1657385954594762758/status/1949050402810523990" color="blue">x.com/16573859545947…</a>