Carousel Studio

Repurpose X Threads into LinkedIn & Instagram Carousels

Source Content

Drag Post #1
Akshay 🚀
@akshay_pachaar

How LLMs work, clearly explained:

Drag Post #2

Before diving into LLMs, we must understand conditional probability. Let's consider a population of 14 individuals:
- Some of them like Tennis 🎾
- Some like Football ⚽️
- A few like both 🎾 ⚽️
- And a few like neither
Here's how it looks 👇

Drag Post #3

So what is conditional probability ⁉️ It's the probability of an event given that another event has occurred. If the events are A and B, we denote this as P(A|B), which reads as "the probability of A given B". Check this illustration 👇

Drag Post #4

For instance, if we're predicting whether it will rain today (event A), knowing that it's cloudy (event B) might impact our prediction. Since rain is more likely when it's cloudy, the conditional probability P(A|B) is higher than the plain probability P(A). That's conditional probability for you! 🎉
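
To make the definition concrete, here is a minimal sketch that computes P(A|B) = P(A and B) / P(B) for the 14-person population. The individual counts are assumptions chosen for illustration, since the thread does not state them in text.

```python
from fractions import Fraction

# Hypothetical counts for the 14 individuals (assumed, not from the thread).
population = 14
likes_tennis = 7      # includes people who like both sports
likes_football = 8    # includes people who like both sports
likes_both = 3        # so 7 + 8 - 3 = 12 like at least one, 2 like neither


def conditional_probability(joint_count, given_count):
    """P(A|B) = P(A and B) / P(B), computed directly from raw counts."""
    return Fraction(joint_count, given_count)


p_tennis = Fraction(likes_tennis, population)
p_tennis_given_football = conditional_probability(likes_both, likes_football)
p_football_given_tennis = conditional_probability(likes_both, likes_tennis)

print("P(Tennis)            =", p_tennis)
print("P(Tennis | Football) =", p_tennis_given_football)
print("P(Football | Tennis) =", p_football_given_tennis)
```

Note that P(Tennis | Football) and P(Football | Tennis) differ: the event you condition on changes the denominator.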

Drag Post #5

Now, how does this apply to LLMs like GPT-4❓ These models are tasked with predicting the next word in a sequence. This is a question of conditional probability: given the words that have come before, what is the most likely next word?
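
As a rough illustration of next-word prediction as conditional probability, the sketch below estimates P(next word | previous word) by counting word pairs in a tiny made-up corpus. This is a bigram simplification that conditions on only one previous word; real LLMs condition on the whole context with a neural network. The corpus and the function name are hypothetical.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for illustration only).
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count how often each word follows each previous word.
follow_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow_counts[prev][nxt] += 1


def next_word_distribution(prev):
    """Estimate P(next | prev) as relative frequencies of observed pairs."""
    counts = follow_counts[prev]
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


dist = next_word_distribution("the")
print(dist)  # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
```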

Drag Post #6

To predict the next word, the model calculates the conditional probability for each possible next word, given the previous words (context). The word with the highest conditional probability is chosen as the prediction.
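
Picking the highest-probability word can be sketched in a few lines. The context and the probabilities below are hypothetical values, not output from any real model.

```python
# Hypothetical P(next word | context) for the context "The cat sat on the".
next_word_probs = {"mat": 0.55, "sofa": 0.25, "roof": 0.15, "moon": 0.05}


def greedy_next_word(probs):
    """Return the word with the highest conditional probability."""
    return max(probs, key=probs.get)


prediction = greedy_next_word(next_word_probs)
print(prediction)  # mat
```

Because this choice is deterministic, the same context always yields the same word.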

Drag Post #7

The LLM learns a high-dimensional probability distribution over sequences of words, and the parameters of this distribution are the trained weights! The training, or rather pre-training, is self-supervised: the next word in the text itself serves as the label. (I'll talk about the different training steps next time!) Check this 👇

Drag Post #8

But there's a problem❗️ If we always pick the word with the highest probability, we end up with repetitive outputs, making LLMs almost useless and stifling their creativity. This is where temperature comes into the picture. Check this before we understand more about it...👇

Drag Post #9

However, a high temperature value produces gibberish. Let's understand what's going on...👇

Drag Post #10

So, instead of always selecting the best token (for simplicity, let's think of tokens as words), LLMs "sample" the prediction. So even if “Token 1” has the highest score, it may not be chosen, since we are sampling.
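
Sampling can be sketched with Python's standard library. The token probabilities are hypothetical, and the fixed seed is only there to make the run repeatable.

```python
import random
from collections import Counter

# Hypothetical token probabilities (assumed values for illustration).
token_probs = {"Token 1": 0.6, "Token 2": 0.3, "Token 3": 0.1}


def sample_token(probs, rng):
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded for repeatability
draws = Counter(sample_token(token_probs, rng) for _ in range(1000))
print(draws)
```

"Token 1" should come up most often, but the other tokens are drawn too, roughly in proportion to their probabilities.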

Drag Post #11

Now, temperature introduces the following tweak in the softmax function, which, in turn, influences the sampling process:
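
The thread's illustration is not available here, so the following is a sketch of the standard formulation, in which each logit is divided by the temperature T before the softmax: p_i = exp(z_i / T) / Σ_j exp(z_j / T). The function name and the logits are assumptions.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits / temperature (temperature must be positive)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Subtract the max before exponentiating for numerical stability.
    largest = max(scaled)
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three tokens
probs = softmax_with_temperature(logits, temperature=1.0)
print(probs)
```

With T below 1 the distribution sharpens toward the top token, and with T above 1 it flattens.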

Drag Post #12

Let's take a code example! At low temperature, probabilities concentrate around the most likely token, resulting in nearly greedy generation. At high temperature, probabilities become more uniform, producing highly random outputs. Check this out👇
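
The original code image is not reproduced here. As a stand-in, this sketch (not the author's code) samples 1000 tokens at a low and a high temperature, using hypothetical tokens and logits.

```python
import math
import random
from collections import Counter


def softmax_with_temperature(logits, temperature):
    """Softmax over logits / temperature, computed stably."""
    scaled = [z / temperature for z in logits]
    largest = max(scaled)
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical vocabulary and scores (assumed for illustration).
tokens = ["mat", "sofa", "roof", "moon"]
logits = [3.0, 2.0, 1.0, 0.0]
rng = random.Random(42)  # seeded for repeatability


def sample_many(temperature, n=1000):
    """Sample n tokens at the given temperature and count the results."""
    probs = softmax_with_temperature(logits, temperature)
    return Counter(rng.choices(tokens, weights=probs, k=n))


low = sample_many(0.1)    # nearly greedy: almost always "mat"
high = sample_many(10.0)  # close to uniform across all four tokens
print("T=0.1 :", low)
print("T=10.0:", high)
```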

Drag Post #13

That's a wrap! Hopefully, this guide has demystified some of the magic behind LLMs. And, if you enjoyed this breakdown: Find me → @akshay_pachaar ✔️ For more insights and tutorials on AI and Machine Learning.