Avi Chawla
@_avichawla
Let's generate our own LLM fine-tuning dataset (100% local):
Before we begin, here's what we're doing today!

We'll cover:
- What is instruction fine-tuning?
- Why is it important for LLMs?

Finally, we'll create our own instruction fine-tuning dataset.

Let's dive in!
A pre-trained LLM simply continues whatever text it is given, as if the prompt were part of one long book or article.

For instance, check this to understand how a pre-trained LLM behaves when prompted 👇
Thread image
Fine-tuning the model on a synthetic dataset generated with existing LLMs can fix this behavior.

The synthetic data will have fabricated examples of human-AI interactions.

Check this sample👇
Thread image
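
To make the idea concrete, here is a minimal sketch of what one fabricated human-AI interaction could look like as data. The field names `instruction` and `response`, the sample text, and the prompt template are illustrative assumptions, not something taken from the thread.

```python
import json
from typing import Dict, List

# Hypothetical record layout; the field names are assumptions for illustration.
sample: Dict[str, str] = {
    "instruction": "Explain what instruction fine-tuning is in one sentence.",
    "response": (
        "Instruction fine-tuning trains a pre-trained LLM on "
        "instruction-response pairs so it answers requests instead of "
        "merely continuing the text."
    ),
}


def to_training_text(record: Dict[str, str]) -> str:
    """Render one record with a simple, made-up prompt template."""
    return (
        f"### Instruction:\n{record['instruction']}\n\n"
        f"### Response:\n{record['response']}"
    )


def to_jsonl(records: List[Dict[str, str]]) -> str:
    """Serialize records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

Calling `to_jsonl([sample])` gives one JSON line per pair, a common way to store such datasets.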
This process is called instruction fine-tuning.

Distilabel is an open-source framework that facilitates generating domain-specific synthetic text data using LLMs.

Check this to understand the underlying process👇
Next, let's look at the code.

First, we start with some standard imports.

Check this👇
Thread image
Moving on, we load the Llama-3 models locally with Ollama.

Here's how we do it👇
Thread image
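
The original code lives in the image, so as a rough stand-in, this sketch talks to a locally running Ollama server through its HTTP API with only the standard library. This is a swapped-in approach, not necessarily the one used in the thread. It assumes Ollama is listening on its default port 11434 and that a Llama-3 model has been pulled under the tag `llama3`. The helper names `build_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Default address of a local Ollama server (assumption: default port).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to Ollama and return the generated text."""
    request = build_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False the reply is a single JSON object
    # whose "response" field holds the generated text.
    return body["response"]
```

With the server running, `generate("llama3", "What is instruction fine-tuning?")` returns the model's answer as a string. Everything stays on your machine.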
Next, we define our pipeline:

- Load dataset.
- Generate two responses.
- Combine the responses into one column.
- Evaluate the responses with an LLM.
- Define and run the pipeline.

Check this👇
Thread image
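
The five steps above can be sketched in plain Python to show how the data flows. This is not the Distilabel API: every function name, the column names (`response_a`, `response_b`, `generations`, `ratings`), and the toy models are assumptions for illustration, so it runs without any LLM.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins: a "model" maps a prompt to text,
# a "judge" maps (instruction, response) to a score.
LLM = Callable[[str], str]
Judge = Callable[[str, str], float]
Row = Dict[str, object]


def load_dataset() -> List[Row]:
    """Step 1: load the instructions (hard-coded here for illustration)."""
    return [{"instruction": "Define overfitting."}]


def generate(rows: List[Row], llm: LLM, column: str) -> List[Row]:
    """Step 2: add one model's response to every row under `column`."""
    return [{**row, column: llm(str(row["instruction"]))} for row in rows]


def combine(rows: List[Row], columns: List[str]) -> List[Row]:
    """Step 3: gather the per-model responses into one list column."""
    return [{**row, "generations": [row[c] for c in columns]} for row in rows]


def evaluate(rows: List[Row], judge: Judge) -> List[Row]:
    """Step 4: let a judge score each response (higher is better)."""
    return [
        {**row, "ratings": [judge(str(row["instruction"]), g)
                            for g in row["generations"]]}
        for row in rows
    ]


def run_pipeline(llm_a: LLM, llm_b: LLM, judge: Judge) -> List[Row]:
    """Step 5: wire the steps together and run them in order."""
    rows = load_dataset()
    rows = generate(rows, llm_a, "response_a")
    rows = generate(rows, llm_b, "response_b")
    rows = combine(rows, ["response_a", "response_b"])
    return evaluate(rows, judge)


# Toy stand-ins so the sketch runs without any model.
def short_model(prompt: str) -> str:
    return "Memorizing noise."


def long_model(prompt: str) -> str:
    return "Fitting the training data so closely that the model generalizes poorly."


def length_judge(instruction: str, response: str) -> float:
    # Deliberately naive scoring: longer responses get higher ratings.
    return float(len(response))


result = run_pipeline(short_model, long_model, length_judge)
```

In a real run, the two toy models would be replaced by calls to the local Llama-3 models and the judge by an LLM-based rater.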
Once the pipeline has been defined, we need to execute it by giving it a seed dataset.

The seed dataset helps it generate new but similar samples.

Check this code👇
Thread image
Done!

This produces the synthetic instruction-response dataset we wanted.

Check the sample below👇
Thread image
Here's the instruction fine-tuning process again for your reference.

- Generate responses from two LLMs.
- Rank the responses using another LLM.
- Pick the best-rated response and pair it with the instruction.

Check this👇
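
The last step, picking the best-rated response and pairing it with the instruction, could be sketched like this. The function name `pick_best` and the output field names are hypothetical, and ties are broken in favor of the first response, which is just one possible choice.

```python
from typing import Dict, List


def pick_best(
    instruction: str, generations: List[str], ratings: List[float]
) -> Dict[str, str]:
    """Pair the instruction with its highest-rated response."""
    if not generations or len(generations) != len(ratings):
        raise ValueError("need one rating per generation, and at least one")
    # max() returns the first index among equal ratings, so ties go
    # to the earlier response.
    best_index = max(range(len(ratings)), key=ratings.__getitem__)
    return {"instruction": instruction, "response": generations[best_index]}
```

Applying `pick_best` to every rated row yields the final list of instruction-response pairs.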
That's a wrap!

If you enjoyed this tutorial:

Find me → @_avichawla

Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.