✨ Visual Editor

close

palette Canvas & Background

Gradient:arrow_forward
Text Color:
135°

style Card Style

40px
16px

text_fields Typography

16px
Avi Chawla
@_avichawla
Let's build a RAG app over audio files with DeepSeek-R1 (running locally):
Avi Chawla
@_avichawla
Before we begin, here's a quick demo of what we're building!

We will use:

- @AssemblyAI for transcribing audio files.
- @qdrant_engine for the vector database.
- @llama_index for orchestration.
- DeepSeek-R1 as the LLM.

Let's dive in!
Video thumbnail
VIDEO
Avi Chawla
@_avichawla
Here's an overview of our app:

• 1) Takes an audio file and transcribes it using @AssemblyAI.
• 2-3) Stores it in a Qdrant vector database.
• 4-6) Queries the database to get context.
• 7-8) Uses DeepSeek-R1 as the LLM to generate a response.

Now let's jump into code!
Avi Chawla
@_avichawla
0️⃣ Get the API key

To transcribe audio files, get an API key from AssemblyAI and store it in the `.env` file:
Thread image
Avi Chawla
@_avichawla
1️⃣ Transcription

We use AssemblyAI to transcribe audio with speaker labels. To do this:
- We set up the transcriber object.
- We enable speaker label detection in the config.
- We transcribe the audio using AssemblyAI.

Check this code👇
Thread image
Avi Chawla
@_avichawla
2️⃣ Embed transcripts and store them in a vector database:

To do this, we:
- Load the embedding model and generate embeddings.
- Connect to Qdrant and create a collection.
- Store the embeddings

Look at this implementation👇
Thread image
Avi Chawla
@_avichawla
3️⃣ Retrieval

Now, we query the vector database to retrieve sentences in the transcripts that are similar to the query:

- Convert the query into an embedding.
- Search the vector database.
- Retrieve the top results.

Here's the code 👇
Thread image
Avi Chawla
@_avichawla
4️⃣ Generate response

Finally, after retrieving the context:
- We construct a prompt.
- We use DeepSeek-R1 through @ollama to generate a response.

Look at this implementation👇
Thread image
Avi Chawla
@_avichawla
5️⃣ Streamlit UI

To make this accessible, we wrap the entire app in a @Streamlit interface.

It’s a simple UI where you can upload and chat with the audio file directly.

Here's the demo again👇
Video thumbnail
VIDEO
Avi Chawla
@_avichawla
That's a wrap!

If you enjoyed this tutorial:

Find me → @_avichawla

Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.
Generated by Thread Navigator
100%
view_carousel Carousel Studio NEW
Press + S to quick-export