Carousel Studio

Repurpose X Threads into LinkedIn & Instagram Carousels

Canvas & Ratio

Choose your destination platform format


Layout Template

Choose a content structure for your slides


Preset Themes


Typography & Sizing

Title Font Size: 36px
Body Font Size: 18px
Header & Footer Size: 12px

Brand Kit Customization

AGENCY

Configure brand assets for headers & footers

MULTI-PROFILES (AGENCY)
SAVE PRESETS (AGENCY)

Outro Slide CTA

Customize your closing call-to-action slide

#1
#2
#3

Background Pattern

Source Content

Build Your Carousel

Drag and drop any post card below onto a slide, or use the quick buttons to insert content/images instantly!

Drag Post #1
Akshay 🚀
@akshay_pachaar

Let's build a real-time Voice RAG Agent, step-by-step:

Drag Post #2

Before we begin, here's a quick demo of what we're building.

Tech stack:
- @Cartesia_AI for SOTA text-to-speech
- @AssemblyAI for speech-to-text
- @LlamaIndex to power RAG
- @livekit for orchestration

Let's go! 🚀

Drag Post #3

Here's an overview of what the app does:
1. Listens to real-time audio
2. Transcribes it via AssemblyAI
3. Uses your docs (via LlamaIndex) to craft an answer
4. Speaks that answer back with Cartesia

Now let's jump into code!

Drag Post #4

1️⃣ Set up environment and logging

This ensures we can load configurations from .env and keep track of everything in real time.

Check this out👇
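The code for this step lives in the post's attached image and is not reproduced on this page. As a rough illustration only, here is a minimal standard-library sketch of loading a .env file and configuring logging. `load_env_file`, `configure_logging`, and the logger name `voice-agent` are hypothetical names chosen for this example, not taken from the thread.

```python
import logging
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Hypothetical helper: read KEY=VALUE lines from a .env file into os.environ."""
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        # Variables already set in the environment win over the file.
        os.environ.setdefault(key, value)
        loaded[key] = os.environ[key]
    return loaded


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Set up timestamped log output and return a named logger (name is an assumption)."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    return logging.getLogger("voice-agent")
```

In a real project a dedicated .env loader package would likely replace the hand-rolled parser; the sketch only shows the shape of the step.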

Drag Post #5

2️⃣ Set up RAG

This is where your documents get indexed for search and retrieval, powered by LlamaIndex. The agent's answers are grounded in this knowledge base.

Check this out👇
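To make "indexed for search and retrieval" concrete, here is a toy standard-library sketch of the idea. It is not LlamaIndex and not the thread's code: `ToyIndex` and `grounded_prompt` are hypothetical names, and bag-of-words cosine similarity stands in for the embedding-based retrieval a real index would use.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ToyIndex:
    """Hypothetical in-memory index; a stand-in for a real vector index."""

    def __init__(self, documents: list[str]) -> None:
        self.documents = documents
        # "Indexing" here is just counting the tokens of each document once.
        self.vectors = [Counter(tokenize(doc)) for doc in documents]

    @staticmethod
    def _cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    def retrieve(self, query: str, top_k: int = 1) -> list[str]:
        """Return the top_k documents most similar to the query."""
        q = Counter(tokenize(query))
        scored = sorted(
            zip(self.documents, self.vectors),
            key=lambda pair: self._cosine(q, pair[1]),
            reverse=True,
        )
        return [doc for doc, _ in scored[:top_k]]


def grounded_prompt(index: ToyIndex, question: str) -> str:
    """Build a prompt that restricts the answer to retrieved context."""
    context = "\n".join(index.retrieve(question, top_k=2))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The grounding step is the last function: the language model only sees text pulled from the knowledge base, which is what keeps its answers tied to your documents.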

Drag Post #6

3️⃣ Set up Voice Activity Detection

We also want Voice Activity Detection (VAD) for a smooth real-time experience, so we'll “prewarm” the Silero VAD model. This helps us detect when someone is actually speaking.

Check this out👇
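The prewarm idea can be sketched without any audio libraries: load the detector once per process, stash it, and reuse it for every frame. The example below is an assumption-laden illustration, not the thread's code. `EnergyVAD` is a toy energy-threshold detector standing in for the Silero model, and `ProcessState`, `prewarm`, and `handle_frame` are hypothetical names.

```python
import math
from dataclasses import dataclass, field


@dataclass
class EnergyVAD:
    """Toy detector: a frame counts as speech when its RMS energy exceeds a threshold."""

    threshold: float = 0.1

    def is_speech(self, frame: list[float]) -> bool:
        if not frame:
            return False
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        return rms > self.threshold


@dataclass
class ProcessState:
    """Hypothetical per-process container for objects loaded before any session starts."""

    userdata: dict = field(default_factory=dict)


def prewarm(state: ProcessState) -> None:
    # Load the detector once, up front, so the first caller does not pay the load cost.
    state.userdata["vad"] = EnergyVAD(threshold=0.1)


def handle_frame(state: ProcessState, frame: list[float]) -> bool:
    """Classify one audio frame using the detector cached by prewarm()."""
    vad: EnergyVAD = state.userdata["vad"]
    return vad.is_speech(frame)
```

With a real neural VAD the load step is the expensive part, which is why doing it before the first conversation matters for latency.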

Drag Post #7

4️⃣ The VoicePipelineAgent and Entry Point

This is where we bring it all together. The agent:
1. Listens to real-time audio.
2. Transcribes it using AssemblyAI.
3. Crafts an answer with your documents via LlamaIndex.
4. Speaks that answer back using Cartesia.

Check this out 👇
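The four steps above can be sketched as a plain composition of three callables applied to incoming audio. This is not the VoicePipelineAgent API and not the thread's code: `VoicePipeline` and the `fake_*` stages are hypothetical stubs so the example runs without network access or API keys.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Hypothetical container wiring the stages together."""

    transcribe: Callable[[bytes], str]   # speech-to-text stage
    answer: Callable[[str], str]         # retrieval-grounded answer stage
    synthesize: Callable[[str], bytes]   # text-to-speech stage

    def handle_utterance(self, audio: bytes) -> bytes:
        """Run one captured utterance through all stages and return reply audio."""
        text = self.transcribe(audio)
        reply = self.answer(text)
        return self.synthesize(reply)


def fake_stt(audio: bytes) -> str:
    # Stub: pretend the audio bytes are already UTF-8 text.
    return audio.decode("utf-8")


def fake_rag(question: str) -> str:
    # Stub: a one-entry lookup table plays the role of the document index.
    knowledge = {"what are your hours?": "We are open 9 to 5."}
    return knowledge.get(question.lower(), "I could not find that in the documents.")


def fake_tts(text: str) -> bytes:
    # Stub: pretend UTF-8 bytes are synthesized audio.
    return text.encode("utf-8")


pipeline = VoicePipeline(transcribe=fake_stt, answer=fake_rag, synthesize=fake_tts)
```

Keeping each stage behind a callable means a provider can be swapped by changing one argument, which is presumably part of the appeal of a pipeline-style agent.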

Drag Post #8

5️⃣ Run the app

Finally, we tie it all together. We run our agent, specifying the prewarm function and main entrypoint.

That's it: your Real-Time Voice RAG Agent is ready to roll!
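As a rough picture of what "specifying the prewarm function and main entrypoint" means, here is a self-contained sketch of a runner that calls a prewarm hook once and then starts an async entrypoint. Every name in it (`Worker`, `RunOptions`, `run_app`) is hypothetical and chosen for illustration; it does not reproduce any library's API or the thread's actual code.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable


@dataclass
class Worker:
    """Hypothetical worker holding state shared between prewarm and entrypoint."""

    userdata: dict = field(default_factory=dict)


@dataclass
class RunOptions:
    """Hypothetical options object; field names are illustrative only."""

    entrypoint: Callable[[Worker], Awaitable[str]]
    prewarm: Callable[[Worker], None]


def prewarm(worker: Worker) -> None:
    # Placeholder string standing in for a real, expensive-to-load model object.
    worker.userdata["vad"] = "loaded-vad-model"


async def entrypoint(worker: Worker) -> str:
    # A real entrypoint would join a room and start the agent; this one just reports.
    await asyncio.sleep(0)
    return f"agent started with {worker.userdata['vad']}"


def run_app(options: RunOptions) -> str:
    """Call prewarm once, then run the async entrypoint to completion."""
    worker = Worker()
    options.prewarm(worker)
    return asyncio.run(options.entrypoint(worker))


if __name__ == "__main__":
    print(run_app(RunOptions(entrypoint=entrypoint, prewarm=prewarm)))
```

The ordering is the point of the sketch: anything the entrypoint needs from prewarm is guaranteed to exist by the time it runs.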

Drag Post #9

The entire code is 100% open-source; you can find it here!

GitHub repo: https://github.com/patchy631/ai-engineering-hub/tree/main/rag-voice-agent

Drag Post #10

That's a wrap!

If you enjoyed this breakdown:

Follow me → @akshay_pachaar ✔️

Every day, I share insights and tutorials on LLMs, AI Agents, RAGs, and Machine Learning!