Visualize Thread by @EXM7777 | Thread Navigator

Thread Truncated

Only the first 20 tweets are shown to ensure high-quality rendering and prevent image size issues.

Machina

@EXM7777

how to make AI videos so convincing people will question if they're real, using Sora 2 and JSON prompting:

Machina

@EXM7777

you're writing your Sora 2 prompts like you're writing an essay...

"create a cinematic video of a sunset over mountains with dramatic lighting and smooth camera movement"

it (barely) works... but you're leaving so much quality and realism on the table

here's the process that changed everything for me:

Machina

@EXM7777

before we get into the real stuff...

if you're serious about learning ai, here are some very good resources:

more free prompts + content in my telegram (link in bio)

weekly newsletter (no ads/spam): aifirstbrain.com

now back to the thread

Machina

@EXM7777

JSON (JavaScript Object Notation) is just structured data... and before you roll your eyes thinking this sounds technical, stay with me because this is simpler than you think

instead of writing paragraphs hoping the AI interprets correctly, you're organizing instructions the way the model actually processes them

like filling out a form vs explaining what you want in a rambling email

Machina

@EXM7777

same idea as that essay prompt... but now every parameter has its own label

> no ambiguity about what "dramatic" modifies
> no confusion about relationships between elements
> just clean, organized instructions
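a minimal sketch of that labeled version, built in Python so it can be saved and reused. the field names (`subject`, `style`, `lighting`, `camera_movement`) are assumptions for illustration, not an official Sora 2 schema:

```python
import json

# Hypothetical field names: these labels are just one way to split up
# the essay prompt, not a documented Sora 2 format.
prompt = {
    "subject": "sunset over mountains",
    "style": "cinematic",
    "lighting": "dramatic",
    "camera_movement": "smooth",
}

# Serialize to the JSON text you would paste in as the prompt.
prompt_text = json.dumps(prompt, indent=2)
print(prompt_text)
```

"dramatic" now sits under `lighting` and nowhere else, so there is nothing left to guess about what it modifies.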

Machina

@EXM7777

why this creates photorealistic results:

Sora 2 still reads your JSON as text... but every instruction arrives with its own label, so there's far less for it to guess about what modifies what

which means fewer of your details get dropped or misread, and more of them show up in the actual video

this is why JSON outputs tend to look more polished

Machina

@EXM7777

and here's something nobody talks about... token efficiency

when you write "create a cinematic video with dramatic lighting, smooth camera movement, and a sunset over mountains" the AI processes every single word, punctuation mark, grammatical structure

JSON cuts out most of that connective filler

same information, far less wording for the model to untangle

Machina

@EXM7777

but Sora 2 is fundamentally different from image generators...

(this is where most people's mental model breaks)

it doesn't generate a pretty picture and add motion... it actually understands how scenes EVOLVE over time

physics, momentum, cause and effect

which means you need to prompt temporally, not just spatially

Machina

@EXM7777

here's what I mean by temporal prompting:

{
  "duration": "10s",
  "sequence": [
    {"time": "0-3s", "action": "camera zooms in on subject"},
    {"time": "3-7s", "action": "subject turns head slowly"},
    {"time": "7-10s", "action": "fade to black"}
  ]
}

you're literally choreographing a timeline
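if you're choreographing a timeline, it helps to check that the steps actually line up before you hit generate. a rough Python sketch, assuming every `time` value follows the `start-ends` pattern used above (the helper names `parse_range` and `timeline_gaps` are made up for this example):

```python
import json
import re

prompt = json.loads("""
{
  "duration": "10s",
  "sequence": [
    {"time": "0-3s", "action": "camera zooms in on subject"},
    {"time": "3-7s", "action": "subject turns head slowly"},
    {"time": "7-10s", "action": "fade to black"}
  ]
}
""")

def parse_range(text):
    """Turn a string like '3-7s' into the tuple (3, 7)."""
    match = re.fullmatch(r"(\d+)-(\d+)s", text)
    if match is None:
        raise ValueError(f"bad time range: {text!r}")
    return int(match.group(1)), int(match.group(2))

def timeline_gaps(prompt):
    """Return a list of problems: gaps, overlaps, or a wrong total length."""
    problems = []
    cursor = 0
    for step in prompt["sequence"]:
        start, end = parse_range(step["time"])
        if start != cursor:
            problems.append(f"expected a step starting at {cursor}s, got {start}s")
        if end <= start:
            problems.append(f"step {step['time']} has no length")
        cursor = end
    total = int(prompt["duration"].rstrip("s"))
    if cursor != total:
        problems.append(f"sequence ends at {cursor}s but duration is {total}s")
    return problems

print(timeline_gaps(prompt))  # → []
```

an empty list means the three steps cover the full 10 seconds with no gaps or overlaps.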

Machina

@EXM7777

this shift from spatial to temporal thinking is HUGE

images = "what's in the frame"
videos = "what happens WHEN in the frame"

once you internalize this... your video quality jumps dramatically because you're finally speaking the model's language instead of fighting against how it actually works

Machina

@EXM7777

let me break down the 5 components every photorealistic Sora 2 prompt needs:

1. scene description (spatial)
2. camera parameters (perspective)
3. motion/action (temporal)
4. lighting/atmosphere (mood)
5. temporal structure (pacing)

miss any of these and you get that "AI video" look everyone recognizes
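a quick way to catch a missing component before generating, sketched in Python. the top-level key names (`scene`, `camera`, `motion`, `lighting`, `timeline`) follow the ones used later in this thread, and `missing_components` is a hypothetical helper:

```python
# Map each assumed top-level key to the component it stands for.
REQUIRED = {
    "scene": "scene description (spatial)",
    "camera": "camera parameters (perspective)",
    "motion": "motion/action (temporal)",
    "lighting": "lighting/atmosphere (mood)",
    "timeline": "temporal structure (pacing)",
}

def missing_components(prompt):
    """Return the labels of any of the five components that are absent or empty."""
    return [label for key, label in REQUIRED.items() if not prompt.get(key)]

draft = {
    "scene": {"subject": "elderly craftsman in workshop"},
    "camera": {"movement": "slow dolly left to right"},
    "lighting": {"source": "single window, late afternoon"},
}
print(missing_components(draft))
# → ['motion/action (temporal)', 'temporal structure (pacing)']
```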

Machina

@EXM7777

here's a scene description done right:

{
  "subject": "elderly craftsman in workshop",
  "environment": "cluttered wooden workbench with tools",
  "objects": ["vintage hand saw", "wood shavings", "half-finished chair"],
  "composition": "medium shot, rule of thirds"
}

specific spatial relationships... not vague descriptions like "a nice workshop scene"

Machina

@EXM7777

camera parameters (this is where cinematography knowledge pays off):

{
  "camera": {
    "angle": "eye level, slight dutch tilt",
    "movement": "slow dolly left to right",
    "lens": "35mm equivalent, shallow depth of field",
    "focus": "subject sharp, background soft bokeh"
  }
}

Sora 2 understands real cinematography language... use it

Machina

@EXM7777

motion and action:

{
  "motion": {
    "primary": "hands carefully sanding wood grain",
    "secondary": "dust particles floating through light beam",
    "tertiary": "workshop fan oscillating in background",
    "pace": "calm, meditative"
  }
}

layers of motion at different speeds create depth and realism... single-layer motion looks flat and fake

Machina

@EXM7777

lighting creates emotion (and believability):

{
  "lighting": {
    "source": "single window, late afternoon",
    "direction": "45 degrees camera left",
    "quality": "soft directional with visible god rays",
    "color_temp": "warm 3200K",
    "mood": "nostalgic, contemplative"
  }
}

real scenes have motivated lighting... random "good lighting" screams AI

Machina

@EXM7777

temporal structure ties everything together:

{
  "timeline": {
    "0-2s": "establish wide shot of workshop",
    "2-6s": "push in to medium shot, focus on hands working",
    "6-8s": "rack focus to craftsman's concentrated face",
    "8-10s": "pull back revealing finished piece, soft smile"
  }
}

this is narrative pacing... not just "make a 10 second video"
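to see the pacing at a glance, you can work out how long each beat runs. a small Python sketch, assuming the keys always look like `start-ends` (`beat_lengths` is a made-up helper name):

```python
timeline = {
    "0-2s": "establish wide shot of workshop",
    "2-6s": "push in to medium shot, focus on hands working",
    "6-8s": "rack focus to craftsman's concentrated face",
    "8-10s": "pull back revealing finished piece, soft smile",
}

def beat_lengths(timeline):
    """Map each 'start-end' key to its length in seconds."""
    lengths = {}
    for span in timeline:
        start, end = span.rstrip("s").split("-")
        lengths[span] = int(end) - int(start)
    return lengths

print(beat_lengths(timeline))
# → {'0-2s': 2, '2-6s': 4, '6-8s': 2, '8-10s': 2}
```

the push-in on the hands gets 4 of the 10 seconds, twice as long as any other beat, which is what makes it the center of the shot.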

Machina

@EXM7777

now compare JSON to natural language for the same prompt...

"create a video of an elderly craftsman in a cluttered workshop with vintage tools and wood shavings, late afternoon window light from the left creating soft god rays, camera slowly dollying left to right at eye level with 35mm lens and shallow depth of field, hands carefully sanding wood while dust floats and a fan oscillates, starting wide then pushing to medium then racking focus to face then pulling back to reveal finished work..."

see how it becomes an unreadable mess?

Machina

@EXM7777

JSON keeps complex prompts organized:

{
  "scene": {...},
  "camera": {...},
  "motion": {...},
  "lighting": {...},
  "timeline": {...}
}

everything nested logically
nothing ambiguous
infinitely more maintainable
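here's roughly what that nesting looks like when you assemble it in Python. the sections are trimmed copies of the ones from the earlier tweets, and the swapped-in `"cool 5600K"` value is only an illustration of editing one section:

```python
import json

# Sections copied (and trimmed) from the earlier tweets in this thread.
scene = {
    "subject": "elderly craftsman in workshop",
    "environment": "cluttered wooden workbench with tools",
}
camera = {
    "movement": "slow dolly left to right",
    "lens": "35mm equivalent, shallow depth of field",
}
motion = {
    "primary": "hands carefully sanding wood grain",
    "pace": "calm, meditative",
}
lighting = {
    "source": "single window, late afternoon",
    "color_temp": "warm 3200K",
}
timeline = {
    "0-2s": "establish wide shot of workshop",
    "2-6s": "push in to medium shot, focus on hands working",
    "6-8s": "rack focus to craftsman's concentrated face",
    "8-10s": "pull back revealing finished piece, soft smile",
}

# Nest each section under its own top-level key.
prompt = {
    "scene": scene,
    "camera": camera,
    "motion": motion,
    "lighting": lighting,
    "timeline": timeline,
}

# Editing one section leaves the other four untouched.
prompt["lighting"]["color_temp"] = "cool 5600K"  # illustrative swap

prompt_text = json.dumps(prompt, indent=2)
print(prompt_text)
```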

and here's the real power move... you can save these as templates and swap values

Machina

@EXM7777

template-based workflow:

{
  "scene": {
    "subject": "{{SUBJECT}}",
    "environment": "{{ENVIRONMENT}}",
    "objects": ["{{OBJ1}}", "{{OBJ2}}", "{{OBJ3}}"]
  },
  "camera": {{CAMERA_PRESET_CINEMATIC}},
  "lighting": {{LIGHTING_PRESET_NATURAL}}
}

systematic video generation instead of starting from scratch every time... this is AI-First thinking
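one way to fill a template like this, sketched in Python. it assumes quoted placeholders become strings and bare ones become whole JSON objects. the preset contents and the `fill` helper are hypothetical, borrowed from the camera and lighting tweets above:

```python
import json
import re

TEMPLATE = """
{
  "scene": {
    "subject": "{{SUBJECT}}",
    "environment": "{{ENVIRONMENT}}",
    "objects": ["{{OBJ1}}", "{{OBJ2}}", "{{OBJ3}}"]
  },
  "camera": {{CAMERA_PRESET_CINEMATIC}},
  "lighting": {{LIGHTING_PRESET_NATURAL}}
}
"""

# Hypothetical preset contents: the thread names the presets but not their values.
PRESETS = {
    "CAMERA_PRESET_CINEMATIC": {
        "movement": "slow dolly left to right",
        "lens": "35mm equivalent, shallow depth of field",
    },
    "LIGHTING_PRESET_NATURAL": {
        "source": "single window, late afternoon",
        "color_temp": "warm 3200K",
    },
}

def fill(template, values, presets):
    """Swap quoted placeholders for strings and bare ones for whole objects."""
    def quoted(match):
        return json.dumps(values[match.group(1)])  # adds quotes and escapes
    def bare(match):
        return json.dumps(presets[match.group(1)])
    text = re.sub(r'"\{\{(\w+)\}\}"', quoted, template)
    text = re.sub(r"\{\{(\w+)\}\}", bare, text)
    return json.loads(text)  # raises if the result is not valid JSON

values = {
    "SUBJECT": "elderly craftsman in workshop",
    "ENVIRONMENT": "cluttered wooden workbench with tools",
    "OBJ1": "vintage hand saw",
    "OBJ2": "wood shavings",
    "OBJ3": "half-finished chair",
}
prompt = fill(TEMPLATE, values, PRESETS)
print(prompt["scene"]["objects"])
# → ['vintage hand saw', 'wood shavings', 'half-finished chair']
```

parsing the filled text with `json.loads` at the end doubles as a check that no placeholder was left behind.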

Machina

@EXM7777

Sora 2-specific advantages you need to leverage:

- better physics understanding (fabric, water, smoke all behave realistically)
- superior multi-subject consistency (characters maintain visual identity across cuts)
- accurate reflections and shadows (environmental lighting actually works)

but you have to PROMPT for these... they're not automatic
