Hello Directors,
We have spent three weeks building our visual world. We have a Character and a Setting, and we’ve mastered Continuity.
But if you watch your sequences back, something is probably missing. They look great, but they feel... empty.
That’s because a character isn’t truly alive until they speak.
For a long time, AI video was stuck in the “Silent Movie” era. Adding dialogue meant complex lip-syncing tools and hours of editing. But that has changed.
In this week’s video, I show you the two ways I handle dialogue: The Native Way (for speed) and The Dubbing Way (for control).
Three weeks ago, we kicked off the Zero to Director Challenge. The goal is to build an AI video together, step-by-step.
This is an evergreen challenge, so you can join at any time.
This week, we’re tackling Challenge 4: Adding Voices.
Method 1: The Native Way (Veo 3.1, Sora 2, LTX-2)
The newest generation of models understands audio as well as video. If you are using Veo 3.1, Sora 2, or LTX-2, you don’t need external tools.
The Golden Rule: If you put it in quotes, they will say it.
In my current project, my character Bea has just found the robot. I want her to react with shock.
I use the same “Ingredients” workflow from last week (Reference Image of Bea + Last Frame of the scene), and then I use this prompt:
“The lights of the Robot on the ground turn on, it comes alive, and attempts to sit down. [cut] Close up of woman shocked. She screams “You are alive!” and hugs the robot.”
The Result: The AI generates the video, creates the voice, and animates the mouth movement automatically.
The Downside: You get what you get. You cannot control the tone or the accent. Sometimes your gritty scavenger sounds like a cheerful teenager.
Method 2: The “Dubbing” Way (For Total Control)
If you need your character to sound the same across multiple shots (which you should!), the Native Method often fails.
Here is the “Dubbing” workflow I use to ensure consistency.
Step 1: Generate the Audio
Go to an AI voice generator like ElevenLabs. Create a specific “Voice Clone” or pick a preset that fits your character. Type your line (“You are alive!”) and download the audio file.
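If you prefer to script this step instead of clicking through the ElevenLabs site, here is a minimal sketch of calling their text-to-speech endpoint from Python. The voice ID, model name, and filenames are placeholders; check your own account and the current API docs before leaning on it.

```python
# Minimal sketch: generate one dialogue line with the ElevenLabs text-to-speech API.
# The voice ID, model name, and filename are placeholders; use your own values.
import os
import requests

API_KEY = os.environ["ELEVENLABS_API_KEY"]   # your ElevenLabs API key
VOICE_ID = "YOUR_VOICE_ID"                   # the cloned or preset voice for your character
LINE = "You are alive!"

response = requests.post(
    f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
    headers={"xi-api-key": API_KEY, "Content-Type": "application/json"},
    json={
        "text": LINE,
        "model_id": "eleven_multilingual_v2",  # whichever TTS model your plan offers
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
    },
    timeout=60,
)
response.raise_for_status()

# The endpoint returns raw audio bytes (MP3 by default); save them for the edit step.
with open("you_are_alive.mp3", "wb") as f:
    f.write(response.content)
```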
Step 2: The Edit
Open your editor (like CapCut).
Import your video clip.
Lower the volume of the original video track to 5-10% (so you keep the background ambience but lose the random AI voice).
Overlay your ElevenLabs audio track exactly where the character opens their mouth.
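If you would rather skip the GUI editor for this step, the same move (duck the original track, lay the dub on top) can be scripted with ffmpeg. This is a rough sketch, assuming ffmpeg is installed; the filenames and the 2.5-second offset are made up for illustration.

```python
# Rough sketch: duck the original audio and overlay the dubbed line with ffmpeg.
# Filenames and the 2.5-second offset are placeholders; assumes ffmpeg is installed.
import subprocess

DUB_START_MS = 2500  # when the character opens their mouth, in milliseconds

subprocess.run([
    "ffmpeg", "-y",
    "-i", "scene.mp4",           # the AI-generated clip, with its original audio
    "-i", "you_are_alive.mp3",   # the ElevenLabs line from Step 1
    "-filter_complex",
    # Original track down to ~8% (keeps ambience, hides the random AI voice),
    # dub delayed so it lands on the mouth movement, then the two are mixed.
    "[0:a]volume=0.08[bg];"
    f"[1:a]adelay={DUB_START_MS}:all=1[dub];"
    "[bg][dub]amix=inputs=2:duration=first[mix]",
    "-map", "0:v", "-map", "[mix]",
    "-c:v", "copy", "-c:a", "aac",
    "scene_dubbed.mp4",
], check=True)
```

Note that amix pulls levels down slightly when combining tracks, so you may need to nudge the volume values if the dub ends up too quiet against the ambience.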
Pro Tip: For short lines (1-3 seconds), this simple overlay usually looks perfect. If you have a long monologue, you might need a dedicated Lip Sync tool like Kling, but for punchy dialogue, this “Dubbing” trick is faster and gives you better acting.
Your Mission for the Week
It’s time to break the silence.
Take your sequence from last week. Pick one moment where your character speaks.
Try the Native Prompt method first. Does the voice fit?
If not, try the Dubbing method with a custom voice.
Paid Subscribers: Upload your clip to the Chat. Let’s hear those voices!
Free Subscribers: Post your results on Notes and tag me.
Let’s make some noise.