AI TOOL · PRIVATE · CASE STUDY

LoRA character. Lip-synced jingle. B-roll. Final MP4.

AI video studio that assembles a complete branded video ad from a handful of reference photos — no crew, no camera.

Traditional ad production requires a crew, a shoot day, a post house, and weeks of calendar time. Ad Studio replaces the entire pipeline with a five-step AI process. Give it a folder of reference photos and a brief, and it returns a finished branded video ad.

PIPELINE

Five steps from photos to final cut.

STEP 01
LORA TRAINING
Lock in the Subject's Likeness

A FLUX LoRA character model is fine-tuned on the reference photos via FAL AI. Training locks in the subject's face, build, and distinguishing features, so the same person appears consistently in every generated frame.
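As a rough illustration of the input side of this step, the sketch below bundles a folder of reference photos into a single zip archive before training. It assumes the training job accepts one archive of images; the function name, the accepted extensions, and the `min_photos` threshold are all hypothetical and not taken from Ad Studio or from FAL's API.

```python
import zipfile
from pathlib import Path

# Extensions treated as reference photos (an assumption; adjust as needed).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def bundle_reference_photos(photo_dir: str, archive_path: str,
                            min_photos: int = 5) -> list[str]:
    """Zip every image in photo_dir and return the archived file names.

    min_photos is a hypothetical sanity threshold, not a documented limit.
    """
    photos = sorted(
        p for p in Path(photo_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    if len(photos) < min_photos:
        raise ValueError(
            f"need at least {min_photos} photos, found {len(photos)}"
        )
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for photo in photos:
            # Store flat file names so the archive has no nested folders.
            archive.write(photo, arcname=photo.name)
    return [p.name for p in photos]
```

Failing early on a thin photo set is a design choice in this sketch: a likeness model trained on too few images is a more expensive failure than a rejected upload.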

STEP 02
JINGLE GENERATION
Script, Lyrics, Voice Track

Claude writes the ad script and jingle lyrics from the brief. ElevenLabs generates the voice track complete with lip-sync timing metadata — every phoneme mapped to a millisecond timestamp for the sync step that follows.
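To make the timing metadata concrete, here is a minimal sketch of how such data could be normalized into millisecond timestamps. The input shape (parallel lists of labels with start and end times in seconds) and every name in the block are assumptions for illustration; the actual export format used by Ad Studio is not described in this case study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedUnit:
    """One spoken unit (for example a phoneme) with its time window."""
    label: str
    start_ms: int
    end_ms: int


def to_timed_units(labels, starts_s, ends_s):
    """Convert parallel lists (times in seconds) into ordered TimedUnits.

    The parallel-list input is a hypothetical format chosen for this sketch.
    """
    if not (len(labels) == len(starts_s) == len(ends_s)):
        raise ValueError("labels, starts_s and ends_s must have equal length")
    units = []
    previous_end = 0
    for label, start, end in zip(labels, starts_s, ends_s):
        start_ms, end_ms = round(start * 1000), round(end * 1000)
        # Reject overlapping or inverted windows; the sync step needs order.
        if start_ms < previous_end or end_ms < start_ms:
            raise ValueError(f"bad timing window for {label!r}")
        units.append(TimedUnit(label, start_ms, end_ms))
        previous_end = end_ms
    return units


def track_duration_ms(units):
    """Length of the voiced portion: the end of the last unit, or 0."""
    return units[-1].end_ms if units else 0
```

Validating the ordering once, at export time, means later steps can trust the timestamps without rechecking them.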

STEP 03
B-ROLL GENERATION
Contextual Clips per Scene

The Kling video model on FAL generates contextual B-roll clips from scene descriptions in the script. Product in use, lifestyle moments, location shots — rendered to match the brand aesthetic without a single camera.
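One way to picture the hand-off from script to video model is a list of per-scene requests that share a single brand style string. The sketch below is illustrative only: the `Scene` type, the dictionary keys, and the default aspect ratio are hypothetical and do not reflect the real request schema of Kling or FAL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scene:
    """A scene from the script: what to show and for how long."""
    description: str
    seconds: int


def build_broll_requests(scenes, brand_style, aspect_ratio="9:16"):
    """Turn script scenes into one request dict per B-roll clip.

    All keys below are placeholders, not a real provider schema.
    """
    requests = []
    for index, scene in enumerate(scenes, start=1):
        # Normalize the trailing period so the style suffix reads cleanly.
        description = scene.description.strip().rstrip(".")
        requests.append({
            "clip_id": f"broll_{index:02d}",
            "prompt": f"{description}. Style: {brand_style}.",
            "duration_s": scene.seconds,
            "aspect_ratio": aspect_ratio,
        })
    return requests
```

Appending the same style suffix to every prompt is a simple way to keep clips visually consistent across scenes.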

STEP 04
LIP-SYNC
Character Speaks the Jingle

The LoRA character is rendered speaking the jingle: OmniHuman on FAL drives the facial animation from the ElevenLabs timing metadata. The result is the same recognizable subject, with mouth movement synced to the generated voice track.

STEP 05
ASSEMBLY
ffmpeg Stitches the Final MP4

ffmpeg stitches the lip-synced anchor clip, B-roll clips, jingle audio, and lower-third text overlays into the final branded MP4 — ready for Facebook, Instagram, or broadcast, at any aspect ratio.
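The assembly step can be sketched as a single ffmpeg invocation built in Python. This is a minimal sketch, not Ad Studio's actual command: the file names, the 1080x1920 default, the overlay position, and the function name are assumptions. It scales and pads each clip to one frame size, concatenates the video streams, draws one lower-third line, and lays the jingle underneath. The `drawtext` filter typically requires an ffmpeg build with text rendering support and a usable default font.

```python
def build_ffmpeg_command(video_clips, jingle_audio, lower_third, output_path,
                         width=1080, height=1920, fps=30):
    """Return an ffmpeg argument list that stitches clips, audio and text."""
    if not video_clips:
        raise ValueError("at least one video clip is required")
    # This sketch does not escape filter syntax, so reject risky characters.
    if any(ch in lower_third for ch in "'\\:%,;[]"):
        raise ValueError("lower_third contains characters that need escaping")

    cmd = ["ffmpeg", "-y"]
    for clip in video_clips:
        cmd += ["-i", clip]
    cmd += ["-i", jingle_audio]
    audio_index = len(video_clips)  # the jingle is the last input

    count = len(video_clips)
    chains = []
    for i in range(count):
        # Fit each clip inside the target frame, then pad to exact size.
        chains.append(
            f"[{i}:v]scale={width}:{height}:force_original_aspect_ratio=decrease,"
            f"pad={width}:{height}:(ow-iw)/2:(oh-ih)/2,setsar=1,fps={fps}[v{i}]"
        )
    concat_inputs = "".join(f"[v{i}]" for i in range(count))
    chains.append(f"{concat_inputs}concat=n={count}:v=1:a=0[cat]")
    chains.append(
        f"[cat]drawtext=text='{lower_third}':x=40:y=h-120:fontsize=48:"
        "fontcolor=white:box=1:boxcolor=black@0.5[out]"
    )

    cmd += [
        "-filter_complex", ";".join(chains),
        "-map", "[out]", "-map", f"{audio_index}:a",
        "-c:v", "libx264", "-pix_fmt", "yuv420p", "-c:a", "aac",
        "-shortest", output_path,
    ]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Changing `width` and `height` (for example 1920 and 1080) retargets the same clips to a different aspect ratio, with padding filling any leftover frame area.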

AI MODELS

Every model in the pipeline.

COPY & SCRIPT
CLAUDE

Ad script, jingle lyrics, scene-by-scene B-roll prompts, and lower-third copy — all generated from a single structured brief.

CHARACTER
FLUX LORA · FAL AI

FLUX LoRA fine-tuned on reference photos. Consistent subject likeness across all generated still and video frames.

B-ROLL VIDEO
KLING · FAL AI

Scene-level B-roll clip generation. Text-to-video with camera motion controls and scene duration targeting.

LIP-SYNC
OMNIHUMAN · FAL AI

Drives facial animation from ElevenLabs phoneme timing data. Renders the LoRA character speaking the jingle with accurate mouth movement.

VOICE
ELEVENLABS

Voice synthesis with phoneme-level timing metadata exported alongside the audio file for OmniHuman to consume downstream.

STACK

What it runs on.

Claude
Script & copy generation
FAL AI
LoRA, Kling, OmniHuman
ElevenLabs
Voice + timing data
ffmpeg
Final assembly
Next.js
Pipeline UI

PRIVATE TOOL

Need a video ad without a production budget?

Ad Studio is used internally for client campaigns. Get in touch to discuss production access.

Start the conversation →