name: remotion-srt-video-builder
description: Build narrated Remotion videos from SRT subtitles, matching MP3/WAV voiceover, transcripts, or timed caption files. Use when Codex needs to parse subtitles, infer scene breaks from spoken context, choose or ask for a visual style preset, create style-plan.json and scene-plan.json, generate synchronized Remotion scenes/components, add audio and captions with staticFile(), and render MP4 product demos, explainers, walkthroughs, lessons, launch videos, or founder-led demos.
Remotion SRT Video Builder
Version: v0.0.1
Overview
Turn public/subtitles.srt plus matching narration audio into a synchronized Remotion MP4. Treat the SRT as the timing/context source and style-plan.json as the visual source, then create a scene-plan.json and generate Remotion scenes that match both the spoken content and chosen style.
Use this as a workflow skill layered on top of ordinary Remotion best practices. The skill decides what to show, based on the SRT; the Remotion rules decide how to implement it.
Default input layout
Prefer this layout inside the target Remotion project:
public/
  narration.mp3
  subtitles.srt
Accept variants if the user provides them:
public/audio/*.mp3
public/captions/*.srt
transcript.srt
voiceover.wav
Optional context files:
brand.md
homepage.html
screenshots/
figma-notes.md
style-reference.png
Core workflow
- Locate the Remotion project root. Confirm package.json, src/Root.tsx, and public/ when possible.
- Locate SRT and narration audio. If paths are ambiguous, inspect public/ first.
- Parse SRT to timed cue JSON using scripts/parse_srt_to_json.py.
- Check audio/SRT length agreement using scripts/check_audio_srt_sync.py when audio is available.
- Select visual style before UI generation. Read references/style-presets.md, references/style-questionnaire.md, and use references/user-facing-templates.md when asking the user.
- Create style-plan.json using scripts/create_style_plan.py, or extract brand colors when reliable brand context exists.
- Create a draft scene-plan.json using scripts/create_scene_plan.py.
- Read references/scene-segmentation.md and refine scene boundaries by meaning, not only time.
- Read references/visual-archetypes.md and assign each scene a visual type.
- Generate or update Remotion components from both style-plan.json and scene-plan.json.
- Render audio with staticFile("narration.mp3") or the actual public-relative path.
- Render captions from SRT/JSON and keep them frame-synced.
- Run lint/typecheck, render MP4, then QA using references/qa-checklist.md.
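The parsing step in this workflow can be pictured with a minimal TypeScript sketch. It is not the bundled scripts/parse_srt_to_json.py, whose output format this document does not specify. The Cue shape and the names srtTimeToMs and parseSrt are assumptions made for illustration.

```typescript
// Hypothetical cue shape; the real captions.json format may differ.
interface Cue {
  index: number;
  startMs: number;
  endMs: number;
  text: string;
}

// Convert an SRT timestamp such as "00:01:02,500" to milliseconds.
function srtTimeToMs(stamp: string): number {
  const m = /^(\d+):(\d{2}):(\d{2})[,.](\d{3})$/.exec(stamp.trim());
  if (!m) {
    throw new Error(`Invalid SRT timestamp: ${stamp}`);
  }
  const hours = Number(m[1]);
  const minutes = Number(m[2]);
  const seconds = Number(m[3]);
  const millis = Number(m[4]);
  return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis;
}

// Parse SRT text into cues, renumbering them in file order.
function parseSrt(srt: string): Cue[] {
  const cues: Cue[] = [];
  // Drop a byte-order mark, normalize line endings, split on blank lines.
  const blocks = srt
    .replace(/^\uFEFF/, "")
    .replace(/\r\n?/g, "\n")
    .trim()
    .split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n");
    const timingAt = lines.findIndex((line) => line.includes("-->"));
    if (timingAt === -1) {
      continue; // skip blocks that have no timing line
    }
    const timing = lines[timingAt] ?? "";
    const [startRaw = "", endRaw = ""] = timing.split("-->");
    // Ignore anything after the end timestamp on the timing line.
    const endStamp = endRaw.trim().split(/\s+/)[0] ?? "";
    cues.push({
      index: cues.length + 1,
      startMs: srtTimeToMs(startRaw),
      endMs: srtTimeToMs(endStamp),
      text: lines.slice(timingAt + 1).join("\n").trim(),
    });
  }
  return cues;
}
```

Keeping times in milliseconds defers the conversion to frames until the composition's fps is known, so the same cue data can serve any frame rate.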
Style plan is required before UI generation
Create or update a style plan before writing Remotion UI code:
src/generated/style-plan.json
Use assets/style-plan.schema.json as the target shape. Run scripts/validate_style_plan.py before implementation.
If the user gives no brand or style information:
- Do not reuse colors from previous unrelated prompts.
- Offer the presets from references/style-presets.md when visual direction matters.
- If the user asks to proceed automatically, use calm-saas.
Style presets:
- calm-saas: neutral B2B default.
- premium-startup: polished dark launch video.
- friendly-founder-demo: warm founder-led customer demo.
- enterprise-polished: serious sales/procurement tone.
- energetic-social: fast ads/Reels.
- editorial-documentary: lessons, reports, case studies.
- use-brand-colors: use only when brand colors are supplied or extractable.
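One way to read the fallback rules in this section is as a small TypeScript helper. This is a sketch under assumptions: the preset names come from this document, but resolvePreset, StyleRequest, and the decision to fall back to calm-saas when use-brand-colors is requested without colors are hypothetical, not behavior of scripts/create_style_plan.py.

```typescript
// Preset identifiers as listed in this document.
const STYLE_PRESETS = [
  "calm-saas",
  "premium-startup",
  "friendly-founder-demo",
  "enterprise-polished",
  "energetic-social",
  "editorial-documentary",
  "use-brand-colors",
] as const;

type StylePreset = (typeof STYLE_PRESETS)[number];

// Hypothetical input: what the user asked for and any brand colors found.
interface StyleRequest {
  requestedPreset?: string;
  brandColors?: string[];
}

function resolvePreset(request: StyleRequest): StylePreset {
  const hasBrandColors = (request.brandColors ?? []).length > 0;
  const requested = STYLE_PRESETS.find((p) => p === request.requestedPreset);
  if (requested === "use-brand-colors") {
    // Assumption: honor brand mode only when colors were supplied or extracted.
    return hasBrandColors ? "use-brand-colors" : "calm-saas";
  }
  // A missing or unknown preset falls back to the neutral default.
  return requested ?? "calm-saas";
}
```

Because the helper only ever returns a name from the fixed list, a stale color choice from an unrelated prompt cannot leak in through this path.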
Scene plan is the source of truth
Always create or update a scene plan before writing large Remotion code:
src/generated/scene-plan.json
Each scene should include:
- id
- startMs
- endMs
- summary
- spokenText
- visualType
- onScreenText
- actions
- captionCueIndexes
Use assets/scene-plan.schema.json as the target shape. Run scripts/validate_scene_plan.py before implementation.
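For orientation, the scene fields can be written as a TypeScript interface with a basic timing check. The field names follow this document. The field types and the checkSceneTiming helper are assumptions, since the authoritative shape lives in assets/scene-plan.schema.json and validation is done by scripts/validate_scene_plan.py.

```typescript
// Field names come from this document; the types are assumed.
interface Scene {
  id: string;
  startMs: number;
  endMs: number;
  summary: string;
  spokenText: string;
  visualType: string;
  onScreenText: string[];
  actions: string[];
  captionCueIndexes: number[];
}

// Return human-readable problems; an empty array means none were found.
function checkSceneTiming(scenes: Scene[]): string[] {
  const problems: string[] = [];
  const seenIds = new Set<string>();
  let previousEndMs = 0;
  for (const scene of scenes) {
    if (seenIds.has(scene.id)) {
      problems.push(`${scene.id}: duplicate id`);
    }
    seenIds.add(scene.id);
    if (scene.endMs <= scene.startMs) {
      problems.push(`${scene.id}: endMs must be greater than startMs`);
    }
    if (scene.startMs < previousEndMs) {
      problems.push(`${scene.id}: overlaps an earlier scene`);
    }
    previousEndMs = Math.max(previousEndMs, scene.endMs);
  }
  return problems;
}
```

A check like this only covers timing consistency. It does not replace schema validation of the other fields.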
Question policy
If the user asks for automatic generation from SRT/audio, infer a reasonable first draft instead of blocking. Use calm-saas as the safe default style unless the user provides a style, brand context, or asks to be questioned.
When the user needs guidance, provide a copyable template from references/user-facing-templates.md instead of improvising a new questionnaire. Use the minimal intake template for new projects, the style picker before color decisions, and the scene approval template before heavy implementation for long/customer-facing videos.
Ask concise questions only when any of these are missing and materially affect the result:
- target format or aspect ratio,
- visual mood/style preset,
- product/brand identity,
- real UI vs mock UI fidelity,
- final CTA,
- whether to use voiceover, captions, or both.
If the user says to “grill me with questions,” ask product-demo questions before coding.
Remotion implementation rules
Read references/remotion-generation-rules.md and the chosen style preset before generating code.
Non-negotiables:
- Use frame-based animation: useCurrentFrame(), useVideoConfig(), interpolate(), spring(), Sequence.
- Use staticFile() for files in public/.
- Do not use setTimeout, CSS animations, runtime randomness, or interactive event handlers to drive video state.
- Split long videos into scene components.
- Keep captions legible and synced with SRT cue timing.
- Keep visual changes frequent enough for long narration; avoid static screens over 8-12 seconds unless intentional.
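The caption-sync rule can be sketched as pure functions that derive everything from the current frame. A component would call them with the values returned by useCurrentFrame() and useVideoConfig().fps. The TimedCue shape and the names msToFrame and activeCueAt are assumptions and are not taken from the bundled CaptionOverlay.tsx template.

```typescript
// Hypothetical cue shape with times in milliseconds.
interface TimedCue {
  startMs: number;
  endMs: number;
  text: string;
}

// Convert milliseconds to a frame index at the composition's frame rate.
function msToFrame(ms: number, fps: number): number {
  return Math.round((ms / 1000) * fps);
}

// Return the cue visible at `frame`, or null during gaps between cues.
// The start frame is inclusive and the end frame is exclusive, so
// back-to-back cues are never shown at the same time.
function activeCueAt(
  cues: TimedCue[],
  frame: number,
  fps: number,
): TimedCue | null {
  for (const cue of cues) {
    const startFrame = msToFrame(cue.startMs, fps);
    const endFrame = msToFrame(cue.endMs, fps);
    if (frame >= startFrame && frame < endFrame) {
      return cue;
    }
  }
  return null;
}
```

Because the result depends only on the frame number and fps, rendering the same frame twice yields the same caption, which is the property that timers and runtime randomness would break.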
Useful commands
Parse captions:
python3 path/to/skill/scripts/parse_srt_to_json.py public/subtitles.srt --out src/generated/captions.json
Create style plan:
python3 path/to/skill/scripts/create_style_plan.py --preset calm-saas --out src/generated/style-plan.json
python3 path/to/skill/scripts/validate_style_plan.py src/generated/style-plan.json
Create draft scene plan:
python3 path/to/skill/scripts/create_scene_plan.py src/generated/captions.json --audio narration.mp3 --out src/generated/scene-plan.json
Validate plan:
python3 path/to/skill/scripts/validate_scene_plan.py src/generated/scene-plan.json
Check sync:
python3 path/to/skill/scripts/check_audio_srt_sync.py public/subtitles.srt public/narration.mp3
Render:
npm run lint
npx remotion render SrtDrivenVideo out/srt-driven-video.mp4 --overwrite
ffprobe -v error -show_entries format=duration out/srt-driven-video.mp4
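To interpret the ffprobe duration, it can help to compute the length that the scene plan implies. The following is a minimal sketch, assuming scene end times in milliseconds and a known fps. The function names and the 0.1 second tolerance are hypothetical choices, not values defined by this skill.

```typescript
// Frames needed to cover the latest scene end; this is the kind of value
// a composition's durationInFrames would be set to.
function planDurationInFrames(sceneEndsMs: number[], fps: number): number {
  const lastEndMs = sceneEndsMs.reduce((max, end) => Math.max(max, end), 0);
  return Math.ceil((lastEndMs / 1000) * fps);
}

// Compare a measured duration in seconds (for example the ffprobe value)
// with the planned frame count. The tolerance is an assumed default.
function durationMatches(
  measuredSeconds: number,
  frames: number,
  fps: number,
  toleranceSeconds = 0.1,
): boolean {
  return Math.abs(measuredSeconds - frames / fps) <= toleranceSeconds;
}
```

Rounding up with Math.ceil keeps the final partial frame of narration inside the render instead of cutting it off.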
Templates
Use references/user-facing-templates.md for copyable user intake, style picker, scene approval, and ready-to-run prompt templates.
Use files in assets/templates/ as code starting points when the target project lacks equivalents:
- SrtDrivenVideo.tsx: main composition reading a scene plan, style plan, and rendering audio/captions.
- CaptionOverlay.tsx: frame-synced caption overlay for pre-parsed cues.
- scene-components.tsx: simple style-aware visual archetype components to replace or customize.
Final response checklist
When finished, report:
- input SRT/audio paths,
- generated style-plan path,
- generated scene-plan path,
- selected style preset and accent color,
- key Remotion files changed,
- render output path/link,
- duration/resolution/FPS verification,
- any assumptions or remaining manual review points.