coreline-ai/remotion-srt-video-builder

Codex Skill that turns SRT subtitles and MP3/WAV narration into style/scene-plan driven Remotion MP4 videos

Compatible with: ~Claude Code, Codex CLI, ~Cursor
npx add-skill coreline-ai/remotion-srt-video-builder

name: remotion-srt-video-builder
description: Build narrated Remotion videos from SRT subtitles, matching MP3/WAV voiceover, transcripts, or timed caption files. Use when Codex needs to parse subtitles, infer scene breaks from spoken context, choose or ask for a visual style preset, create style-plan.json and scene-plan.json, generate synchronized Remotion scenes/components, add audio and captions with staticFile(), and render MP4 product demos, explainers, walkthroughs, lessons, launch videos, or founder-led demos.

Remotion SRT Video Builder

Version: v0.0.1

Overview

Turn public/subtitles.srt plus matching narration audio into a synchronized Remotion MP4. Treat the SRT as the timing/context source and style-plan.json as the visual source, then create a scene-plan.json and generate Remotion scenes that match both the spoken content and chosen style.

Use this as a workflow skill layered on top of ordinary Remotion best practices. The skill decides what to show, based on the SRT, while Remotion rules decide how to implement it.

Default input layout

Prefer this layout inside the target Remotion project:

public/
  narration.mp3
  subtitles.srt

Accept variants if the user provides them:

public/audio/*.mp3
public/captions/*.srt
transcript.srt
voiceover.wav

Optional context files:

brand.md
homepage.html
screenshots/
figma-notes.md
style-reference.png

Core workflow

  1. Locate the Remotion project root. Confirm package.json, src/Root.tsx, and public/ when possible.
  2. Locate SRT and narration audio. If paths are ambiguous, inspect public/ first.
  3. Parse SRT to timed cue JSON using scripts/parse_srt_to_json.py.
  4. Check audio/SRT length agreement using scripts/check_audio_srt_sync.py when audio is available.
  5. Select visual style before UI generation. Read references/style-presets.md, references/style-questionnaire.md, and use references/user-facing-templates.md when asking the user.
  6. Create style-plan.json using scripts/create_style_plan.py, or extract brand colors when reliable brand context exists.
  7. Create a draft scene-plan.json using scripts/create_scene_plan.py.
  8. Read references/scene-segmentation.md and refine scene boundaries by meaning, not only time.
  9. Read references/visual-archetypes.md and assign each scene a visual type.
  10. Generate or update Remotion components from both style-plan.json and scene-plan.json.
  11. Render audio with staticFile("narration.mp3") or the actual public-relative path.
  12. Render captions from SRT/JSON and keep them frame-synced.
  13. Run lint/typecheck, render MP4, then QA using references/qa-checklist.md.
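
Step 3 can be pictured with the minimal sketch below. It is not the code in scripts/parse_srt_to_json.py, which is not shown here; the function names and the output keys (index, startMs, endMs, text) are assumptions chosen to mirror the millisecond fields used in the scene plan.

```python
import re
from typing import Dict, List

# Matches "HH:MM:SS,mmm --> HH:MM:SS,mmm". SRT uses a comma before the
# milliseconds; a dot is also accepted here for tolerance.
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_ms(hours: str, minutes: str, seconds: str, millis: str) -> int:
    """Convert one SRT timestamp, given as its four parts, to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt(text: str) -> List[Dict]:
    """Return timed cues from SRT text (hypothetical output shape)."""
    cues: List[Dict] = []
    # Cues are separated by blank lines; normalize Windows line endings first.
    blocks = re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = [line for line in block.split("\n") if line.strip()]
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                parts = match.groups()
                cues.append({
                    "index": len(cues),
                    "startMs": to_ms(*parts[:4]),
                    "endMs": to_ms(*parts[4:]),
                    # Multi-line cue text is joined into a single line.
                    "text": " ".join(t.strip() for t in lines[i + 1:]),
                })
                break
    return cues
```

Blocks without a timing line are skipped rather than raising, so a stray header or trailing note in the file does not abort parsing.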

Style plan is required before UI generation

Create or update a style plan before writing Remotion UI code:

src/generated/style-plan.json

Use assets/style-plan.schema.json as the target shape. Run scripts/validate_style_plan.py before implementation.

If the user gives no brand or style information:

  1. Do not reuse colors from previous unrelated prompts.
  2. Offer the presets from references/style-presets.md when visual direction matters.
  3. If the user asks to proceed automatically, use calm-saas.

Style presets:

  • calm-saas - neutral B2B default.
  • premium-startup - polished dark launch video.
  • friendly-founder-demo - warm founder-led customer demo.
  • enterprise-polished - serious sales/procurement tone.
  • energetic-social - fast ads/Reels.
  • editorial-documentary - lessons, reports, case studies.
  • use-brand-colors - use only when brand colors are supplied or extractable.
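
The fallback rules above can be summarized in a short sketch. The function name choose_preset and its parameters are hypothetical, and treating an unknown request with brand colors as use-brand-colors is an assumption, not something the skill states.

```python
from typing import Optional, Sequence

PRESETS = (
    "calm-saas",
    "premium-startup",
    "friendly-founder-demo",
    "enterprise-polished",
    "energetic-social",
    "editorial-documentary",
    "use-brand-colors",
)


def choose_preset(requested: Optional[str] = None,
                  brand_colors: Sequence[str] = ()) -> str:
    """Pick a style preset for automatic runs (hypothetical helper)."""
    if requested == "use-brand-colors":
        # Only valid when brand colors are supplied or extractable.
        return "use-brand-colors" if brand_colors else "calm-saas"
    if requested in PRESETS:
        return requested
    if brand_colors:
        # Assumption: brand context wins over the neutral default.
        return "use-brand-colors"
    # No style and no brand information: use the safe default.
    return "calm-saas"
```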

Scene plan is the source of truth

Always create or update a scene plan before writing large Remotion code:

src/generated/scene-plan.json

Each scene should include:

  • id
  • startMs
  • endMs
  • summary
  • spokenText
  • visualType
  • onScreenText
  • actions
  • captionCueIndexes

Use assets/scene-plan.schema.json as the target shape. Run scripts/validate_scene_plan.py before implementation.
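
For a quick sanity check before running the real validator, something like the sketch below may help. It is not scripts/validate_scene_plan.py, and it does not read assets/scene-plan.schema.json. It assumes the scenes are available as a list of dicts, and the ordering and overlap rules are assumptions layered on top of the field list above.

```python
from typing import Dict, List

REQUIRED_FIELDS = [
    "id", "startMs", "endMs", "summary", "spokenText",
    "visualType", "onScreenText", "actions", "captionCueIndexes",
]


def check_scenes(scenes: List[Dict]) -> List[str]:
    """Return human-readable problems; an empty list means no issue found."""
    problems: List[str] = []
    previous_end = None
    for position, scene in enumerate(scenes):
        label = scene.get("id", f"scene at position {position}")
        missing = [field for field in REQUIRED_FIELDS if field not in scene]
        if missing:
            problems.append(f"{label}: missing {', '.join(missing)}")
            continue
        if scene["startMs"] >= scene["endMs"]:
            problems.append(f"{label}: startMs must be before endMs")
        # Assumption: scenes are listed in time order and do not overlap.
        if previous_end is not None and scene["startMs"] < previous_end:
            problems.append(f"{label}: overlaps the previous scene")
        previous_end = scene["endMs"]
    return problems
```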

Question policy

If the user asks for automatic generation from SRT/audio, infer a reasonable first draft instead of blocking. Use calm-saas as the safe default style unless the user provides a style, brand context, or asks to be questioned.

When the user needs guidance, provide a copyable template from references/user-facing-templates.md instead of improvising a new questionnaire. Use the minimal intake template for new projects, the style picker before color decisions, and the scene approval template before heavy implementation for long/customer-facing videos.

Ask concise questions only when any of these are missing and materially affect the result:

  1. target format or aspect ratio,
  2. visual mood/style preset,
  3. product/brand identity,
  4. real UI vs mock UI fidelity,
  5. final CTA,
  6. whether to use voiceover, captions, or both.

If the user says to “grill me with questions,” ask product-demo questions before coding.

Remotion implementation rules

Read references/remotion-generation-rules.md and the chosen style preset before generating code.

Non-negotiables:

  • Use frame-based animation: useCurrentFrame(), useVideoConfig(), interpolate(), spring(), Sequence.
  • Use staticFile() for files in public/.
  • Do not use setTimeout, CSS animations, runtime randomness, or interactive event handlers to drive video state.
  • Split long videos into scene components.
  • Keep captions legible and synced with SRT cue timing.
  • Keep visual changes frequent enough for long narration; avoid static screens over 8-12 seconds unless intentional.
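
Keeping captions frame-synced comes down to converting millisecond cue times into frame numbers at the composition's FPS. The sketch below shows that arithmetic in Python for illustration only; in the project itself the same calculation would live in TypeScript and feed the from and durationInFrames props of Sequence. The function names are hypothetical.

```python
from typing import Tuple


def ms_to_frame(ms: int, fps: int) -> int:
    """Nearest frame for a timestamp, using integer math (rounds half up)."""
    return (ms * fps + 500) // 1000


def cue_frame_range(start_ms: int, end_ms: int, fps: int) -> Tuple[int, int]:
    """Return (start frame, duration in frames) for one cue."""
    start = ms_to_frame(start_ms, fps)
    end = ms_to_frame(end_ms, fps)
    # Round the boundaries, not the duration, so back-to-back cues stay
    # contiguous. Very short cues still get at least one frame.
    return start, max(1, end - start)
```

Rounding each boundary independently means a cue ending at 2500 ms and the next one starting at 2500 ms meet on the same frame, with no gap or overlap.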

Useful commands

Parse captions:

python3 path/to/skill/scripts/parse_srt_to_json.py public/subtitles.srt --out src/generated/captions.json

Create style plan:

python3 path/to/skill/scripts/create_style_plan.py --preset calm-saas --out src/generated/style-plan.json
python3 path/to/skill/scripts/validate_style_plan.py src/generated/style-plan.json

Create draft scene plan:

python3 path/to/skill/scripts/create_scene_plan.py src/generated/captions.json --audio narration.mp3 --out src/generated/scene-plan.json

Validate plan:

python3 path/to/skill/scripts/validate_scene_plan.py src/generated/scene-plan.json

Check sync:

python3 path/to/skill/scripts/check_audio_srt_sync.py public/subtitles.srt public/narration.mp3
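
The idea behind the sync check can be sketched as a comparison between the audio duration and the end of the last cue. This is not the logic of scripts/check_audio_srt_sync.py, which is not shown here. The cue shape (dicts with an endMs key), the function names, and the 1500 ms tolerance are all assumptions. Only WAV is handled, via the standard library; MP3 duration would need an external tool such as ffprobe.

```python
import wave
from typing import Dict, List


def wav_duration_ms(path: str) -> int:
    """Duration of a WAV file in milliseconds."""
    with wave.open(path, "rb") as wav:
        return round(wav.getnframes() * 1000 / wav.getframerate())


def sync_gap_ms(cues: List[Dict], audio_duration_ms: int) -> int:
    """Audio length minus the end of the last cue (positive: audio runs longer)."""
    last_end = max((cue["endMs"] for cue in cues), default=0)
    return audio_duration_ms - last_end


def is_in_sync(cues: List[Dict], audio_duration_ms: int,
               tolerance_ms: int = 1500) -> bool:
    """True when the gap is within the (assumed) tolerance."""
    return abs(sync_gap_ms(cues, audio_duration_ms)) <= tolerance_ms
```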

Render:

npm run lint
npx remotion render SrtDrivenVideo out/srt-driven-video.mp4 --overwrite
ffprobe -v error -show_entries format=duration out/srt-driven-video.mp4

Templates

Use references/user-facing-templates.md for copyable user intake, style picker, scene approval, and ready-to-run prompt templates.

Use files in assets/templates/ as code starting points when the target project lacks equivalents:

  • SrtDrivenVideo.tsx - main composition that reads the scene plan and style plan and renders audio and captions.
  • CaptionOverlay.tsx - frame-synced caption overlay for pre-parsed cues.
  • scene-components.tsx - simple style-aware visual archetype components to replace or customize.

Final response checklist

When finished, report:

  • input SRT/audio paths,
  • generated style-plan path,
  • generated scene-plan path,
  • selected style preset and accent color,
  • key Remotion files changed,
  • render output path/link,
  • duration/resolution/FPS verification,
  • any assumptions or remaining manual review points.
