heygen-com/video-download

Download video and audio from YouTube and 1000+ sites using yt-dlp. No API keys needed. Use when: (1) Downloading a video from YouTube or other sites, (2) Extracting audio from a video URL, (3) Downloading subtitles/captions from a video, (4) Getting video metadata without downloading.

兼容平台~Claude Code~Codex CLI~Cursor

npx skills add https://github.com/heygen-com/skills/tree/main/skills/video-download

查看原文→浏览所有技能

Ask in your favorite AI

Open a new chat with this agent skill pre-loaded.

ChatGPT Claude Gemini Grok Perplexity DeepSeek

文档

heygen-com/video-download

Individual skills in this repo

This repo contains 9 individual skills — each has its own dedicated page.

heygen-com/ai-video-gen

Generate AI videos from text prompts using the HeyGen API. Use when: (1) Generating videos from text descriptions, (2) Creating AI-generated video clips for content production, (3) Image-to-video generation with a reference image, (4) Choosing between video generation providers (VEO, Kling, Sora, Runway, Seedance), (5) Working with HeyGen's /v1/workflows/executions endpoint for video generation.

heygen-com/avatar-video

Create AI avatar videos with precise control over avatars, voices, scripts, and backgrounds using HeyGen's v3 API (POST /v3/videos). Two modes: type="avatar" with avatar_id, or type="image" with an image AssetInput (Avatar IV). Use when: (1) Choosing a specific avatar and voice for a video, (2) Writing exact scripts for an avatar to speak, (3) Animating a photo into a speaking video (type="image"), (4) Transparent background videos with remove_background, (5) Integrating HeyGen avatars with Remotion, (6) Batch video generation with exact specs, (7) Brand-consistent production videos with precise control.

heygen-com/create-video

Create videos from a text prompt using HeyGen's Video Agent (POST /v3/video-agents). The default for most video requests — AI handles script, avatar, visuals, voiceover, and captions automatically. Use when: (1) Creating a video from a description or idea, (2) Generating explainer, demo, or marketing videos from a prompt, (3) Making a video without specifying exact avatars, voices, or scenes, (4) Quick video prototyping or drafts, (5) User says "make me a video" or "create a video about X". For precise control over specific avatars, exact scripts, or per-scene configuration, use the avatar-video skill instead.

heygen-com/faceswap

Swap faces in a video using AI via the HeyGen API. Use when: (1) Replacing a face in a video with another face, (2) Face swapping from a source image onto a target video, (3) Creating personalized videos by swapping in a person's face, (4) Working with HeyGen's /v1/workflows/executions endpoint for face swap processing.

heygen-com/heygen

[DEPRECATED] Use `create-video` for prompt-based video generation or `avatar-video` for precise avatar/scene control. This legacy skill combines both workflows — the newer focused skills provide clearer guidance.

heygen-com/heygen-video

Generate HeyGen presenter videos via the v3 Video Agent pipeline — handles Frame Check (aspect ratio correction), prompt engineering, avatar resolution, and voice selection. Required for any HeyGen video generation. Replaces deprecated endpoints with v3. Use when: (1) generating any HeyGen video (via API or otherwise), (2) sending a personalized video message (outreach, update, announcement, pitch, knowledge), (3) creating a HeyGen presenter-led explainer, tutorial, or product demo with a human face, (4) "make a video of me saying...", "send a video to my leads", "record an update for my team", "create a video pitch", "make a loom-style message", "I want to appear in this video", "generate a HeyGen video", "make a talking head video". Accepts avatar_id from heygen-avatar for identity-first HeyGen videos, or uses a stock presenter. Returns video share URL + HeyGen session URL for iteration. Chain signal: when the user wants to create/design an avatar AND make a video in the same request, run heygen-avatar firs

heygen-com/video-translate

Translate and dub existing videos into multiple languages using HeyGen. Use when: (1) Translating a video into another language, (2) Dubbing video content with lip-sync, (3) Creating multi-language versions of existing videos, (4) Audio-only translation without lip-sync, (5) Working with HeyGen's /v2/video_translate endpoint.

heygen-com/video-understand

Understand video content locally using ffmpeg frame extraction and Whisper transcription. No API keys needed. Use when: (1) Understanding what a video contains, (2) Transcribing video audio locally, (3) Extracting key frames for visual analysis, (4) Getting video content without API keys.

heygen-com/visual-style

Create, extract, and apply portable visual design systems via visual-style.md files. Use when: (1) Creating a visual-style.md design system from scratch, (2) Extracting a visual style from a website URL, video, or PDF brand guide, (3) Applying a visual style to HeyGen videos, HTML slides, Figma, or paper.design, (4) Browsing the gallery of pre-built visual styles (Swiss, Saul Bass, Game Boy, etc.), (5) User mentions "visual style", "design system", "brand style", or "style guide", (6) Styling a HeyGen video with a consistent design language.

相关技能

teaxus/gemini-video-tutor

用 Gemini 分析视频，产出结构化、可溯源的 Markdown 文档。支持操作教程生成、多轮深度追问、断点续传。Agent Skill for Claude/Codex/etc.

community

scowlsericulturist188/claude-auto-tok

Automate TikTok video creation with six AI agents for trend research, scripts, voiceovers, B-roll, rendering, and QA

community

doany-ai/lipsync

Lip-sync a face to a specific audio track on RunComfy via the `runcomfy` CLI. Routes across ByteDance OmniHuman (audio-driven full-body avatar from a portrait + audio), Sync Labs sync v2 / Pro (state-of-the-art mouth sync onto a video), Kling lipsync (audio-to- video and text-to-video with synced speech), and Creatify lipsync. The skill picks the right endpoint for the user's actual intent — portrait still + audio (avatar-style), source video + audio (mouth- swap on existing footage), or generate-and-sync from a script. Triggers on "lip sync", "lipsync", "make this video speak", "match audio to mouth", "dub video", "sync lips to voice", "Sync Labs", "voiceover sync", or any explicit ask to drive a face's mouth from an audio track.

community

lottiefiles/motion-design

Applies motion design principles to create emotionally-driven, technically sound animations and transitions. Provides timing, easing, choreography, and Disney animation principles adapted for UI. Use when creating animations, transitions, micro-interactions, loading states, page transitions, scroll-triggered effects, or any motion work. Works with CSS, Framer Motion, GSAP, Lottie, Spring, or any animation system.

community

affaan-m/videodb

See, Understand, Act on video and audio. See- ingest from local files, URLs, RTSP/live feeds, or live record desktop; return realtime context and playable stream links. Understand- extract frames, build visual/semantic/temporal indexes, and search moments with timestamps and auto-clips. Act- transcode and normalize (codec, fps, resolution, aspect ratio), perform timeline edits (subtitles, text/image overlays, branding, audio overlays, dubbing, translation), generate media assets (image, audio, video), and create real time alerts for events from live streams or desktop capture.

community

Ajain0311/agent-craft-ai-skill-hub

A collaborative web platform for crafting, organizing, and sharing AI agent skills, prompts, and complex agentic workflows. It empowers teams to build a structured knowledge base of effective AI interactions, fostering reusability and accelerating agent development.

community

← More 视频与动画 skills