pexoai/videoagent-audio-studio
Tired of juggling multiple audio APIs? This skill gives you one-command access to TTS, music generation, sound effects, and voice cloning. Use when you want to generate any audio without managing multiple API keys.
npx skills add https://github.com/pexoai/pexo-skills/tree/main/skills/videoagent-audio-studio
This repo contains 4 individual skills — each has its own dedicated page.
AI video generation skill with auto model selection across Seedance 2, Kling 3.0, HappyHorse, and 10+ models. Produces finished multi-shot videos (5–120s) from text, images, URLs, scripts, or audio — including AI music, lip sync, and multi-shot sequencing. No prompts to write, no models to choose. USE FOR: video production, AI video, make a video, product video, brand video, promotional clip, explainer video, short video, TikTok video, Instagram Reel, YouTube Short, product ad, text-to-video, image-to-video, video generation, AI video agent.
Expert prompt engineering for Seedance 2.0. Use when the user wants to generate a video with multimodal assets (images, videos, audio) and needs the best possible prompt.
Tired of juggling 8 API keys? This skill gives you one-command access to Midjourney, Flux, Ideogram, and more, with zero setup. Use when you want to generate any image without worrying about API keys.
Generate short AI videos from text or images — text-to-video, image-to-video, and reference-based generation — with zero API key setup. Use when the user wants to create a video clip, animate an image, or generate video from a description.
Instant voice cloning pipeline extensions allowing conversational agents to clone accent styles dynamically via brief audio prompts.
OpenClaw skill: separates vocals from accompaniment in YouTube or local audio files and produces an instrumental-only track.
Generate videos using the xAI Grok Imagine API with this local CLI tool for prompt management, status polling, and automatic media downloads.
Daily video production for anyone too busy to record and edit but whose audience still expects to hear from them. Orchestrates Remotion + ElevenLabs + Vertex Veo + Three.js as a single Claude Code skill, with a built-in marketing HQ dashboard.
OpenClaw skill for quickly collecting source material for Bilibili video summaries.
Kling 3.0 video generation on RunComfy. Kling 3.0 (also called Kling V3.0) is Kuaishou Technology's third-generation multi-shot video model with native synchronized audio and consistent character identity across shots. This skill covers all six Kling 3.0 endpoints, spanning three rendering tiers (Standard, Pro, 4K) and two modes (text-to-video, image-to-video). Calls runcomfy run kling/kling-3.0/<tier>/<mode> through the local RunComfy CLI. Triggers on "kling", "kling 3.0", "kling v3", "kling pro", "kling 4k", "kling text to video", "kling image to video", or any explicit ask to generate or animate with Kling 3.0.