elevenlabs/speech-to-text

Transcribe audio to text using ElevenLabs Scribe v2. Use when converting audio/video to text, generating subtitles, transcribing meetings, or processing spoken content.

Compatible with: Claude Code, Codex CLI, Cursor

npx skills add https://github.com/elevenlabs/skills/tree/main/skills/speech-to-text

Individual skills in this repo

This repo contains 7 individual skills — each has its own dedicated page.

elevenlabs/agents

Build voice AI agents with ElevenLabs. Use when creating voice assistants, customer service bots, interactive voice characters, or any real-time voice conversation experience.

elevenlabs/music

Generate music using the ElevenLabs Music API. Use when creating instrumental tracks, songs with lyrics, background music, jingles, or any AI-generated music composition. Supports prompt-based generation, composition plans for granular control, and detailed output with metadata.

elevenlabs/setup-api-key

Guides users through setting up an ElevenLabs API key for ElevenLabs MCP tools. Use when the user needs to configure an ElevenLabs API key, when ElevenLabs tools fail due to missing API key, or when the user mentions needing access to ElevenLabs. First checks whether ELEVENLABS_API_KEY is already configured and valid, and only runs full setup when needed.

elevenlabs/sound-effects

Generate sound effects from text descriptions using ElevenLabs. Use when creating sound effects, generating audio textures, producing ambient sounds, cinematic impacts, UI sounds, or any audio that isn't speech. Supports looping, duration control, and prompt influence tuning.

elevenlabs/text-to-speech

Convert text to speech using ElevenLabs voice AI. Use when generating audio from text, creating voiceovers, building voice apps, or synthesizing speech in 70+ languages.

elevenlabs/voice-changer

Transform the voice in an audio recording into a different target voice while preserving emotion, timing, and delivery using the ElevenLabs Voice Changer (speech-to-speech) API. Use when converting one voice to another, changing the speaker/narrator of an existing recording, dubbing a voice-over in a different voice, creating character voices from a scratch performance, anonymizing a speaker, or any "voice conversion / voice transfer / speech-to-speech" task. Make sure to use this skill whenever the user mentions voice changing, voice conversion, speech-to-speech, swapping a voice in audio, re-voicing a clip, or applying a different voice to an existing recording — even if they don't explicitly say "voice changer".

elevenlabs/voice-isolator

Remove background noise and isolate vocals/speech from audio using ElevenLabs Voice Isolator (audio isolation) API. Use when cleaning up noisy recordings, removing music or background ambience from dialogue, isolating speech from field recordings, preparing audio for transcription, extracting vocals, or any "denoise / clean up / isolate voice" task.

Related skills

Camille7585/polybridge-mcp

Connect Blender, n8n, and MCP with Polybridge MCP for fast 3D automation and workflow control

community

manat0912/HyperFrames-Studio

HyperFrames is an open-source framework for turning HTML, CSS, media, and seekable animations into deterministic MP4 videos. Use it locally with the CLI, from AI coding agents with skills, or as the rendering core behind hosted authoring workflows.

community

getsentry/presentation-creator

Create data-driven presentation slides using React, Vite, and Recharts with Sentry branding. Use when asked to "create a presentation", "build slides", "make a deck", "create a data presentation", "build a Sentry presentation". Scaffolds a complete slide-based app with charts, animations, and single-file HTML output.

community

jasonzhangshuo/solfege-video

🎵 Numbered-notation (jianpu) practice video generator | MusicXML → MP4 (portrait/landscape) | Cursor & Codex Skill

community

dabelmtz1323/typescript-react-patterns

Provide structured patterns for AI agents to write production-grade TypeScript, React, and Next.js code.

community

qianwen-ai/qianwen-image-generation

[QianWen] Generate and edit images using Wan and Qwen Image models. Supports text-to-image, image editing (style transfer, subject consistency, text rendering), and interleaved text-image output. TRIGGER when: user wants to create illustrations, product images, artistic designs, posters, text-to-image generation, edit/transform existing images, apply style transfer, generate images based on reference photos, interleaved text-image content, mentions Wan/Qwen Image models/AI art creation, or explicitly invokes this skill by name (e.g. use qianwen-image-generation). DO NOT TRIGGER when: user wants to understand/analyze existing images or OCR (use qianwen-vision), video generation (use qianwen-video-generation), text-only tasks.

community
