qianwen-ai/qianwen-vision

[QianWen] Understand images and videos with Qwen vision models. TRIGGER when: user wants to analyze, describe, or extract information from images or videos, OCR text extraction, chart/table reading, visual reasoning, multi-image comparison, screenshot understanding, video comprehension, or explicitly invokes this skill by name (e.g. use qianwen-vision). DO NOT TRIGGER when: user wants to generate/create images (use qianwen-image-generation), generate videos (use qianwen-video-generation), text-only tasks without visual input, or non-Qwen vision tasks.

対応~Claude Code~Codex CLI~Cursor
npx skills add https://github.com/qianwen-ai/qianwen-ai/tree/main/skills/qianwen-vision

Ask in your favorite AI

Open a new chat with this agent skill pre-loaded.

ドキュメント

qianwen-ai/qianwen-vision

[QianWen] Understand images and videos with Qwen vision models. TRIGGER when: user wants to analyze, describe, or extract information from images or videos, OCR text extraction, chart/table reading, visual reasoning, multi-image comparison, screenshot understanding, video comprehension, or explicitly invokes this skill by name (e.g. use qianwen-vision). DO NOT TRIGGER when: user wants to generate/create images (use qianwen-image-generation), generate videos (use qianwen-video-generation), text-only tasks without visual input, or non-Qwen vision tasks.

Individual skills in this repo

This repo contains 8 individual skills — each has its own dedicated page.

qianwen-ai/qianwen-audio-tts

[QianWen] Synthesize speech from text with Qwen TTS models. TRIGGER when: user wants to convert text to speech, create voiceovers, generate audio narration, read text aloud, build TTS applications, mentions speech synthesis/voice generation/audio output from text, or explicitly invokes this skill by name (e.g. use qianwen-audio-tts). DO NOT TRIGGER when: user wants speech recognition/ASR, text generation without audio, non-Qwen audio tasks.

qianwen-ai/qianwen-image-generation

[QianWen] Generate and edit images using Wan and Qwen Image models. Supports text-to-image, image editing (style transfer, subject consistency, text rendering), and interleaved text-image output. TRIGGER when: user wants to create illustrations, product images, artistic designs, posters, text-to-image generation, edit/transform existing images, apply style transfer, generate images based on reference photos, interleaved text-image content, mentions Wan/Qwen Image models/AI art creation, or explicitly invokes this skill by name (e.g. use qianwen-image-generation). DO NOT TRIGGER when: user wants to understand/analyze existing images or OCR (use qianwen-vision), video generation (use qianwen-video-generation), text-only tasks.

qianwen-ai/qianwen-model-selector

[QianWen] Recommend the best Qwen model and parameters. TRIGGER when: choosing between Qwen models, comparing Qwen model pricing, understanding Qwen model capabilities, checking usage or billing, viewing cost history, when an execution skill needs model selection advice, or user explicitly invokes this skill by name (e.g. use qianwen-model-selector). DO NOT TRIGGER when: non-Qwen model discussions (OpenAI, Gemini, etc.), general AI questions unrelated to Qwen.

qianwen-ai/qianwen-ops-auth

[QianWen] Configure authentication (API keys, endpoints). TRIGGER when: setting up QIANWEN_API_KEY, troubleshooting 401/auth errors, when another skill reports missing credentials, or user explicitly invokes this skill by name (e.g. use qianwen-ops-auth). DO NOT TRIGGER when: non-auth Qwen tasks, general API usage questions.

qianwen-ai/qianwen-text

[QianWen] Generate text, have conversations, write code, reason, and call functions with Qwen models. TRIGGER when: user asks to chat with Qwen, generate text, write code with Qwen, use Qwen function calling, or explicitly invokes this skill by name (e.g. use qianwen-text). DO NOT TRIGGER when: general coding questions without Qwen, non-Qwen AI model usage (OpenAI, Gemini, etc.), image/video understanding (use qianwen-vision), image/video/audio generation.

qianwen-ai/qianwen-update-check

[QianWen] Check for qianwen-ai updates and notify the user when a new version is available. TRIGGER when: user asks to check for updates, check version, asks 'is there a new version', 'latest version', 'update skills', 'check update', or any other qwen skill delegates to this skill, or user explicitly invokes this skill by name (e.g. use qianwen-update-check). DO NOT TRIGGER when: non-update-related tasks, general version questions about other software.

qianwen-ai/qianwen-usage

[QianWen] Manage account auth and query usage/billing. Use for: login, logout, check usage, view billing, free tier quota, Token Plan status, pay-as-you-go costs. Skip for: model browsing, non-account tasks.

qianwen-ai/qianwen-video-generation

[QianWen] Generate videos using Wan models. Supports text-to-video, image-to-video, first+last frame, reference-based role-play, and video editing (VACE). TRIGGER when: user wants to create, generate, or edit video content, mentions video generation/animation/video clips/Wan models, or explicitly invokes this skill by name (e.g. use qianwen-video-generation). DO NOT TRIGGER when: user wants to generate images (use qianwen-image-generation), understand/analyze existing videos (use qianwen-vision), text-only tasks.

関連スキル

remotion-dev/remotion

🎥 Make videos programmatically with React

community

agentspace-so/codex-pet

Codex Pet generator on RunComfy. Build a Codex-compatible Codex Pet spritesheet.webp + pet.json from a single reference image, drop it into `${CODEX_HOME:-$HOME/.codex}/pets/<name>/` and Codex picks it up as a custom Codex Pet next to the 8 built-ins. This skill produces the exact Codex Pet atlas Codex expects (1536x1872 PNG/WebP, 8 cols x 9 rows, 192x208 cells, 9 animation states — idle, running-right, running-left, waving, jumping, failed, waiting, running, review). Calls OpenAI GPT Image 2 edit ONCE via the local RunComfy CLI as `runcomfy run openai/gpt-image-2/edit` to produce a canonical Codex Pet pose, then assembles all 9 animation rows programmatically with ImageMagick micro-transforms — no Codex Pro, no `$imagegen`, no OPENAI_API_KEY required, only RUNCOMFY_TOKEN. Triggers on "codex pet", "create codex pet", "make codex pet", "hatch codex pet", "/hatch image", "desktop pet codex", "codex pets", "spritesheet.webp", or any explicit ask to build a custom pet for OpenAI Codex.

community

anniekoala/xhs-workflow

A personal Xiaohongshu (小红书) content workflow — an Agent Skill that routes raw video / ideas / drafts / photos into titles, covers, captions, and outlines, with built-in copy & visual quality gates.

community

avellane-snit922/rustchain-mcp

Provide AI agents with access to RustChain blockchain, BoTTube video platform, and Beacon communication via a Model Context Protocol server.

community

contentscoin/video-production-skills

28 Claude Skills for AI video production — kkirikkiri + pumasi dual-team architecture

community

csthink/dashmotion

Diagrams that move — a Claude AI skill that generates animated flowcharts & architecture diagrams as self-contained HTML/SVG. Flowing connectors, requests traveling as light dots.

community